I want to predict a variable with Naive Bayes. I tried it with another variable from the same dataset and it worked perfectly, but not with the one I actually want. The variable to predict contains values like 'OL', 'D'.

In what way is it problematic? Also, consider that using parRF can potentially square the number of processes you create (train runs its resamples in parallel, and inside each worker parRF parallelizes again). I think running the sequential random forest in parallel across resamples (instead of using parRF) is more efficient, since there is far less I/O and fewer worker startups, but I don't have a lot of data to back that up.
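A minimal sketch of the "parallelize the resamples, not the forest" idea, assuming the doParallel backend and caret's sequential `rf` method (the dataset and tuning settings here are illustrative, not from the original post):

```r
library(caret)
library(doParallel)  # also attaches foreach and parallel

# Register a parallel backend: train() farms the cross-validation
# resamples out to these workers.
cl <- makePSOCKcluster(2)
registerDoParallel(cl)

set.seed(1)
dat <- twoClassSim(200)

ctrl <- trainControl(method = "cv", number = 5, allowParallel = TRUE)

# method = "rf" fits each forest sequentially inside one worker,
# avoiding the worker-inside-worker nesting that parRF would create.
fit <- train(Class ~ ., data = dat, method = "rf",
             trControl = ctrl, tuneLength = 2)

stopCluster(cl)
registerDoSEQ()  # restore the sequential backend
```

With this layout the number of processes is bounded by the cluster size, rather than (workers × parRF workers) as with nested parallelism.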
```r
timestamp <- Sys.time()
library(caret)
library(plyr)
library(recipes)
library(dplyr)
model <- "lda"

#########################################################################

set.seed(1)
training <- twoClassSim(50, linearVars = 2)
testing <- twoClassSim(500, linearVars = 2)

trainX <- training[, -ncol(training)]
trainY <- training$Class

rec_cls <- recipe(Class ~ ., data = training) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

cctrl1 <- trainControl(method = "cv", number = 3, returnResamp = "all",
                       classProbs = TRUE,
                       summaryFunction = twoClassSummary)
cctrl2 <- trainControl(method = "LOOCV",
                       classProbs = TRUE, summaryFunction = twoClassSummary)
cctrl3 <- trainControl(method = "none",
                       classProbs = TRUE, summaryFunction = twoClassSummary)

set.seed(849)
test_class_cv_model <- train(trainX, trainY,
                             method = "lda",
                             trControl = cctrl1,
                             metric = "ROC",
                             preProc = c("center", "scale"))

set.seed(849)
test_class_cv_form <- train(Class ~ ., data = training,
                            method = "lda",
                            trControl = cctrl1,
                            metric = "ROC",
                            preProc = c("center", "scale"))

test_class_pred <- predict(test_class_cv_model, testing[, -ncol(testing)])
test_class_prob <- predict(test_class_cv_model, testing[, -ncol(testing)], type = "prob")
test_class_pred_form <- predict(test_class_cv_form, testing[, -ncol(testing)])
test_class_prob_form <- predict(test_class_cv_form, testing[, -ncol(testing)], type = "prob")

set.seed(849)
test_class_loo_model <- train(trainX, trainY,
                              method = "lda",
                              trControl = cctrl2,
                              metric = "ROC",
                              preProc = c("center", "scale"))

set.seed(849)
test_class_none_model <- train(trainX, trainY,
                               method = "lda",
                               trControl = cctrl3,
                               tuneGrid = test_class_cv_model$bestTune,
                               metric = "ROC",
                               preProc = c("center", "scale"))

test_class_none_pred <- predict(test_class_none_model, testing[, -ncol(testing)])
test_class_none_prob <- predict(test_class_none_model, testing[, -ncol(testing)], type = "prob")

set.seed(849)
test_class_rec <- train(x = rec_cls,
                        data = training,
                        method = "lda",
                        trControl = cctrl1,
                        metric = "ROC")

if (!isTRUE(all.equal(test_class_cv_model$results,
                      test_class_rec$results)))
  stop("CV weights not giving the same results")

test_class_imp_rec <- varImp(test_class_rec)
test_class_pred_rec <- predict(test_class_rec, testing[, -ncol(testing)])
test_class_prob_rec <- predict(test_class_rec, testing[, -ncol(testing)],
                               type = "prob")

test_levels <- levels(test_class_cv_model)
if (!all(levels(trainY) %in% test_levels))
  cat("wrong levels")

#########################################################################

test_class_predictors1 <- predictors(test_class_cv_model)

#########################################################################

tests <- grep("test_", ls(), fixed = TRUE, value = TRUE)

sInfo <- sessionInfo()
timestamp_end <- Sys.time()

save(list = c(tests, "sInfo", "timestamp", "timestamp_end"),
     file = file.path(getwd(), paste(model, ".RData", sep = "")))

if (!interactive())
  q("no")
```
I have the same problem with RFE. I do not understand it, since I explicitly pass a function built from twoClassSummary (I tried using twoClassSummary itself as well; that did not work either). What am I doing wrong?
Example:
Session:

```
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.0
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=hu_HU.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=hu_HU.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=hu_HU.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=hu_HU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] plyr_1.8.4      kernlab_0.9-25  caret_6.0-76    ggplot2_2.2.1   lattice_0.20-35

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12       magrittr_1.5       splines_3.4.1      MASS_7.3-47        munsell_0.4.3      colorspace_1.3-2   rlang_0.1.2
 [8] foreach_1.4.3      minqa_1.2.4        stringr_1.2.0      car_2.1-5          tools_3.4.1        nnet_7.3-12        parallel_3.4.1
[15] pbkrtest_0.4-7     grid_3.4.1         gtable_0.2.0       nlme_3.1-131       mgcv_1.8-18        quantreg_5.33      e1071_1.6-8
[22] class_7.3-14       MatrixModels_0.4-1 iterators_1.0.8    lme4_1.1-13        lazyeval_0.2.0     tibble_1.3.3       Matrix_1.2-11
[29] nloptr_1.0.4       reshape2_1.4.2     ModelMetrics_1.1.0 codetools_0.2-15   stringi_1.1.5      compiler_3.4.1     scales_0.4.1
[36] doMC_1.3.4         stats4_3.4.1       SparseM_1.77
```
Error:

```
Warning messages:
1: In rfe.default(x = iris[, -c(5, 6)], y = iris[, 6], sizes = c(1,  :
  Metric 'ROC' is not created by the summary function; 'Accuracy' will be used instead
2: In train.default(x, y, ...) :
  The metric 'Accuracy' was not in the result set. ROC will be used instead.
```
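For what it's worth, the first warning typically means the summary function was attached to trainControl but not to the functions list used by rfeControl; rfe() computes its own resampling summaries from `functions$summary`. A minimal sketch of wiring twoClassSummary into RFE, assuming caret's `caretFuncs` and a two-class subset of iris similar to the one hinted at in the warning (names here are illustrative, not the original code):

```r
library(caret)

# Two-class version of iris, since twoClassSummary needs exactly two classes
ir <- iris[iris$Species != "setosa", ]
ir$Species <- factor(ir$Species)

# The summary function must live in the functions list passed to
# rfeControl, not only in trainControl, for rfe() to report ROC.
funcs <- caretFuncs
funcs$summary <- twoClassSummary

ctrl <- rfeControl(functions = funcs, method = "cv", number = 3)

set.seed(1)
rfe_fit <- rfe(x = ir[, 1:4], y = ir$Species,
               sizes = c(1, 2, 3),
               rfeControl = ctrl,
               metric = "ROC",
               # extra arguments are forwarded to the inner train() call
               method = "lda",
               trControl = trainControl(method = "cv",
                                        classProbs = TRUE,
                                        summaryFunction = twoClassSummary))
```

With `funcs$summary` set, rfe() produces the ROC column itself, and the inner trainControl (with `classProbs = TRUE`) lets the fitted models generate the class probabilities twoClassSummary requires.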