r - Issues when using randomForest in caret with ROC as optimization metric
I'm having an issue when constructing random forest models using caret. I have a dataset of 46k rows and 10 columns (one of which is the optimization target). On this dataset, I'm trying to compare different classifiers. I did the following:
    library(caret)

    # Bootstrap resampling with class probabilities, so ROC can be computed
    ctrl = trainControl(method="boot",
                        classProbs=TRUE,
                        summaryFunction=twoClassSummary)

    # GLM model:
    model.glm = train(x=d[,2:10], y=d$conv_bt, method='glm',
                      trControl=ctrl, metric="ROC",
                      family="binomial")

    # Random forest model:
    model.rf = train(x=d[,2:10], y=d$conv_bt, method='rf',
                     trControl=ctrl, metric="ROC")

    # Naive Bayes model:
    model.nb = train(x=d[,2:10], y=d$conv_bt, method='nb',
                     trControl=ctrl, metric="ROC")
Then, model.glm and model.nb both look pretty decent. I can look at the 25 bootstrap replications, and in each case the ROC is around .7. However, something appears to be wrong with model.rf, because the reported ROC scores are around .3. This suggests to me that something is being specified incorrectly, because I could switch the predictions from the RF model from p to 1-p and get an ROC of .7, right?
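To make the p versus 1-p reasoning concrete, here is a minimal sketch in base R. It uses simulated labels and scores, not the real data, and `auc_rank` is a hypothetical helper written for this illustration, not a caret function. It only shows that reversing the scores turns an AUC of a into 1 - a, so a value near .3 would correspond to about .7 once flipped.

```r
# Rank-based (Mann-Whitney) AUC. `auc_rank` is a hypothetical helper,
# not part of caret or randomForest.
auc_rank <- function(score, is_event) {
  r <- rank(score)                 # average ranks, so ties are handled
  n_pos <- sum(is_event)
  n_neg <- sum(!is_event)
  (sum(r[is_event]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)
n <- 2000
is_event <- runif(n) < 0.3         # simulated labels (assumption, not the real data)
# Simulated scores in (0, 1) that are somewhat higher for events
p <- plogis(rnorm(n, mean = ifelse(is_event, 0.5, -0.5)))

auc_p    <- auc_rank(p, is_event)      # AUC of the scores as given
auc_flip <- auc_rank(1 - p, is_event)  # AUC after switching p to 1 - p

cat(sprintf("AUC(p) = %.3f, AUC(1 - p) = %.3f, sum = %.3f\n",
            auc_p, auc_flip, auc_p + auc_flip))
```

The two values always sum to 1 (up to floating-point error), which is why a consistently reported ROC of about .3 looks like a direction or class-labelling problem rather than a genuinely poor model.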
I'm sorry I can't provide the data (it's pretty big to upload, and it's proprietary). The other bizarre thing is that when I simulate data, I no longer have the issue. Any idea what this could be? Please help!