python - grid search cross-validation on SVC probability output in scikit-learn

September 15, 2011

I'd like to run grid search cross-validation on the probability outputs of the SVC classifier. In particular, I'd like to minimize the negative log likelihood. From the documentation it seems that GridSearchCV calls the predict() method of the estimator it is passed, and the predict() method of SVC returns class predictions, not probabilities (predict_proba() returns class probabilities).

1) Do I need to subclass SVC and give it a predict() method that returns probabilities rather than classes to accomplish my log likelihood cross-validation? I guess I then need to write my own score_func or loss_func?

2) Is cross-validating on negative log likelihood dumb? I'm doing it because the dataset is: a) imbalanced 5:1, and b) not at all separable, i.e. even the "worst" observations have a > 50% chance of being in the "good" class. (I will also post this 2nd question on the stats Q&A.)

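Yes, you would, on both counts: subclass SVC so that predict() returns probabilities, and write your own score_func or loss_func.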

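    from sklearn.svm import SVC

    class ProbSVC(SVC):
        # Construct with probability=True, otherwise predict_proba is unavailable.
        def predict(self, X):
            return super(ProbSVC, self).predict_proba(X)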

I'm not sure this will work, though, since the majority class may still dominate the log-likelihood scores, and the final estimator might still produce > .5 for positive samples of the minority class. I'm not certain, so please do post this on stats.
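For what it's worth, here is a minimal sketch of the same idea without the subclass. It assumes a recent scikit-learn release, where GridSearchCV lives in sklearn.model_selection and accepts the built-in "neg_log_loss" scoring string (neither is what the 2011 API discussed above looked like). The synthetic data, the parameter grid, and the fold count are all made up for illustration; they only mimic the 5:1 imbalance and class overlap described in the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the real data (an assumption): roughly 5:1 class
# imbalance, with label noise so the classes are not cleanly separable.
X, y = make_classification(
    n_samples=300,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    weights=[5 / 6, 1 / 6],
    flip_y=0.1,
    random_state=0,
)

search = GridSearchCV(
    # probability=True is required so that predict_proba is available.
    SVC(probability=True, random_state=0),
    # Illustrative grid only; pick ranges that suit your data.
    param_grid={"C": [0.1, 1.0, 10.0], "gamma": ["scale", 0.1]},
    # The scorer calls predict_proba itself and returns the negated log loss,
    # so larger (closer to zero) is better.
    scoring="neg_log_loss",
    # Stratified folds keep the class ratio roughly constant in every split.
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

best_params = search.best_params_
best_neg_log_loss = search.best_score_
proba = search.predict_proba(X)  # class probabilities from the refit best model
```

Because the scorer works on predict_proba directly, predict() keeps returning class labels, which avoids surprising any other code that expects labels from the estimator. The caveat about the majority class dominating the log-likelihood still applies here.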
