java - Multi-Label Document Classification


I have a database in which I store data based on the following three fields: id, text, {labels}. Note that each text has been assigned more than one label / tag / class. I want to build a model (Weka / RapidMiner / Mahout) that is able to recommend / classify a bunch of labels / tags / classes for a given text.

I have heard about SVM and the Naive Bayes classifier, but I am not sure whether they support multi-label classification or not. Anything that guides me in the right direction is more than welcome!

The basic multilabel classification method is one-vs.-the-rest (OvR), also called binary relevance (BR). The basic idea is that you take an off-the-shelf binary classifier, such as Naive Bayes or an SVM, and create K instances of it to solve K independent classification problems, one per label. In Python-like pseudocode:

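    learners = {}
    for each class k:
        learner = SVM(settings)  # for example
        # a sample is positive for k if k is among its (possibly several) labels
        labels = [k in labels_of(x) for x in samples]
        learner.learn(samples, labels)
        learners[k] = learner  # keep one trained classifier per label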

Then, at prediction time, you run each of the binary classifiers on a sample and collect the labels for which they predict positive.
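To make the train-then-collect loop concrete, here is a minimal runnable sketch in plain Python. Everything named in it is an assumption for illustration: the `Perceptron` class is a toy stand-in for whatever binary learner you would actually use (Naive Bayes, an SVM, ...), the whitespace tokenizer is deliberately naive, and the four example texts and their labels are made up.

```python
from collections import defaultdict


class Perceptron:
    """Toy binary classifier over bag-of-words tokens (stand-in for SVM/NB)."""

    def __init__(self, epochs=10):
        self.epochs = epochs
        self.weights = defaultdict(float)
        self.bias = 0.0

    def _score(self, tokens):
        return self.bias + sum(self.weights[t] for t in tokens)

    def learn(self, samples, labels):
        # Classic perceptron rule: adjust weights only on misclassified samples.
        for _ in range(self.epochs):
            for tokens, is_positive in zip(samples, labels):
                if (self._score(tokens) > 0) != is_positive:
                    step = 1.0 if is_positive else -1.0
                    for t in tokens:
                        self.weights[t] += step
                    self.bias += step

    def predict(self, tokens):
        return self._score(tokens) > 0


def train_binary_relevance(texts, label_sets):
    """Train one independent binary learner per label."""
    samples = [text.lower().split() for text in texts]
    classes = sorted(set().union(*label_sets))
    learners = {}
    for k in classes:
        learner = Perceptron()
        learner.learn(samples, [k in labels for labels in label_sets])
        learners[k] = learner
    return learners


def predict_labels(learners, text):
    """Run every binary learner and collect the labels that fire."""
    tokens = text.lower().split()
    return {k for k, learner in learners.items() if learner.predict(tokens)}


# Made-up training data: each text carries more than one label.
texts = [
    "java compiler error",
    "python list sorting",
    "java list sorting",
    "python compiler error",
]
label_sets = [
    {"java", "errors"},
    {"python", "collections"},
    {"java", "collections"},
    {"python", "errors"},
]

learners = train_binary_relevance(texts, label_sets)
predicted = predict_labels(learners, "java list sorting")
print(sorted(predicted))  # ['collections', 'java']
```

Note that the prediction is a set, so a text can come back with zero, one, or several labels. With a real learner you would typically also want a per-label decision threshold rather than the fixed `> 0` used here.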

(Both training and prediction can be done in parallel, since the problems are assumed to be independent. See Wikipedia for links to two Java packages that do multi-label classification.)
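Because the K problems share no state, the per-label training calls can simply be handed to a worker pool. The sketch below uses Python's standard `concurrent.futures`. The `train_one` function is a hypothetical, deliberately trivial learner (it keeps the tokens seen only in positive samples), and the data is made up. Threads keep the example short, but in standard CPython builds a CPU-bound learner would more likely need `ProcessPoolExecutor` to see a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def train_one(k, samples, label_sets):
    """Hypothetical stand-in learner: tokens seen only in samples labelled k."""
    positive, negative = set(), set()
    for tokens, labels in zip(samples, label_sets):
        (positive if k in labels else negative).update(tokens)
    return k, positive - negative


def train_all_parallel(samples, label_sets, max_workers=4):
    """Submit one independent training job per label and gather the models."""
    classes = sorted(set().union(*label_sets))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(train_one, k, samples, label_sets) for k in classes]
        return dict(f.result() for f in futures)


def predict_labels(models, tokens):
    """A label fires if the sample contains any of its keywords."""
    return {k for k, keywords in models.items() if keywords & set(tokens)}


# Made-up, pre-tokenized training data with several labels per sample.
samples = [
    ["java", "compiler", "error"],
    ["python", "list", "sorting"],
    ["java", "list", "sorting"],
    ["python", "compiler", "error"],
]
label_sets = [
    {"java", "errors"},
    {"python", "collections"},
    {"java", "collections"},
    {"python", "errors"},
]

models = train_all_parallel(samples, label_sets)
print(sorted(predict_labels(models, ["java", "list"])))  # ['collections', 'java']
```

The same pattern applies at prediction time: each label's classifier can score the sample independently, and the results are merged into one label set.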

