Big in Twitter is small on accuracy

The last few days have been a good example of what’s wrong with relying on a plain old Bayes classifier. Big in Twitter has had the band “The Field” up as the most popular with 671 mentions, presumably because a lot of tweets containing “the field” have nothing to do with the band. The next, Peter King, has only 51 mentions. There’s nothing technically wrong with the classifier (I’m using the gem); however, the training needs improving. I don’t want to do it. It’s tedious.
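
To make the training problem concrete, here is a minimal sketch of a naive Bayes text classifier in Python. It is not the gem's code or API: the class name, the "band" and "other" labels, and the sample tweets are all made up for illustration, and I'm assuming the trouble is that “the field” is also an everyday phrase. With only band-flavoured examples containing “the field”, a tweet about a football field gets labelled as a band mention. A couple of counter-examples are enough to flip it.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def train(self, label, text):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, invented for this sketch.
nb = NaiveBayes()
nb.train("band", "the field played a great live set")
nb.train("band", "new album by the field")
nb.train("other", "stuck in traffic again")

tweet = "kids ran across the football field"
before = nb.classify(tweet)  # "band": the only "the field" examples are about the band

# The tedious part: adding examples of "the field" that are not about the band.
nb.train("other", "players ran onto the field at the stadium")
nb.train("other", "the football field was muddy")
after = nb.classify(tweet)   # "other"

print(before, after)
```

The classifier itself never changes between the two calls. Only the training data does, which is why the fix is tedious rather than clever.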

Now I have a research problem: exploring machine learning and data mining approaches. It also means going back to Java, since it has Weka and LingPipe. Good old Java. Does anyone have favorite parts of those libraries?
