Monday 27 July 2009

Some Machine Learning Libraries

I've been doing some experiments using "machine learning" on several projects and I would like to talk a bit about them. For now all I am coding in Python, but also I'll comment on some Java and C++ libraries.

A simple to use is FANN (Fast Artificial Neural Network). It also has ports for Python and other languages (PHP, Java, Perl, etc.. Although the Python version of Python did not work for me for some reason).

For Support Vector Machines I used LIBSVM (A Library for Support Vector Machines). In the website you can even find a number of recommendations for using SVMs. Other libraries supporting SVM are PyML and MLPy (but for some reason the compilation did not work on my machine, so I used LIBSVM).

A very interesting library implementing a Naive Bayes Classifier is Orange. I have not tested but it looks good, plus, it has good documentation and links to various datasets.

If you are interested in Reinforcement Learning, Tiles is a library in Python (also in C + + and Lisp) that allows you to "transform" the inputs to a value function represented by an array of tiles. In general, to represent a state in high resolution tiles are better than just simple states.

If you want a "decision tree" you can use this that is included and explained in the book "Collective Intelligence". I think that the algorithm used is based on ID3.

And finally, mahout. This is an Apache Foundation project. For now is out of my reach to test it. I do not have the infrastructure or the need to use it. It is based on Hadoop and mapreduce concepts. Very interesting.


PS: If you want more resources about machine learning, these are my delicious bookmarks on the topic.