Monday, 27 July 2009

Some Machine Learning Libraries

I've been doing some experiments using "machine learning" on several projects and I would like to talk a bit about them. For now I am coding everything in Python, but I'll also comment on some Java and C++ libraries.

A simple-to-use neural network library is FANN (Fast Artificial Neural Network). It also has ports for Python and other languages (PHP, Java, Perl, etc.), although the Python version did not work for me for some reason.

For Support Vector Machines I used LIBSVM (A Library for Support Vector Machines). On its website you can even find a number of recommendations for using SVMs. Other libraries supporting SVMs are PyML and MLPy (but for some reason the compilation did not work on my machine, so I used LIBSVM).

A very interesting library implementing a Naive Bayes classifier is Orange. I have not tested it, but it looks good; it also has good documentation and links to various datasets.

If you are interested in Reinforcement Learning, Tiles is a library in Python (also in C++ and Lisp) that allows you to "transform" the inputs of a value function so that it is represented by an array of tiles. In general, tiles represent a high-resolution state space better than simple discrete states.
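
To make the idea of tiles a bit more concrete, here is a minimal sketch of tile coding in plain Python. It is not the API of the Tiles library: the names active_tiles and TiledValueFunction, the 2-D input in [0, 1], and the uniform per-tiling offsets are all assumptions made for illustration.

```python
import math


def active_tiles(x, y, num_tilings=4, tiles_per_dim=8, low=0.0, high=1.0):
    """Return one active tile index per tiling for a 2-D input in [low, high].

    Hypothetical helper for illustration; not part of the Tiles library.
    """
    tile_width = (high - low) / tiles_per_dim
    side = tiles_per_dim + 1  # one extra tile per dimension absorbs the offset
    indices = []
    for t in range(num_tilings):
        # Each tiling is shifted by a fraction of a tile width.
        offset = t * tile_width / num_tilings
        col = int(math.floor((x - low + offset) / tile_width))
        row = int(math.floor((y - low + offset) / tile_width))
        # Tilings occupy disjoint blocks of the flat index range.
        indices.append(t * side * side + row * side + col)
    return indices


class TiledValueFunction:
    """Linear value function: the value is the sum of the active tile weights."""

    def __init__(self, num_tilings=4, tiles_per_dim=8):
        self.num_tilings = num_tilings
        self.tiles_per_dim = tiles_per_dim
        side = tiles_per_dim + 1
        self.weights = [0.0] * (num_tilings * side * side)

    def _tiles(self, x, y):
        return active_tiles(x, y, self.num_tilings, self.tiles_per_dim)

    def value(self, x, y):
        return sum(self.weights[i] for i in self._tiles(x, y))

    def update(self, x, y, target, alpha=0.1):
        # Spread the correction evenly over the active tiles.
        step = alpha * (target - self.value(x, y)) / self.num_tilings
        for i in self._tiles(x, y):
            self.weights[i] += step


vf = TiledValueFunction()
for _ in range(100):
    vf.update(0.3, 0.7, target=1.0)
```

Because nearby inputs share some tiles, an update at (0.3, 0.7) also moves the estimate for close points such as (0.31, 0.71), while distant points are left untouched.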

If you want a "decision tree", you can use the one that is included and explained in the book "Collective Intelligence". I think that the algorithm used is based on ID3.
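
For reference, the core step of an ID3-style algorithm is choosing the attribute with the highest information gain. The following is a rough sketch of that step only, not the book's code; the function names and the tiny dataset are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on one attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, attributes, target):
    """ID3 picks the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Made-up toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
chosen = best_attribute(rows, ["outlook", "windy"], "play")
```

A full tree builder would apply best_attribute recursively to each group of rows until the labels in a group are all the same.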

And finally, Mahout. This is an Apache Foundation project. For now it is out of my reach to test it: I have neither the infrastructure nor the need to use it. It is based on Hadoop and MapReduce concepts. Very interesting.
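
The MapReduce idea itself can be tried without a cluster. Below is a toy, single-process sketch of the map, shuffle, and reduce steps applied to word counting. It does not use Hadoop or Mahout, and all function names are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["a b a", "b c"])
```

In a real deployment the map and reduce calls would run on different machines; here they simply run one after another in the same process.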


PS: If you want more resources on machine learning, these are my Delicious bookmarks on the topic.

3 comments:

Davide said...

Hi, what is the problem with the MLPY module?

Arturo Servin said...

I had a problem with the install on my Mac (OS X 10.5): basically, the installer does not find gsl_wavelet.h (it's there), and I could not figure out how to tell the installer where it is (sorry, kinda newbie).

I haven't tried it on my Linux machine, but I will. The library looks very complete.

By the way, do you know how I can tell the installer that gsl_wavelet.h is in /opt/local/include/ instead of wherever it thinks the file is?

Davide said...

Hi, you can try this:

$ export CPPFLAGS="-I/opt/local/include/"
$ export CPATH="/opt/local/include/"
$ sudo python setup.py install