Lars Buitinck
2011-05-02 12:40:31 UTC
Hi,
I have the intention of adding perceptrons to scikit-learn and was
urged by Gael to first discuss the addition on this mailing list
because of plans to add online learning. So, here goes.
I found a very clean, numpy-based library for (averaged) perceptrons
at http://code.google.com/p/python-perceptron/ and thought this would
make a good candidate for scikit-learn after some cleanup. I contacted
the author about this and he said he wasn't maintaining it anymore and
that incorporation into scikit-learn would be the best thing. This
library seems to do 1-vs-all classification when facing >2 classes.
I'm currently rewriting parts of it to conform with the scikit-learn
interface.
Now, as for online learning: this is not my prime interest, but
python-perceptron does already have such an interface: labeled
examples must be fed to it one by one.
I'm currently modifying it to have a fit method that does multiple
iterations over its dataset, as recommended by Freund and Schapire in
*Large margin classification using the perceptron algorithm*. I can of
course stipulate in the interface that training and prediction
may be interleaved. I can also add a method that learns from a single
instance, but I don't know what to call that; I was thinking of
'update', but maybe someone else has a better suggestion. (Matthieu
Blondel mentioned partial_fit?)
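To make the two entry points concrete, here is a minimal sketch of how a
batch fit and a single-instance update could sit side by side. It is not
the python-perceptron code and not a settled scikit-learn API: the class
name AveragedPerceptronSketch, the n_features and n_iter parameters, and
the choice of partial_fit as the method name are all assumptions made
for illustration. It handles the binary case only, with labels -1/+1.

```python
import numpy as np


class AveragedPerceptronSketch:
    """Hypothetical binary averaged perceptron; labels must be -1 or +1."""

    def __init__(self, n_features, n_iter=5):
        self.n_iter = n_iter                # number of passes made by fit()
        self.w = np.zeros(n_features)       # current weight vector
        self.b = 0.0                        # current bias
        self.w_sum = np.zeros(n_features)   # running sum of weights, for averaging
        self.b_sum = 0.0                    # running sum of biases
        self.n_updates = 0                  # number of instances seen so far

    def partial_fit(self, x, y):
        """Learn from a single labeled instance (the 'update' step)."""
        x = np.asarray(x, dtype=float)
        if y * (self.w @ x + self.b) <= 0:  # mistake: move towards the example
            self.w += y * x
            self.b += y
        # Accumulate after every instance so predict() can use the average.
        self.w_sum += self.w
        self.b_sum += self.b
        self.n_updates += 1
        return self

    def fit(self, X, y):
        """Batch training: n_iter passes over the dataset.

        Assumption: this continues from the current state rather than
        resetting the weights.
        """
        for _ in range(self.n_iter):
            for xi, yi in zip(X, y):
                self.partial_fit(xi, yi)
        return self

    def predict(self, X):
        """Predict -1/+1 for each row of X using the averaged weights."""
        X = np.asarray(X, dtype=float)
        n = max(self.n_updates, 1)          # avoid dividing by zero before training
        scores = X @ (self.w_sum / n) + self.b_sum / n
        return np.where(scores > 0, 1, -1)
```

With this layout, fit is just a loop over partial_fit, so a caller who
wants online behaviour can alternate partial_fit and predict calls
without any extra machinery.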
Regards,
--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam