Hi Bala.
I'm not sure how this usually goes but here is my current wish list.
We'd have to discuss whether any of that actually fits into the scikits,
though ;)
- Multilayer Perceptron and Multinomial Logistic regression
I have been working on that, so maybe there is not enough
left to do there for a GSoC. Not sure, though.
- Graph Cut Energy minimization
This is an inference technique so I'm not totally sure
if this should go into scikit-learn. Could also be a
candidate for scikit-image.
The main work would be to implement an efficient
max flow algorithm and then do graph constructions
for alpha expansion and alpha-beta swaps.
- Averaged gradient descent
I think this is on everybody's wish list. Not
sure how much work this will be.
I'm sure lots of people will have something to say about that ;)
See issue #543: https://github.com/scikit-learn/scikit-learn/issues/543
- Structured SVM / CRF learning
This is a big one. Not sure what other people think of it.
I think having a structured SVM would be great.
At the moment, the most commonly used implementation
is Joachims' SVMstruct.
This has licensing issues, but talking to him might help.
Another option is implementing optimization via SGD
or, if you want to go crazy, cutting plane techniques
or bundle methods yourself.
Designing the interface is also non-trivial.
One would have to think about whether / how it
is possible to use structured SVMs just from Python,
without writing Cython functions.
- Low rank kernel approximations (Nystrom methods)
This is mainly interesting for SVMs.
The idea is to approximate the kernel matrix with
a low rank factorization and use this to construct
a linear SVM problem.
This is related to the current kernel approximation
module but takes a somewhat different approach.
This method makes large scale SVMs fast / possible.
- Kernel Perceptron
There is a (I think) pure Python implementation
by Mathieu that could be Cythonized.
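To make the low rank kernel approximation item above a bit more concrete, here is a minimal NumPy-only sketch of the Nystrom idea: pick a few landmark points, factor the small kernel matrix between them, and map every sample to explicit features whose inner products approximate the full kernel matrix. A linear SVM could then be trained on those features. The function names (rbf_kernel, nystroem_features), the RBF kernel choice, and the random landmark selection are assumptions made for illustration, not an existing scikit-learn interface.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystroem_features(X, landmarks, gamma=1.0, eps=1e-10):
    """Map X to features Z such that Z @ Z.T approximates K(X, X).

    Hypothetical helper: uses K ~= K_nm K_mm^+ K_mn, where K_mm is the
    kernel matrix between the m landmark points.
    """
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    # Eigendecomposition of the small m x m matrix: K_mm = U diag(s) U^T.
    s, U = np.linalg.eigh(K_mm)
    keep = s > eps  # drop directions with (numerically) zero eigenvalue
    # Z = K_nm U diag(s)^(-1/2), so Z Z^T = K_nm K_mm^+ K_mn.
    return K_nm @ (U[:, keep] / np.sqrt(s[keep]))


rng = np.random.RandomState(0)
X = rng.randn(200, 5)
# Assumption: landmarks are a uniform random subset of the training data.
landmarks = X[rng.choice(200, 50, replace=False)]

Z = nystroem_features(X, landmarks, gamma=0.1)
K_exact = rbf_kernel(X, X, gamma=0.1)
K_approx = Z @ Z.T
rel_err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print("feature shape:", Z.shape, "relative error:", rel_err)
```

The point of the explicit feature map is that only an n x m matrix is ever stored instead of the full n x n kernel matrix, which is what would make large scale problems tractable. How to choose the landmarks, and how many, is left open here.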
That's it for the moment, I think.
I'd be happy to mentor any of the above projects
if the others agree that they are sensible.
Maybe we should update the wiki for the next GSoC?
Cheers,
Andy
Post by Bala Subrahmanyam Varanasi
Dear all,
I would like to participate in Google Summer of Code this year. Please
let me know the ideas which you would like to implement in
scikit-learn in GSoC - 2012.
Also, I'm attending Stanford's online courses, the ML class and the NLP
class. I believe this is the right time to discuss this, because I can
learn new things before the start of GSoC and can work on challenging
implementations in scikit-learn.
Thank you.
Bala Subrahmanyam Varanasi
IV B.Tech, Information Technology
Vishnu Institute of Technology
contact number: +919985415959