Discussion:
[Scikit-learn-general] Fwd: [Scikit-learn-commits] [scikit-learn/scikit-learn] cd8c6b: one more test for SVD
Olivier Grisel
2010-11-29 04:09:41 UTC
Permalink
Hi again,

I finally found the time to finish watching Gunnar Martinsson's NIPS
tutorial on videolectures.net. The fast_svd method is indeed able to
recover fairly accurate approximations of the singular vectors even if
the data is not exactly low rank (as is often the case in practice), as
long as we perform a couple of power iteration steps (q = 3 sounds like
a good default parameter).

Furthermore, in scikit-learn we don't really care whether the singular
values / vectors are exact to the 7th decimal. We mostly use PCA / SVD
as a feature extractor / normalizer. Hence I think we could make the
PCA class use the approximate randomized method by default.
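
For reference, here is a minimal standalone sketch of the randomized
SVD with power iterations, in the spirit of fast_svd but with the
function name and defaults made up for illustration:

import numpy as np

def randomized_svd(A, k, n_oversamples=10, q=3, rng=None):
    # Approximate rank-k SVD of A with a Gaussian range finder and q
    # power iterations (Halko / Martinsson / Tropp style).
    if rng is None:
        rng = np.random.RandomState()
    m, n = A.shape
    # Slightly oversampled random test matrix.
    Omega = rng.randn(n, k + n_oversamples)
    Y = np.dot(A, Omega)
    for _ in range(q):
        # Power iterations sharpen the spectrum when A is not exactly
        # low rank; re-orthonormalize to keep the iteration stable.
        Q, _ = np.linalg.qr(Y)
        Y = np.dot(A, np.dot(A.T, Q))
    # Orthonormal basis of the approximate range of A.
    Q, _ = np.linalg.qr(Y)
    # Project A onto that basis and take the exact SVD of the small matrix.
    B = np.dot(Q.T, A)
    Uhat, s, Vt = np.linalg.svd(B, full_matrices=False)
    return np.dot(Q, Uhat)[:, :k], s[:k], Vt[:k]

For PCA-style feature extraction one would center the data first and
then use U * s (or np.dot(A, Vt.T)) as the reduced representation,
which is exactly the setting where 7th-decimal accuracy is not needed.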

We also need to investigate the k-means SVD seeding strategy further:
it could get a really great speed boost from this method too when k is
small.

Also, NNMF can be seeded by two SVDs:
http://www.cs.rpi.edu/~boutsc/files/nndsvd.pdf. Hence it might be
possible to make NNMF fast by combining both strategies (even though
it is probably not as interesting as the state-of-the-art online
dictionary learning stuff).
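
As a rough sketch, here is my simplified reading of the NNDSVD seeding
idea from that paper (function name and scaling details are mine, not
the paper's reference code); the np.linalg.svd call is exactly where
the randomized fast_svd would plug in to combine both strategies:

import numpy as np

def nndsvd_init(A, k, eps=1e-12):
    # Seed W, H for NMF from a truncated SVD of A, keeping for each
    # singular pair whichever signed part carries more energy.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    W = np.zeros((A.shape[0], k))
    H = np.zeros((k, A.shape[1]))
    # The leading singular pair can be taken non-negative up to a sign flip.
    W[:, 0] = np.sqrt(s[0]) * np.abs(U[:, 0])
    H[0, :] = np.sqrt(s[0]) * np.abs(Vt[0, :])
    for j in range(1, k):
        u, v = U[:, j], Vt[j, :]
        up, un = np.maximum(u, 0), np.maximum(-u, 0)
        vp, vn = np.maximum(v, 0), np.maximum(-v, 0)
        if (np.linalg.norm(up) * np.linalg.norm(vp)
                >= np.linalg.norm(un) * np.linalg.norm(vn)):
            u_part, v_part = up, vp
        else:
            u_part, v_part = un, vn
        nu, nv = np.linalg.norm(u_part), np.linalg.norm(v_part)
        scale = np.sqrt(s[j] * nu * nv)
        W[:, j] = scale * u_part / (nu + eps)
        H[j, :] = scale * v_part / (nv + eps)
    return W, H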

Vlad, if you decide to start working on sparse PCA, NNMF and friends
be sure to familiarize yourself with this technique first:

http://videolectures.net/nips09_martinsson_mvll/ (this tutorial is a
bit long but excellent)

The manifold module might also benefit from this: if I understand
correctly, some of those algorithms are based on SVDs of non-linear
similarity matrices of the raw data (I am not sure, though).

The spanning columns / rows extraction (a.k.a. skeleton summary
extraction) mentioned in the tutorial is another really interesting
unsupervised algorithm that would be great to have in the scikit.
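
I have nothing concrete in mind yet, but as a hedged sketch, a
randomized column-skeleton extraction could look like the following
(all names here are made up for illustration, not an existing API):

import numpy as np
from scipy.linalg import qr

rng = np.random.RandomState(0)
A = np.dot(rng.randn(500, 20), rng.randn(20, 300))  # roughly rank-20 data
k = 20

# Sketch the row space of A, then pick k "spanning" columns by pivoted QR.
Omega = rng.randn(k + 10, A.shape[0])
Y = np.dot(Omega, A)
_, _, piv = qr(Y, mode='economic', pivoting=True)
cols = piv[:k]                      # indices of the selected columns
C = A[:, cols]                      # column skeleton of A

# Express A in terms of its selected columns (least squares coefficients).
X = np.linalg.lstsq(C, A, rcond=None)[0]
print(np.linalg.norm(A - np.dot(C, X)) / np.linalg.norm(A))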

Good night (whatever your timezone is :)

---------- Forwarded message ----------
From: <***@github.com>
Date: 2010/11/29
Subject: [Scikit-learn-commits] [scikit-learn/scikit-learn] cd8c6b:
one more test for SVD
To: scikit-learn-***@lists.sourceforge.net


Branch: refs/heads/master
Home:   https://github.com/scikit-learn/scikit-learn

Commit: cd8c6b00d390b61aaa7d6fd7a391c128cf132e42
   https://github.com/scikit-learn/scikit-learn/commit/cd8c6b00d390b61aaa7d6fd7a391c128cf132e42
Author: Olivier Grisel <***@ensta.org>
Date:   2010-11-28 (Sun, 28 Nov 2010)

Changed paths:
 M scikits/learn/utils/tests/test_svd.py

Log Message:
-----------
one more test for SVD
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Alexandre Passos
2010-11-29 09:24:32 UTC
Permalink
Post by Olivier Grisel
 http://videolectures.net/nips09_martinsson_mvll/ (this tutorial is a
bit long but excellent)
If the tutorial leaves some gaps in the understanding, the
accompanying survey is long, but really useful. It's called "Finding
structure with randomness"
(http://amath.colorado.edu/faculty/martinss/Pubs/2009_HMT_random_review.pdf)
and has more proofs, more applications, and a better discussion of
what is going on in the more complex power iteration method (as well
as an explanation of how to solve some other problems with similar
methods).
--
 - Alexandre
Vlad Niculae
2010-11-29 10:53:08 UTC
Permalink
Thank you for the input! I was considering using abs(randn()) as
initial values. Will update my implementation accordingly.

I am indeed working on this. At the moment I'm cleaning up the NMF
implementation based on alternating non-negative least squares that
was mentioned.

I hope it will be runnable soon.
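
For the record, here is a toy sketch of the idea: abs(randn())
initialization plus one sweep of alternating non-negative least
squares via scipy.optimize.nnls (only an illustration, not the actual
implementation I'm cleaning up):

import numpy as np
from scipy.optimize import nnls

rng = np.random.RandomState(0)
A = np.abs(rng.randn(30, 20))       # non-negative data matrix
k = 5

# abs(randn()) initialization of both factors.
W = np.abs(rng.randn(30, k))
H = np.abs(rng.randn(k, 20))

# One sweep of alternating non-negative least squares, one column/row
# of the factors at a time.
for j in range(A.shape[1]):
    H[:, j], _ = nnls(W, A[:, j])
for i in range(A.shape[0]):
    W[i, :], _ = nnls(H.T, A[i, :])

print(np.linalg.norm(A - np.dot(W, H)) / np.linalg.norm(A))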

Gael Varoquaux
2010-11-29 11:37:51 UTC
Permalink
Post by Vlad Niculae
Thank you for the input! I was considering using abs(randn()) as
initial values. Will update my implementation accordingly.
Just a remark with regard to random numbers: it is important to be
able to control the entropy of the pseudo random number generation in
the code. The best way to do this in Python is to open the door to
passing in a pseudo random number generator. You can see how Alex
(Passos) implemented this in his fast_svd:
https://github.com/scikit-learn/scikit-learn/blob/master/scikits/learn/utils/extmath.py#L97
(the rng keyword argument). I think this is a very good way of dealing
with this issue.
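
In other words, the pattern is roughly the following (a simplified
illustration with made-up names, not the actual extmath.py code):

import numpy as np

def fit_something(X, n_components, rng=None):
    # Let the caller control the entropy by passing in a RandomState
    # (or an integer seed); fall back to a fresh generator otherwise.
    if rng is None:
        rng = np.random.RandomState()
    elif isinstance(rng, int):
        rng = np.random.RandomState(rng)
    # All randomness below goes through `rng`, so results are
    # reproducible whenever the caller passes a seeded generator.
    projection = rng.normal(size=(X.shape[1], n_components))
    return np.dot(X, projection)

X = np.ones((10, 4))
out1 = fit_something(X, 2, rng=np.random.RandomState(42))
out2 = fit_something(X, 2, rng=np.random.RandomState(42))
assert np.allclose(out1, out2)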

And please pardon me if I am stating the obvious.

Gael
Alexandre Gramfort
2010-11-30 02:31:31 UTC
Permalink
Post by Olivier Grisel
The manifold module might also benefit from this: if I understand
correctly, some of those algorithms are based on SVDs of non-linear
similarity matrices of the raw data (I am not sure, though).
Indeed: with a k-NN based affinity matrix, a Laplacian eigenmap for
example is obtained by computing the first eigenvectors of a sparse
matrix. By the way (the answer might be trivial): can you compute
generalized eigenvectors, i.e. x such that Ax = lambda Bx, without
inverting B of course?
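
To make it concrete, here is a small sketch of that computation with
scipy (the brute-force k-NN graph below is only for illustration; a
real implementation would use a proper neighbors search):

import numpy as np
from scipy.sparse import csr_matrix, csgraph
from scipy.sparse.linalg import eigsh

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
n, k = X.shape[0], 10

# Crude symmetric k-NN affinity matrix (sparse).
d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
idx = np.argsort(d, axis=1)[:, 1:k + 1]            # skip self-distance
rows = np.repeat(np.arange(n), k)
W = csr_matrix((np.ones(n * k), (rows, idx.ravel())), shape=(n, n))
W = W.maximum(W.T)

# Laplacian eigenmap: smallest eigenvectors of the graph Laplacian.
L = csgraph.laplacian(W, normed=True)
vals, vecs = eigsh(L, k=3, which='SM')
order = np.argsort(vals)
embedding = vecs[:, order[1:]]   # drop the eigenvector for eigenvalue 0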

Alex
Alexandre Passos
2010-11-30 08:40:21 UTC
Permalink
On Tue, Nov 30, 2010 at 00:31, Alexandre Gramfort
Post by Alexandre Gramfort
Indeed: with a k-NN based affinity matrix, a Laplacian eigenmap for
example is obtained by computing the first eigenvectors of a sparse
matrix. By the way (the answer might be trivial): can you compute
generalized eigenvectors, i.e. x such that Ax = lambda Bx, without
inverting B of course?
I think these methods only work for the eigenvectors with the largest
eigenvalues, not for the smallest ones or for generalized
eigenvectors, so Lanczos is probably still best for that.
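
For the generalized problem specifically, ARPACK through
scipy.sparse.linalg.eigsh already handles A x = lambda B x without the
caller ever forming B^-1 explicitly; a tiny illustration with made-up
matrices:

import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import eigsh

n = 300
# Symmetric sparse A and a symmetric positive definite B.
A = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)).tocsc()
B = (identity(n) + 0.1 * diags([1.0, 1.0], [-1, 1], shape=(n, n))).tocsc()

# Generalized eigenproblem A x = lambda B x; 'SA' asks for the
# smallest algebraic eigenvalues (B is only factorized internally).
vals, vecs = eigsh(A, k=4, M=B, which='SA')
print(vals)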
--
 - Alexandre
Olivier Grisel
2010-11-30 09:22:33 UTC
Permalink
Post by Alexandre Gramfort
Indeed: with a k-NN based affinity matrix, a Laplacian eigenmap for
example is obtained by computing the first eigenvectors of a sparse
matrix. By the way (the answer might be trivial): can you compute
generalized eigenvectors, i.e. x such that Ax = lambda Bx, without
inverting B of course?
Probably not this morning without a couple more black coffees. Why
would I want to do such a thing in the first place?
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Alexandre Gramfort
2010-11-30 16:13:21 UTC
Permalink
Post by Olivier Grisel
Probably not this morning without a couple more black coffees. Why
would I want to do such a thing in the first place?
Generalized eigenvector problems appear in manifold learning, for example.

Alex
