I do confirm that Lasso and LassoLars both minimize

    (1 / (2 * n_samples)) * ||y - X w||^2_2 + alpha * ||w||_1

and that the n_samples factor should not be present in the sparse coding
context.
is not correct. I don't know if this also affects the SGD (etc.)
docstrings.
Regarding the shapes used by sparse_encode, I'll let Vlad comment.
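As an aside, the n_samples normalization above has a simple observable consequence: since the data-fit term is averaged over samples, duplicating every row of (X, y) should leave the Lasso solution unchanged for the same alpha. A minimal sketch of that check (my own, not from the thread):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Sketch: if Lasso minimizes
#   (1 / (2 * n_samples)) * ||y - X w||^2 + alpha * ||w||_1,
# then duplicating the dataset rescales both n_samples and the squared
# error by 2, so the minimizer should be identical.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 20))
y = rng.normal(size=50)

lasso1 = Lasso(alpha=0.1, fit_intercept=False, tol=1e-12,
               max_iter=10**6).fit(X, y)

X2 = np.vstack([X, X])          # every sample twice
y2 = np.concatenate([y, y])
lasso2 = Lasso(alpha=0.1, fit_intercept=False, tol=1e-12,
               max_iter=10**6).fit(X2, y2)

print(np.abs(lasso1.coef_ - lasso2.coef_).max())  # essentially zero
```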
Post by David Warde-Farley
This actually gets at something I've been meaning to fiddle with and
report but haven't had time: I'm not sure I completely trust the
coordinate descent implementation in scikit-learn, because it seems to
give me bogus answers a lot (i.e., the optimality conditions necessary
for it to be an actual solution are not even approximately satisfied).
Are you guys using something weird for the termination condition?
Can you give us a sample X and y that shows the problem?
It should ultimately use the duality gap to stop the iterations, but
there might be a corner case …
In [34]: rng = np.random.RandomState(0)

In [35]: dictionary = rng.normal(size=(100, 500)) / 1000; dictionary /= np.sqrt((dictionary ** 2).sum(axis=0))

In [36]: signal = rng.normal(size=100) / 1000

In [37]: from sklearn.linear_model import Lasso

In [38]: lasso = Lasso(alpha=0.0001, max_iter=1e6, fit_intercept=False, tol=1e-8)

In [39]: lasso.fit(dictionary, signal)
Out[39]: Lasso(alpha=0.0001, copy_X=True, fit_intercept=False, max_iter=1000000.0,
       normalize=False, precompute='auto', tol=1e-08)

In [40]: max(abs(lasso.coef_))
Out[40]: 0.0
In [41]: from pylearn2.optimization.feature_sign import feature_sign_search
In [42]: coef = feature_sign_search(dictionary, signal, 0.0001)
In [43]: max(abs(coef))
Out[43]: 0.0027295761244725018
And I'm pretty sure the latter result is the right one, since the
(sub)gradient of the unnormalized objective is approximately zero there:

In [44]: def grad(coefs):
   ....:     gram = np.dot(dictionary.T, dictionary)
   ....:     corr = np.dot(dictionary.T, signal)
   ....:     return -2 * corr + 2 * np.dot(gram, coefs) + 0.0001 * np.sign(coefs)
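For reference, a fuller optimality check along these lines would treat zero and nonzero coefficients separately, since the l1 term only has a subgradient at zero. This is my own sketch, not from the thread, and `kkt_violation` is a name I made up:

```python
import numpy as np
from sklearn.linear_model import Lasso

# KKT conditions for f(w) = ||y - X w||^2 + alpha * ||w||_1.
# At a minimizer w*, with g = 2 * (X^T X w* - X^T y):
#   g_j = -alpha * sign(w*_j)  where w*_j != 0
#   |g_j| <= alpha             where w*_j == 0
rng = np.random.RandomState(0)
X = rng.normal(size=(30, 10))
y = rng.normal(size=30)
alpha = 0.5

def kkt_violation(w):
    g = 2 * (X.T.dot(X).dot(w) - X.T.dot(y))  # gradient of the smooth part
    nz = w != 0
    v_nz = np.abs(g[nz] + alpha * np.sign(w[nz])).max() if nz.any() else 0.0
    v_z = np.maximum(np.abs(g[~nz]) - alpha, 0.0).max() if (~nz).any() else 0.0
    return max(v_nz, v_z)

# Per the thread, sklearn's alpha must be divided by 2 * n_samples to
# match this unnormalized objective.
lasso = Lasso(alpha=alpha / (2 * X.shape[0]), fit_intercept=False,
              tol=1e-12, max_iter=10**6).fit(X, y)
print(kkt_violation(lasso.coef_))  # should be close to zero
```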
Actually, alpha in scikit-learn is multiplied by n_samples. I agree
this is misleading and not documented in the docstring.
lasso = Lasso(alpha=0.0001 / dictionary.shape[0], max_iter=1e6,
              fit_intercept=False, tol=1e-8).fit(dictionary, signal)
max(abs(lasso.coef_))
0.0027627270397484554
0.00019687294269977963
Seems like there's an added factor of 2 in there as well:
In [94]: lasso = Lasso(alpha=0.0001 / (2 * dictionary.shape[0]), max_iter=1e8, fit_intercept=False, tol=1e-8).fit(dictionary, signal)
In [95]: coef = feature_sign_search(dictionary, signal, 0.0001)
In [96]: allclose(lasso.coef_, coef, atol=1e-7)
Out[96]: True
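The alpha = lam / (2 * n_samples) correspondence can also be checked without pylearn2: with an orthonormal design, the minimizer of ||y - X w||^2 + lam * ||w||_1 has a closed form (soft-thresholding of X^T y at lam / 2). A sketch of mine under that assumption:

```python
import numpy as np
from sklearn.linear_model import Lasso

# With X^T X = I, minimizing ||y - X w||^2 + lam * ||w||_1 decouples per
# coordinate, giving w*_j = sign(c_j) * max(|c_j| - lam / 2, 0) with
# c = X^T y. If sklearn's Lasso(alpha) minimizes
# (1 / (2 n)) ||y - X w||^2 + alpha ||w||_1, then alpha = lam / (2 n)
# should reproduce that closed form.
rng = np.random.RandomState(0)
n = 64
X, _ = np.linalg.qr(rng.normal(size=(n, n)))  # orthonormal columns
y = rng.normal(size=n)
lam = 0.3

c = np.dot(X.T, y)
w_closed = np.sign(c) * np.maximum(np.abs(c) - lam / 2, 0.0)

lasso = Lasso(alpha=lam / (2 * n), fit_intercept=False, tol=1e-12,
              max_iter=10**6).fit(X, y)
print(np.abs(lasso.coef_ - w_closed).max())  # agreement to high precision
```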
I think you're right that the precise cost function definitely ought to be
documented in the front-facing classes rather than just the low-level Cython
routines.
I also think that scaling the way Lasso/ElasticNet does in the context of
sparse coding may be very confusing, since in sparse coding it corresponds
not to a number of training samples in a regression problem but to the number
of input dimensions.
The docstring of sparse_encode is quite confusing in that the shape of X,
the dictionary, is given as "(n_samples, n_components)". The number of
samples (in the context of sparse coding) should have no influence over
the shape of the dictionary; this seems to have leaked over from the
Lasso documentation.
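For concreteness, here is how I understand the intended shapes (a sketch of mine; the exact docstring wording under discussion may have differed): the dictionary has one atom per row, shape (n_components, n_features), and the number of signals being encoded never affects it.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)
n_samples, n_features, n_components = 5, 8, 12

# Dictionary: one unit-norm atom per row, independent of n_samples.
D = rng.normal(size=(n_components, n_features))
D /= np.sqrt((D ** 2).sum(axis=1))[:, np.newaxis]

# Signals to encode: one per row.
X = rng.normal(size=(n_samples, n_features))

code = sparse_encode(X, D, algorithm='lasso_lars', alpha=0.01)
print(code.shape)  # one row of coefficients per signal
```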
The shape and mathematical definition of cov doesn't make much sense to me
given this change, though (or to begin with, for that matter): In the case of
a single problem, the desired covariance is X^T y, with y a column vector,
yielding another column vector of (n_components, 1). So the shape, if you
have multiple examples you're precomputing for, should end up being
(n_components, n_samples), and given the shape of Y that would be achieved by
X^T Y^T.
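The shape argument above can be checked mechanically. A small sketch of mine using the same conventions (X the dictionary with atoms as columns, Y the signals stacked as rows):

```python
import numpy as np

rng = np.random.RandomState(0)
n_dims, n_components, n_samples = 100, 500, 7
X = rng.normal(size=(n_dims, n_components))   # dictionary, atoms as columns
Y = rng.normal(size=(n_samples, n_dims))      # one signal per row

# Single problem: X^T y gives one coefficient correlation per atom
# (a 1-D array here, the email's column vector of length n_components).
y = Y[0]
print(np.dot(X.T, y).shape)

# Multiple problems: X^T Y^T stacks those columns side by side,
# yielding (n_components, n_samples).
cov = np.dot(X.T, Y.T)
print(cov.shape)
```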
David
_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general