celer.GroupLasso

class celer.GroupLasso(groups=1, alpha=1.0, max_iter=100, max_epochs=50000, p0=10, verbose=0, tol=0.0001, prune=True, fit_intercept=True, weights=None, warm_start=False)

Group Lasso scikit-learn estimator based on the Celer solver.

The optimization objective for the Group Lasso is:

(1 / (2 * n_samples)) * ||y - X w||^2_2 + alpha * \sum_g weights_g ||w_g||_2

where w_g are the regression coefficients of group number g.
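As a rough illustration of the objective above, the following NumPy sketch evaluates it for a given coefficient vector. The function name `group_lasso_objective` is hypothetical and not part of celer; groups are assumed to be given as lists of feature indices, and no intercept is included, matching the formula as written.

```python
import numpy as np


def group_lasso_objective(X, y, w, groups, alpha, weights=None):
    """Evaluate the Group Lasso objective (hypothetical helper, not celer API)."""
    X, y, w = (np.asarray(a, dtype=float) for a in (X, y, w))
    if weights is None:
        # Default documented above: all group weights equal to 1.
        weights = np.ones(len(groups))
    # Data-fit term: (1 / (2 * n_samples)) * ||y - X w||^2_2
    data_fit = np.sum((y - X @ w) ** 2) / (2 * len(y))
    # Penalty term: sum over groups of weights_g * ||w_g||_2
    penalty = sum(
        wt * np.linalg.norm(w[list(g)]) for wt, g in zip(weights, groups)
    )
    return data_fit + alpha * penalty


X = [[0, 0, 1], [1, -1, 2], [2, 0, -1]]
y = [1, 1, -1]
value = group_lasso_objective(X, y, [0, 0, 0.5], [[0, 1], [2]], alpha=0.5)
weighted_value = group_lasso_objective(
    X, y, [0, 0, 0.5], [[0, 1], [2]], alpha=0.5, weights=[1.0, 2.0]
)
```

Doubling the weight of the second group only changes the penalty term, so `weighted_value` is larger than `value` whenever that group has a nonzero coefficient.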

Parameters:
groups : int | list of ints | list of lists of ints

Partition of features used in the penalty on w. If an int is passed, groups are contiguous blocks of features, of size groups. If a list of ints is passed, groups are assumed to be contiguous, group number g being of size groups[g]. If a list of lists of ints is passed, groups[g] contains the feature indices of the group number g.
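The three accepted forms of groups can be pictured as different ways of writing the same partition. The sketch below converts each form into explicit lists of feature indices. `expand_groups` is a hypothetical helper written for illustration, not a celer function, and raising an error when the number of features is not a multiple of an integer group size is an assumption of this sketch.

```python
from numbers import Integral


def expand_groups(groups, n_features):
    """Return explicit feature indices per group (hypothetical helper)."""
    if isinstance(groups, Integral):
        # An int: contiguous blocks of features, each of size `groups`.
        if n_features % groups != 0:
            # Assumption of this sketch: require an exact partition.
            raise ValueError("n_features must be a multiple of the group size")
        return [
            list(range(start, start + groups))
            for start in range(0, n_features, groups)
        ]
    if all(isinstance(g, Integral) for g in groups):
        # A list of ints: contiguous groups, group g having size groups[g].
        expanded, start = [], 0
        for size in groups:
            expanded.append(list(range(start, start + size)))
            start += size
        return expanded
    # A list of lists of ints: groups[g] already holds the feature indices.
    return [list(g) for g in groups]


by_size = expand_groups(2, n_features=6)
by_sizes = expand_groups([2, 1], n_features=3)
by_indices = expand_groups([[0, 1], [2]], n_features=3)
```

Here `by_sizes` and `by_indices` describe the same partition: features 0 and 1 in one group, feature 2 in another.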

alpha : float, optional

Constant that multiplies the penalty term. Defaults to 1.0.

max_iter : int, optional

The maximum number of iterations (subproblem definitions).

max_epochs : int

Maximum number of BCD epochs on each subproblem.

p0 : int

First working set size.

verbose : bool or integer

Amount of verbosity.

tol : float, optional

Stopping criterion for the optimization: the solver runs until the duality gap is smaller than tol * norm(y) ** 2 / len(y) or the maximum number of iterations is reached.
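To make the scaling of the stopping criterion concrete, the following sketch computes the duality-gap threshold from the expression above. The function name `gap_threshold` is hypothetical; only the formula tol * norm(y) ** 2 / len(y) comes from the description.

```python
import numpy as np


def gap_threshold(y, tol=1e-4):
    """Duality-gap threshold implied by tol (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    # tol is scaled by the mean squared value of y, so the criterion
    # does not depend on the units in which y is measured.
    return tol * np.linalg.norm(y) ** 2 / len(y)


threshold = gap_threshold([1, 1, -1], tol=1e-4)
threshold_scaled = gap_threshold([10, 10, -10], tol=1e-4)
```

Multiplying y by 10 multiplies the threshold by 100, which is the same factor by which the data-fit term of the objective grows.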

prune : bool, optional (default=True)

Whether or not to use pruning when growing working sets.

fit_intercept : bool, optional (default=True)

Whether or not to fit an intercept.

weights : array, shape (n_groups,), optional (default=None)

Strictly positive weights used in the L2 penalty part of the GroupLasso objective. If None, weights equal to 1 are used.

warm_start : bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution.

References

[1]

M. Massias, A. Gramfort, J. Salmon “Celer: a Fast Solver for the Lasso with Dual Extrapolation”, ICML 2018, http://proceedings.mlr.press/v80/massias18a.html

[2]

M. Massias, S. Vaiter, A. Gramfort, J. Salmon “Dual extrapolation for sparse Generalized Linear Models”, JMLR 2020, https://arxiv.org/abs/1907.05830

Examples

>>> from celer import GroupLasso
>>> clf = GroupLasso(alpha=0.5, groups=[[0, 1], [2]])
>>> clf.fit([[0, 0, 1], [1, -1, 2], [2, 0, -1]], [1, 1, -1])
GroupLasso(alpha=0.5, fit_intercept=True,
groups=[[0, 1], [2]], max_epochs=50000, max_iter=100,
p0=10, prune=True, tol=0.0001, verbose=0, warm_start=False)
>>> print(clf.coef_)
[-0.         -0.          0.39285714]
>>> print(clf.intercept_)
0.07142857142857145
Attributes:
coef_ : array, shape (n_features,)

Parameter vector (w in the cost function formula).

sparse_coef_ : scipy.sparse matrix, shape (n_features, 1)

Sparse representation of the fitted coef_.

intercept_ : float

Constant term in the decision function.

n_iter_ : int

Number of subproblems solved by Celer to reach the specified tolerance.

__init__(groups=1, alpha=1.0, max_iter=100, max_epochs=50000, p0=10, verbose=0, tol=0.0001, prune=True, fit_intercept=True, weights=None, warm_start=False)

Methods

__init__([groups, alpha, max_iter, ...])

fit(X, y[, sample_weight, check_input])

Fit model with coordinate descent.

get_params([deep])

Get parameters for this estimator.

path(X, y, alphas[, coef_init, return_n_iter])

Compute Group Lasso path with Celer.

predict(X)

Predict using the linear model.

score(X, y[, sample_weight])

Return the coefficient of determination of the prediction.

set_params(**params)

Set the parameters of this estimator.
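As an illustration of what predict and score compute for a linear model, the sketch below reproduces them by hand with NumPy. It reuses the data, coef_ and intercept_ values printed in the Examples section, and assumes the usual scikit-learn definition of the coefficient of determination, R^2 = 1 - SS_res / SS_tot. It does not call celer itself.

```python
import numpy as np

# Data and fitted values taken from the Examples section.
X = np.array([[0, 0, 1], [1, -1, 2], [2, 0, -1]], dtype=float)
y = np.array([1.0, 1.0, -1.0])
coef = np.array([0.0, 0.0, 0.39285714])
intercept = 0.07142857142857145

# predict(X): linear decision function X @ coef_ + intercept_
y_pred = X @ coef + intercept

# score(X, y): coefficient of determination R^2
ss_res = np.sum((y - y_pred) ** 2)    # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
r2 = 1.0 - ss_res / ss_tot

print(y_pred)
print(r2)
```

Because the first group of coefficients is exactly zero, the predictions depend only on the third feature.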
