celer.LogisticRegression

class celer.LogisticRegression(C=1.0, penalty='l1', solver='celer-pn', tol=0.0001, fit_intercept=False, max_iter=50, verbose=False, max_epochs=50000, p0=10, warm_start=False)

Sparse logistic regression scikit-learn estimator based on the Celer solver.

The optimization objective for sparse Logistic regression is:

\sum_{i=1}^{n_{\text{samples}}} \log\left(1 + e^{-y_i x_i^\top w}\right) + \frac{1}{C} \|w\|_1
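
As a rough illustration, the objective can be evaluated in a few lines of plain Python. The function name `sparse_logreg_objective` is hypothetical (it is not part of celer), and the labels are assumed to be encoded as -1/+1, which is what the formula suggests.

```python
import math


def sparse_logreg_objective(X, y, w, C=1.0):
    """Evaluate sum_i log(1 + exp(-y_i * x_i^T w)) + ||w||_1 / C.

    Assumption: labels y_i are encoded as -1 or +1.
    """
    datafit = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(x_ij * w_j for x_ij, w_j in zip(x_i, w))
        # log(1 + exp(-margin)), written to avoid overflow for large |margin|
        datafit += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    penalty = sum(abs(w_j) for w_j in w) / C
    return datafit + penalty


# Toy data; labels are encoded as -1/+1 for the formula
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
y = [-1, 1, 1]
value_at_zero = sparse_logreg_objective(X, y, [0.0, 0.0], C=1.0)
value_large_C = sparse_logreg_objective(X, y, [0.4, 0.0], C=100.0)
value_small_C = sparse_logreg_objective(X, y, [0.4, 0.0], C=0.01)
```

At `w = 0` every sample contributes `log(2)`, so the objective equals `n_samples * log(2)`. For a fixed nonzero `w`, a smaller `C` gives a larger penalty term, which is why `C` is described as the inverse of the regularization strength.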

The solvers use a working set strategy. To solve problems restricted to a subset of features, Celer uses coordinate descent while PN-Celer uses a Prox-Newton strategy (detailed in [1], Sec 5.2).

Parameters:
C : float, default=1.0

Inverse of regularization strength; must be a positive float.

penalty : 'l1'

Other penalties are not supported.

solver : "celer" | "celer-pn", default="celer-pn"

Algorithm to use in the optimization problem.

  • celer-pn uses working sets and a prox-Newton solver on the working set.

  • celer uses working sets and coordinate descent on the working set.
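
Both coordinate descent and prox-Newton methods for an ℓ1-penalized objective typically rely on the soft-thresholding operator, which is the proximal operator of the absolute value. The sketch below is a generic illustration of that operator, not Celer's implementation; the name `soft_threshold` is hypothetical.

```python
def soft_threshold(z, threshold):
    """Proximal operator of threshold * |.|: shrink z toward 0 by threshold."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    # Values within [-threshold, threshold] are set exactly to zero,
    # which is what produces sparse coefficient vectors.
    return 0.0


shrunk = [soft_threshold(z, 1.0) for z in (3.0, -3.0, 0.5)]
```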

tol : float, optional

Stopping criterion for the optimization: the solver runs until the duality gap is smaller than tol * len(y) * log(2) or the maximum number of iterations is reached.
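
To make the scaling concrete, here is a small sketch of the threshold the duality gap is compared against. The helper name `duality_gap_threshold` is hypothetical and only restates the formula above.

```python
import math


def duality_gap_threshold(tol, n_samples):
    """Absolute duality-gap threshold: tol * n_samples * log(2)."""
    return tol * n_samples * math.log(2)


# Default tol on a toy problem with 3 samples
threshold = duality_gap_threshold(tol=1e-4, n_samples=3)
```

Since `len(y) * log(2)` is the value of the objective at `w = 0`, `tol` can be read as a tolerance relative to that starting value rather than as an absolute gap.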

fit_intercept : bool, optional (default=False)

Whether or not to fit an intercept. Currently True is not supported.

max_iter : int, optional

The maximum number of iterations (subproblem definitions).

verbose : bool or integer

Amount of verbosity.

max_epochs : int

Maximum number of coordinate descent (CD) epochs on each subproblem.

p0 : int

First working set size.

warm_start : bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution. Only False is supported so far.

See also

celer_path

References

[1]

M. Massias, S. Vaiter, A. Gramfort, J. Salmon “Dual Extrapolation for Sparse Generalized Linear Models”, JMLR 2020, https://arxiv.org/abs/1907.05830

Examples

>>> from celer import LogisticRegression
>>> clf = LogisticRegression(C=1.)
>>> clf.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 1])
LogisticRegression(C=1.0, penalty='l1', tol=0.0001, fit_intercept=False,
max_iter=50, verbose=False, max_epochs=50000, p0=10, warm_start=False)
>>> print(clf.coef_)
[[0.4001237  0.01949392]]
Attributes:
classes_ : ndarray of shape (n_classes,)

A list of class labels known to the classifier.

coef_ : ndarray of shape (1, n_features) or (n_classes, n_features)

Coefficients of the features in the decision function.

coef_ is of shape (1, n_features) when the given problem is binary.

intercept_ : ndarray of shape (1,) or (n_classes,)

Constant term in the decision function. Not handled yet.

n_iter_ : int

Number of subproblems solved by Celer to reach the specified tolerance.

__init__(C=1.0, penalty='l1', solver='celer-pn', tol=0.0001, fit_intercept=False, max_iter=50, verbose=False, max_epochs=50000, p0=10, warm_start=False)

Methods

__init__([C, penalty, solver, tol, ...])

decision_function(X)

Predict confidence scores for samples.

densify()

Convert coefficient matrix to dense array format.

fit(X, y)

Fit the model according to the given training data.

get_params([deep])

Get parameters for this estimator.

path(X, y, Cs, solver[, coef_init])

Compute sparse Logistic Regression path with Celer-PN.

predict(X)

Predict class labels for samples in X.

predict_log_proba(X)

Predict logarithm of probability estimates.

predict_proba(X)

Probability estimates.

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.

set_params(**params)

Set the parameters of this estimator.

sparsify()

Convert coefficient matrix to sparse format.
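
For intuition about how the prediction methods listed above relate to coef_, the following dependency-free sketch mimics a binary linear classifier with no intercept. It assumes the standard binary logistic model, where the probability of the positive class is the sigmoid of the decision score; the helper functions are hypothetical stand-ins, not the estimator's actual code. The coefficients are those printed in the Examples section.

```python
import math


def decision_function(X, coef, intercept=0.0):
    """Confidence scores: x^T w + intercept for each sample."""
    return [sum(x_j * w_j for x_j, w_j in zip(x, coef)) + intercept for x in X]


def predict(X, coef, classes=(0, 1)):
    """Pick the second class when the score is strictly positive."""
    return [classes[1] if s > 0 else classes[0] for s in decision_function(X, coef)]


def predict_proba(X, coef):
    """[P(class 0), P(class 1)] per sample, using a stable sigmoid."""
    rows = []
    for s in decision_function(X, coef):
        if s >= 0:
            p1 = 1.0 / (1.0 + math.exp(-s))
        else:
            p1 = math.exp(s) / (1.0 + math.exp(s))
        rows.append([1.0 - p1, p1])
    return rows


X = [[0, 0], [1, 1], [2, 2]]
coef = [0.4001237, 0.01949392]  # clf.coef_[0] from the Examples section
scores = decision_function(X, coef)
labels = predict(X, coef)
probas = predict_proba(X, coef)
```

Because fit_intercept=False is the only supported setting, a sample at the origin always gets a score of zero and a positive-class probability of 0.5 under this model.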