celer.LogisticRegression

class celer.LogisticRegression(C=1.0, penalty='l1', solver='celer-pn', tol=0.0001, fit_intercept=False, max_iter=50, verbose=False, max_epochs=50000, p0=10, warm_start=False)

Sparse Logistic regression scikit-learn estimator based on Celer solver.

The optimization objective for sparse Logistic regression is:

\sum_{i=1}^{n_\text{samples}} \log\left(1 + e^{-y_i x_i^T w}\right) + \frac{1}{C} \|w\|_1
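
As a minimal sketch, the objective above can be evaluated in plain Python. The helper name `sparse_logreg_objective` is hypothetical (it is not part of celer), and the labels are assumed to be encoded as -1/+1, which is what the `y_i x_i^T w` term in the formula implies.

```python
import math


def sparse_logreg_objective(X, y, w, C=1.0):
    """Evaluate sum_i log(1 + exp(-y_i * x_i^T w)) + ||w||_1 / C.

    Hypothetical helper for illustration; labels y are assumed to be -1 or +1.
    """
    datafit = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(x_ij * w_j for x_ij, w_j in zip(x_i, w))
        # log(1 + exp(-margin)), written so that exp() never overflows
        if margin > 0:
            datafit += math.log1p(math.exp(-margin))
        else:
            datafit += -margin + math.log1p(math.exp(margin))
    # L1 penalty, weighted by the inverse of C
    penalty = sum(abs(w_j) for w_j in w) / C
    return datafit + penalty


X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
y = [-1, 1, 1]
value_at_zero = sparse_logreg_objective(X, y, [0.0, 0.0], C=1.0)
```

With `w = 0` every sample contributes `log(2)`, so the objective equals `n_samples * log(2)`. Decreasing `C` gives the L1 term more weight relative to the data-fitting term.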

The solvers use a working set strategy. To solve problems restricted to a subset of features, Celer uses coordinate descent while PN-Celer uses a Prox-Newton strategy (detailed in [1], Sec 5.2).

Parameters:

C : float, default=1.0

Inverse of regularization strength; must be a positive float.

penalty : 'l1'

Other penalties are not supported.

solver : 'celer' | 'celer-pn', default='celer-pn'

Algorithm to use in the optimization problem.

'celer-pn' uses working sets and a prox-Newton solver on each working set.
'celer' uses working sets and coordinate descent on each working set.

tol : float, optional

Stopping criterion for the optimization: the solver runs until the duality gap is smaller than tol * len(y) * log(2) or the maximum number of iterations is reached.
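
The threshold in this stopping rule can be sketched as follows. The helper name `gap_threshold` is hypothetical and only mirrors the expression `tol * len(y) * log(2)` quoted above; it is not a celer function.

```python
import math


def gap_threshold(tol, y):
    """Duality-gap threshold described above: tol * len(y) * log(2).

    Hypothetical helper for illustration, not part of celer.
    """
    return tol * len(y) * math.log(2)


y = [0, 1, 1]
threshold = gap_threshold(1e-4, y)  # default tol with three samples
```

Since `len(y) * log(2)` is the value of the data-fitting term at `w = 0`, `tol` effectively acts as a tolerance relative to the objective of the all-zero solution.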

fit_intercept : bool, optional (default=False)

Whether or not to fit an intercept. Currently True is not supported.

max_iter : int, optional

The maximum number of iterations (subproblem definitions).

verbose : bool or integer

Amount of verbosity.

max_epochs : int

Maximum number of coordinate descent (CD) epochs on each subproblem.

p0 : int

Size of the first working set.

warm_start : bool, optional (default=False)

When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution. Only False is supported so far.

See also

celer_path

References

[1]

M. Massias, S. Vaiter, A. Gramfort, and J. Salmon, “Dual Extrapolation for Sparse Generalized Linear Models”, JMLR 2020, https://arxiv.org/abs/1907.05830

Examples

>>> from celer import LogisticRegression
>>> clf = LogisticRegression(C=1.)
>>> clf.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 1])
LogisticRegression(C=1.0, penalty='l1', tol=0.0001, fit_intercept=False,
max_iter=50, verbose=False, max_epochs=50000, p0=10, warm_start=False)

>>> print(clf.coef_)
[[0.4001237  0.01949392]]

Attributes:

classes_ : ndarray of shape (n_classes,)

A list of class labels known to the classifier.

coef_ : ndarray of shape (1, n_features) or (n_classes, n_features)

Coefficients of the features in the decision function.

coef_ is of shape (1, n_features) when the given problem is binary.

intercept_ : ndarray of shape (1,) or (n_classes,)

Constant term in the decision function. Not handled yet.

n_iter_ : int

Number of subproblems solved by Celer to reach the specified tolerance.
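
To illustrate how `coef_`, `intercept_` and `classes_` fit together in the binary case, here is a minimal pure-Python sketch. It assumes the usual linear-classifier convention (score `x^T w + b`, positive score mapped to the second class) and an intercept of zero, since `fit_intercept=False`. The helper names `decision_scores` and `predict_labels` are hypothetical and are not celer functions; the coefficient values are the ones printed in the Examples section.

```python
coef = [[0.4001237, 0.01949392]]  # coef_ as printed in the Examples section
intercept = [0.0]                 # assumption: zero, since fit_intercept=False
classes = [0, 1]                  # classes_ for the example labels


def decision_scores(X, coef, intercept):
    """Linear score x^T w + b per sample (binary case, one coefficient row)."""
    w, b = coef[0], intercept[0]
    return [sum(x_j * w_j for x_j, w_j in zip(x, w)) + b for x in X]


def predict_labels(X, coef, intercept, classes):
    """Assumed convention: positive score -> classes[1], otherwise classes[0]."""
    scores = decision_scores(X, coef, intercept)
    return [classes[1] if s > 0 else classes[0] for s in scores]


X = [[0, 0], [1, 1], [2, 2]]
scores = decision_scores(X, coef, intercept)
labels = predict_labels(X, coef, intercept, classes)
```

Under these assumptions, the three training samples from the Examples section are assigned the labels `[0, 1, 1]`.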

__init__(C=1.0, penalty='l1', solver='celer-pn', tol=0.0001, fit_intercept=False, max_iter=50, verbose=False, max_epochs=50000, p0=10, warm_start=False)

Methods

`__init__`([C, penalty, solver, tol, ...])
`decision_function`(X)	Predict confidence scores for samples.
`densify`()	Convert coefficient matrix to dense array format.
`fit`(X, y)	Fit the model according to the given training data.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`path`(X, y, Cs, solver[, coef_init])	Compute sparse Logistic Regression path with Celer-PN.
`predict`(X)	Predict class labels for samples in X.
`predict_log_proba`(X)	Predict logarithm of probability estimates.
`predict_proba`(X)	Probability estimates.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_fit_request`(*[, sample_weight])	Request metadata passed to the `fit` method.
`set_params`(**params)	Set the parameters of this estimator.
`set_score_request`(*[, sample_weight])	Request metadata passed to the `score` method.
`sparsify`()	Convert coefficient matrix to sparse format.