Get started#

In this starter examples, we will fit a Lasso estimator on a toy dataset.

Beforehand, make sure to install celer:

$ pip install -U celer

Generate a toy dataset#

celer comes with a module, Datasets fetchers, that expose several functions to fetch/generate datasets. We are going to use make_correlated_data to generate a toy dataset.

# imports
from celer.datasets import make_correlated_data
from sklearn.model_selection import train_test_split

# generate the toy dataset
X, y, _ = make_correlated_data(n_samples=500, n_features=5000)
# split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

Fit and score a Lasso estimator#

celer exposes easy-to-use to use estimators as it was designed under the scikit-learn API. celer also integrates well with it (e.g. the Pipeline and GridSearchCV).

# import model
from celer import Lasso

# init and fit
model = Lasso()
model.fit(X_train, y_train)

# print R²
print(model.score(X_test, y_test))

Perform cross-validation#

celer Lasso estimator comes with native cross-validation. The following snippets performs cross-validation on a grid 100 alphas using 5 folds. And look how fast celer is compared to the scikit-learn.

# imports
import time
from celer import LassoCV
from sklearn.linear_model import LassoCV as sk_LassoCV

# fit for celer
start = time.time()
celer_lassoCV = LassoCV(n_alphas=100, cv=5)
celer_lassoCV.fit(X, y)
print(f"time elapsed for celer LassoCV: {time.time() - start}")

# fit for scikit-learn
start = time.time()
sk_lassoCV = sk_LassoCV(n_alphas=100, cv=5)
sk_lassoCV.fit(X, y)
print(f"time elapsed for scikit-learn LassoCV: {time.time() - start}")

Out:

time elapsed for celer LassoCV: 5.062559127807617
time elapsed for scikit-learn LassoCV: 27.427260398864746