Run MultitaskLassoCV and compare performance with scikit-learn#

The example runs the MultitaskLassoCV scikit-learn like estimator.

import time
import numpy as np
import matplotlib.pyplot as plt

from celer import MultiTaskLassoCV
from numpy.linalg import norm
from sklearn.utils import check_random_state
from sklearn import linear_model

rng = check_random_state(0)

Generate some 2D coefficients with sine waves with random frequency and phase

n_samples, n_features, n_tasks = 100, 500, 50
n_relevant_features = 50
support = rng.choice(n_features, n_relevant_features, replace=False)
coef = np.zeros((n_tasks, n_features))
times = np.linspace(0, 2 * np.pi, n_tasks)
for k in support:
    coef[:, k] = np.sin((1. + rng.randn(1)) * times + 3 * rng.randn(1))


X = rng.randn(n_samples, n_features)
Y = np.dot(X, coef.T) + rng.randn(n_samples, n_tasks)
Y /= norm(Y, ord='fro')

Fit with sklearn and celer, using the same API

params = dict(tol=1e-6, cv=4, n_jobs=1, n_alphas=20)
t0 = time.perf_counter()
clf = MultiTaskLassoCV(**params).fit(X, Y)
t_celer = time.perf_counter() - t0

t0 = time.perf_counter()
clf_sklearn = linear_model.MultiTaskLassoCV(**params).fit(X, Y)
t_sklearn = time.perf_counter() - t0

Celer is faster

print("Time for celer  : %.2f s" % t_celer)
print("Time for sklearn: %.2f s" % t_sklearn)
Time for celer  : 20.73 s
Time for sklearn: 24.60 s

Both packages find the same solution

print("Celer's optimal regularizer  : %s" % clf.alpha_)
print("Sklearn's optimal regularizer: %s" % clf_sklearn.alpha_)

print("Relative norm difference between optimal coefs: %.2f %%" %
      (100 * norm(clf.coef_ - clf_sklearn.coef_) / norm(clf.coef_)))
Celer's optimal regularizer  : 0.00134781376541042
Sklearn's optimal regularizer: 0.00134781376541042
Relative norm difference between optimal coefs: 0.00 %
fig, axarr = plt.subplots(2, 1, constrained_layout=True)
axarr[0].spy(clf.coef_, aspect="auto")
axarr[0].xaxis.tick_bottom()
axarr[0].set_title("celer")
axarr[0].set_ylabel("tasks")
axarr[1].spy(clf_sklearn.coef_, aspect="auto")
axarr[1].xaxis.tick_bottom()
axarr[1].set_title("sklearn")
plt.suptitle("Sparsity patterns")
plt.ylabel("tasks")
plt.xlabel("features")
plt.show(block=False)
Sparsity patterns, celer, sklearn

Total running time of the script: (0 minutes 45.646 seconds)

Gallery generated by Sphinx-Gallery