Lasso path computation on Finance/log1p dataset

This example runs the Celer algorithm on the Finance/log1p dataset, a large sparse dataset (16087 samples, 4272227 features).

The running time is not compared with the scikit-learn implementation, because doing so would make the example too long to run.

import time

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from libsvmdata import fetch_libsvm

from celer import celer_path

print(__doc__)

print("*** Warning: this example may take more than 5 minutes to run ***")
X, y = fetch_libsvm('finance')
y -= np.mean(y)
n_samples, n_features = X.shape
alpha_max = np.max(np.abs(X.T.dot(y))) / n_samples
print("Dataset size: %d samples, %d features" % X.shape)

# construct grid of regularization parameters alpha
n_alphas = 11
alphas = alpha_max * np.geomspace(1, 0.1, n_alphas)
*** Warning: this example may take more than 5 minutes to run ***
Dataset size: 16087 samples, 4272227 features
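
The role of `alpha_max` can be checked on a small dense problem. The sketch below uses synthetic data (an assumption, not the Finance dataset) and NumPy only; all names ending in `_toy` are hypothetical. It verifies that at `alpha_max` the zero vector satisfies the Lasso optimality condition `max |X.T @ y| / n_samples <= alpha`, and that this condition fails for every smaller value on the grid.

```python
import numpy as np

rng = np.random.default_rng(0)
X_toy = rng.standard_normal((50, 200))  # synthetic stand-in for the Finance data
y_toy = rng.standard_normal(50)
y_toy -= y_toy.mean()

n_toy = X_toy.shape[0]
alpha_max_toy = np.max(np.abs(X_toy.T @ y_toy)) / n_toy

# same grid construction as in the example: 11 geometrically spaced values
alphas_toy = alpha_max_toy * np.geomspace(1, 0.1, 11)

# w = 0 solves the Lasso iff |x_j^T y| / n_samples <= alpha for every feature j
corr = np.abs(X_toy.T @ y_toy) / n_toy
zero_is_optimal = [bool(np.all(corr <= a * (1 + 1e-12))) for a in alphas_toy]
print(zero_is_optimal)
```

Only the first entry should be true: below `alpha_max`, at least one feature violates the condition, so the solution must have a non-zero coefficient.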

Run Celer on a grid of regularization parameters, for various tolerances:

tols = [1e-2, 1e-4, 1e-6]
# one row (Celer), one timing per tolerance
results = np.zeros([1, len(tols)])
# duality gap reached for each (tolerance, alpha) pair
gaps = np.zeros((len(tols), len(alphas)))

print("Starting path computation...")
for tol_ix, tol in enumerate(tols):
    t0 = time.time()
    res = celer_path(X, y, 'lasso', alphas=alphas,
                     tol=tol, prune=True, verbose=1)
    results[0, tol_ix] = time.time() - t0
    _, coefs, gaps[tol_ix] = res


figsize = (4, 3.5)

df = pd.DataFrame(results.T, columns=["Celer"])
df.index = [str(tol) for tol in tols]
df.plot.bar(rot=0, figsize=figsize)
plt.xlabel("stopping tolerance")
plt.ylabel("path computation time (s)")
plt.tight_layout()
plt.show(block=False)
Figure: path computation time (s) versus stopping tolerance.
Starting path computation...
##########################
##### Computing alpha 1/11
##########################
Iter 0: primal 0.1998879584, gap -2.78e-17
Early exit, gap: -2.78e-17 < 4.00e-03
##########################
##### Computing alpha 2/11
##########################
Iter 0: primal 0.1998879584, gap 8.46e-03, 1 feats in subpb (6958 left)
Iter 1: primal 0.1996695929, gap 1.10e-05
Early exit, gap: 1.10e-05 < 4.00e-03
##########################
##### Computing alpha 3/11
##########################
Iter 0: primal 0.1993226851, gap 8.66e-03, 1 feats in subpb (7235 left)
Iter 1: primal 0.1991849058, gap 1.25e-02, 2 feats in subpb (7235 left)
Iter 2: primal 0.1988140591, gap 2.51e-03
Early exit, gap: 2.51e-03 < 4.00e-03
##########################
##### Computing alpha 4/11
##########################
Iter 0: primal 0.1974240538, gap 1.60e-02, 2 feats in subpb (8287 left)
Iter 1: primal 0.1967819954, gap 1.40e-02, 4 feats in subpb (8119 left)
Iter 2: primal 0.1956875558, gap 4.25e-03, 8 feats in subpb (7120 left)
Iter 3: primal 0.1955729679, gap 2.08e-03
Early exit, gap: 2.08e-03 < 4.00e-03
##########################
##### Computing alpha 5/11
##########################
Iter 0: primal 0.1921130066, gap 1.30e-02, 7 feats in subpb (8744 left)
Iter 1: primal 0.1911454767, gap 1.60e-02, 14 feats in subpb (8744 left)
Iter 2: primal 0.1899680054, gap 8.91e-03, 22 feats in subpb (8155 left)
Iter 3: primal 0.1893305596, gap 1.66e-03
Early exit, gap: 1.66e-03 < 4.00e-03
##########################
##### Computing alpha 6/11
##########################
Iter 0: primal 0.1824875036, gap 9.78e-03, 15 feats in subpb (9114 left)
Iter 1: primal 0.1799473516, gap 7.77e-03, 26 feats in subpb (8703 left)
Iter 2: primal 0.1796551628, gap 1.91e-03
Early exit, gap: 1.91e-03 < 4.00e-03
##########################
##### Computing alpha 7/11
##########################
Iter 0: primal 0.1702584473, gap 9.24e-03, 17 feats in subpb (10076 left)
Iter 1: primal 0.1679647590, gap 3.80e-03
Early exit, gap: 3.80e-03 < 4.00e-03
##########################
##### Computing alpha 8/11
##########################
Iter 0: primal 0.1578431041, gap 1.13e-02, 15 feats in subpb (12375 left)
Iter 1: primal 0.1559409606, gap 1.19e-02, 28 feats in subpb (12370 left)
Iter 2: primal 0.1554690765, gap 4.51e-03, 48 feats in subpb (9557 left)
Iter 3: primal 0.1552945554, gap 1.59e-03
Early exit, gap: 1.59e-03 < 4.00e-03
##########################
##### Computing alpha 9/11
##########################
Iter 0: primal 0.1444814441, gap 7.00e-03, 27 feats in subpb (12393 left)
Iter 1: primal 0.1433953652, gap 4.09e-03, 50 feats in subpb (10579 left)
Iter 2: primal 0.1432862619, gap 9.43e-04
Early exit, gap: 9.43e-04 < 4.00e-03
###########################
##### Computing alpha 10/11
###########################
Iter 0: primal 0.1330512676, gap 5.52e-03, 27 feats in subpb (13599 left)
Iter 1: primal 0.1322840728, gap 6.95e-03, 50 feats in subpb (13599 left)
Iter 2: primal 0.1322135168, gap 1.61e-03
Early exit, gap: 1.61e-03 < 4.00e-03
###########################
##### Computing alpha 11/11
###########################
Iter 0: primal 0.1227863821, gap 5.71e-03, 34 feats in subpb (16753 left)
Iter 1: primal 0.1222932946, gap 5.24e-03, 68 feats in subpb (16147 left)
Iter 2: primal 0.1222004777, gap 1.52e-03
Early exit, gap: 1.52e-03 < 4.00e-03
##########################
##### Computing alpha 1/11
##########################
Iter 0: primal 0.1998879584, gap -2.78e-17
Early exit, gap: -2.78e-17 < 4.00e-05
##########################
##### Computing alpha 2/11
##########################
Iter 0: primal 0.1998879584, gap 8.46e-03, 1 feats in subpb (6958 left)
Iter 1: primal 0.1996695929, gap 1.10e-05
Early exit, gap: 1.10e-05 < 4.00e-05
##########################
##### Computing alpha 3/11
##########################
Iter 0: primal 0.1993226851, gap 8.66e-03, 1 feats in subpb (7235 left)
Iter 1: primal 0.1991849058, gap 1.25e-02, 2 feats in subpb (7235 left)
Iter 2: primal 0.1988140591, gap 2.51e-03, 4 feats in subpb (6738 left)
Iter 3: primal 0.1986319559, gap 8.96e-05, 8 feats in subpb (6572 left)
Iter 4: primal 0.1985975194, gap 5.60e-06
Early exit, gap: 5.60e-06 < 4.00e-05
##########################
##### Computing alpha 4/11
##########################
Iter 0: primal 0.1966411542, gap 8.02e-03, 4 feats in subpb (7571 left)
Iter 1: primal 0.1956396632, gap 4.42e-03, 8 feats in subpb (7139 left)
Iter 2: primal 0.1955088303, gap 1.32e-03, 14 feats in subpb (6737 left)
Iter 3: primal 0.1953583493, gap 3.59e-04, 22 feats in subpb (6622 left)
Iter 4: primal 0.1953550691, gap 8.50e-05, 22 feats in subpb (6592 left)
Iter 5: primal 0.1953542215, gap 1.76e-05
Early exit, gap: 1.76e-05 < 4.00e-05
##########################
##### Computing alpha 5/11
##########################
Iter 0: primal 0.1911486634, gap 7.47e-03, 10 feats in subpb (8016 left)
Iter 1: primal 0.1892308803, gap 1.83e-03, 18 feats in subpb (6978 left)
Iter 2: primal 0.1891865754, gap 4.08e-04, 26 feats in subpb (6677 left)
Iter 3: primal 0.1891781142, gap 8.21e-05, 30 feats in subpb (6611 left)
Iter 4: primal 0.1891760672, gap 1.51e-05
Early exit, gap: 1.51e-05 < 4.00e-05
##########################
##### Computing alpha 6/11
##########################
Iter 0: primal 0.1823058700, gap 6.59e-03, 15 feats in subpb (8436 left)
Iter 1: primal 0.1796740896, gap 3.24e-03, 28 feats in subpb (7671 left)
Iter 2: primal 0.1795190552, gap 9.22e-04, 36 feats in subpb (6911 left)
Iter 3: primal 0.1795008613, gap 1.79e-04, 34 feats in subpb (6650 left)
Iter 4: primal 0.1794978457, gap 2.64e-05
Early exit, gap: 2.64e-05 < 4.00e-05
##########################
##### Computing alpha 7/11
##########################
Iter 0: primal 0.1695199681, gap 5.56e-03, 17 feats in subpb (9000 left)
Iter 1: primal 0.1678527940, gap 3.80e-03, 32 feats in subpb (8434 left)
Iter 2: primal 0.1677070264, gap 9.60e-04, 44 feats in subpb (7177 left)
Iter 3: primal 0.1676670113, gap 2.60e-04, 46 feats in subpb (6744 left)
Iter 4: primal 0.1676578698, gap 4.32e-05, 44 feats in subpb (6615 left)
Iter 5: primal 0.1676538570, gap 1.17e-05
Early exit, gap: 1.17e-05 < 4.00e-05
##########################
##### Computing alpha 8/11
##########################
Iter 0: primal 0.1566656320, gap 4.85e-03, 22 feats in subpb (9735 left)
Iter 1: primal 0.1553682933, gap 4.19e-03, 40 feats in subpb (9409 left)
Iter 2: primal 0.1552842045, gap 1.23e-03, 52 feats in subpb (7787 left)
Iter 3: primal 0.1552501166, gap 2.41e-04, 50 feats in subpb (6878 left)
Iter 4: primal 0.1552418801, gap 6.19e-05, 50 feats in subpb (6658 left)
Iter 5: primal 0.1552385519, gap 1.24e-05
Early exit, gap: 1.24e-05 < 4.00e-05
##########################
##### Computing alpha 9/11
##########################
Iter 0: primal 0.1442155552, gap 4.32e-03, 25 feats in subpb (10738 left)
Iter 1: primal 0.1433134332, gap 1.93e-03, 46 feats in subpb (8983 left)
Iter 2: primal 0.1432812633, gap 4.33e-04, 54 feats in subpb (7459 left)
Iter 3: primal 0.1432733612, gap 1.09e-04, 52 feats in subpb (6843 left)
Iter 4: primal 0.1432697466, gap 2.92e-05
Early exit, gap: 2.92e-05 < 4.00e-05
###########################
##### Computing alpha 10/11
###########################
Iter 0: primal 0.1330093991, gap 4.02e-03, 26 feats in subpb (12154 left)
Iter 1: primal 0.1322799311, gap 5.91e-03, 50 feats in subpb (12154 left)
Iter 2: primal 0.1322107912, gap 1.48e-03, 66 feats in subpb (9392 left)
Iter 3: primal 0.1321983433, gap 4.12e-04, 72 feats in subpb (7833 left)
Iter 4: primal 0.1321922241, gap 1.19e-04, 70 feats in subpb (7109 left)
Iter 5: primal 0.1321903367, gap 3.31e-05
Early exit, gap: 3.31e-05 < 4.00e-05
###########################
##### Computing alpha 11/11
###########################
Iter 0: primal 0.1227429613, gap 3.70e-03, 35 feats in subpb (14035 left)
Iter 1: primal 0.1222465727, gap 6.04e-03, 68 feats in subpb (14035 left)
Iter 2: primal 0.1221697440, gap 1.38e-03, 98 feats in subpb (10352 left)
Iter 3: primal 0.1221572420, gap 4.05e-04, 98 feats in subpb (8326 left)
Iter 4: primal 0.1221502318, gap 1.20e-04, 96 feats in subpb (7403 left)
Iter 5: primal 0.1221491296, gap 3.57e-05
Early exit, gap: 3.57e-05 < 4.00e-05
##########################
##### Computing alpha 1/11
##########################
Iter 0: primal 0.1998879584, gap -2.78e-17
Early exit, gap: -2.78e-17 < 4.00e-07
##########################
##### Computing alpha 2/11
##########################
Iter 0: primal 0.1998879584, gap 8.46e-03, 1 feats in subpb (6958 left)
Iter 1: primal 0.1996695929, gap 1.10e-05, 2 feats in subpb (6560 left)
Iter 2: primal 0.1996653872, gap 2.41e-06, 4 feats in subpb (6560 left)
Iter 3: primal 0.1996638191, gap 6.88e-07, 4 feats in subpb (6560 left)
Iter 4: primal 0.1996631483, gap 0.00e+00
Early exit, gap: 0.00e+00 < 4.00e-07
##########################
##### Computing alpha 3/11
##########################
Iter 0: primal 0.1993025719, gap 8.37e-03, 1 feats in subpb (7213 left)
Iter 1: primal 0.1991579884, gap 1.29e-02, 2 feats in subpb (7213 left)
Iter 2: primal 0.1987790779, gap 2.48e-03, 4 feats in subpb (6737 left)
Iter 3: primal 0.1986004339, gap 1.86e-05, 8 feats in subpb (6567 left)
Iter 4: primal 0.1985978375, gap 3.78e-06, 8 feats in subpb (6565 left)
Iter 5: primal 0.1985972494, gap 1.09e-06, 8 feats in subpb (6565 left)
Iter 6: primal 0.1985972479, gap 2.13e-07
Early exit, gap: 2.13e-07 < 4.00e-07
##########################
##### Computing alpha 4/11
##########################
Iter 0: primal 0.1966470055, gap 8.00e-03, 4 feats in subpb (7571 left)
Iter 1: primal 0.1956436189, gap 4.62e-03, 8 feats in subpb (7167 left)
Iter 2: primal 0.1955280873, gap 2.07e-03, 14 feats in subpb (6839 left)
Iter 3: primal 0.1953637831, gap 2.68e-04, 20 feats in subpb (6617 left)
Iter 4: primal 0.1953540802, gap 1.70e-05, 20 feats in subpb (6577 left)
Iter 5: primal 0.1953540747, gap 1.83e-06, 20 feats in subpb (6572 left)
Iter 6: primal 0.1953540729, gap 3.57e-07
Early exit, gap: 3.57e-07 < 4.00e-07
##########################
##### Computing alpha 5/11
##########################
Iter 0: primal 0.1911517820, gap 7.40e-03, 10 feats in subpb (8004 left)
Iter 1: primal 0.1892324878, gap 3.22e-03, 18 feats in subpb (7249 left)
Iter 2: primal 0.1892149478, gap 1.13e-03, 28 feats in subpb (6811 left)
Iter 3: primal 0.1891914163, gap 2.83e-04, 34 feats in subpb (6651 left)
Iter 4: primal 0.1891758161, gap 7.22e-06, 28 feats in subpb (6587 left)
Iter 5: primal 0.1891752419, gap 1.34e-06, 28 feats in subpb (6581 left)
Iter 6: primal 0.1891751732, gap 8.81e-08
Early exit, gap: 8.81e-08 < 4.00e-07
##########################
##### Computing alpha 6/11
##########################
Iter 0: primal 0.1822945753, gap 6.59e-03, 14 feats in subpb (8438 left)
Iter 1: primal 0.1797111078, gap 3.17e-03, 26 feats in subpb (7657 left)
Iter 2: primal 0.1795420907, gap 9.10e-04, 38 feats in subpb (6906 left)
Iter 3: primal 0.1794902290, gap 8.60e-05, 36 feats in subpb (6623 left)
Iter 4: primal 0.1794862946, gap 2.02e-05, 36 feats in subpb (6602 left)
Iter 5: primal 0.1794854476, gap 4.92e-06, 36 feats in subpb (6586 left)
Iter 6: primal 0.1794851579, gap 5.39e-07, 36 feats in subpb (6580 left)
Iter 7: primal 0.1794851119, gap 1.41e-07
Early exit, gap: 1.41e-07 < 4.00e-07
##########################
##### Computing alpha 7/11
##########################
Iter 0: primal 0.1695265463, gap 5.54e-03, 18 feats in subpb (8995 left)
Iter 1: primal 0.1678317895, gap 3.20e-03, 36 feats in subpb (8198 left)
Iter 2: primal 0.1676902413, gap 9.40e-04, 50 feats in subpb (7172 left)
Iter 3: primal 0.1676606692, gap 2.05e-04, 46 feats in subpb (6713 left)
Iter 4: primal 0.1676559011, gap 2.68e-05, 46 feats in subpb (6612 left)
Iter 5: primal 0.1676522778, gap 4.21e-04, 44 feats in subpb (6612 left)
Iter 6: primal 0.1676514039, gap 3.63e-05, 46 feats in subpb (6612 left)
Iter 7: primal 0.1676509863, gap 6.85e-06, 46 feats in subpb (6600 left)
Iter 8: primal 0.1676463422, gap 2.02e-06, 44 feats in subpb (6592 left)
Iter 9: primal 0.1676449259, gap 1.05e-04, 44 feats in subpb (6592 left)
Iter 10: primal 0.1676447560, gap 2.93e-05, 44 feats in subpb (6592 left)
Iter 11: primal 0.1676447071, gap 5.38e-07, 44 feats in subpb (6586 left)
Iter 12: primal 0.1676444816, gap 7.27e-05, 44 feats in subpb (6586 left)
Iter 13: primal 0.1676444603, gap 1.08e-05, 44 feats in subpb (6586 left)
Iter 14: primal 0.1676444504, gap 1.34e-07
Early exit, gap: 1.34e-07 < 4.00e-07
##########################
##### Computing alpha 8/11
##########################
Iter 0: primal 0.1566505377, gap 4.83e-03, 22 feats in subpb (9728 left)
Iter 1: primal 0.1553465007, gap 6.14e-03, 44 feats in subpb (9728 left)
Iter 2: primal 0.1552831687, gap 1.27e-03, 56 feats in subpb (7813 left)
Iter 3: primal 0.1552531557, gap 3.71e-04, 52 feats in subpb (7022 left)
Iter 4: primal 0.1552441381, gap 7.76e-05, 52 feats in subpb (6678 left)
Iter 5: primal 0.1552382264, gap 2.32e-05, 50 feats in subpb (6620 left)
Iter 6: primal 0.1552375216, gap 5.51e-06, 50 feats in subpb (6603 left)
Iter 7: primal 0.1552359330, gap 2.16e-04, 50 feats in subpb (6603 left)
Iter 8: primal 0.1552357877, gap 5.29e-05, 50 feats in subpb (6603 left)
Iter 9: primal 0.1552354892, gap 6.81e-06, 50 feats in subpb (6603 left)
Iter 10: primal 0.1552353601, gap 1.93e-06, 50 feats in subpb (6597 left)
Iter 11: primal 0.1552348545, gap 2.09e-05, 50 feats in subpb (6597 left)
Iter 12: primal 0.1552347369, gap 5.49e-06, 50 feats in subpb (6597 left)
Iter 13: primal 0.1552346956, gap 6.91e-07, 50 feats in subpb (6588 left)
Iter 14: primal 0.1552344993, gap 2.07e-07
Early exit, gap: 2.07e-07 < 4.00e-07
##########################
##### Computing alpha 9/11
##########################
Iter 0: primal 0.1442124930, gap 4.30e-03, 25 feats in subpb (10731 left)
Iter 1: primal 0.1433084131, gap 2.04e-03, 46 feats in subpb (9075 left)
Iter 2: primal 0.1432800514, gap 5.51e-04, 52 feats in subpb (7615 left)
Iter 3: primal 0.1432706944, gap 1.62e-04, 52 feats in subpb (6975 left)
Iter 4: primal 0.1432691522, gap 4.21e-05, 52 feats in subpb (6694 left)
Iter 5: primal 0.1432676116, gap 1.26e-05, 54 feats in subpb (6629 left)
Iter 6: primal 0.1432669666, gap 3.43e-06, 52 feats in subpb (6608 left)
Iter 7: primal 0.1432664770, gap 6.88e-07, 50 feats in subpb (6594 left)
Iter 8: primal 0.1432664688, gap 1.60e-07
Early exit, gap: 1.60e-07 < 4.00e-07
###########################
##### Computing alpha 10/11
###########################
Iter 0: primal 0.1329697581, gap 3.94e-03, 25 feats in subpb (12085 left)
Iter 1: primal 0.1323010613, gap 6.41e-03, 50 feats in subpb (12085 left)
Iter 2: primal 0.1322423763, gap 1.73e-03, 70 feats in subpb (9724 left)
Iter 3: primal 0.1322001142, gap 4.84e-04, 64 feats in subpb (7985 left)
Iter 4: primal 0.1321914857, gap 1.45e-04, 70 feats in subpb (7204 left)
Iter 5: primal 0.1321899410, gap 4.11e-05, 70 feats in subpb (6782 left)
Iter 6: primal 0.1321896724, gap 9.22e-06, 70 feats in subpb (6665 left)
Iter 7: primal 0.1321893899, gap 2.40e-06, 70 feats in subpb (6627 left)
Iter 8: primal 0.1321893707, gap 7.15e-07, 70 feats in subpb (6609 left)
Iter 9: primal 0.1321893619, gap 2.14e-07
Early exit, gap: 2.14e-07 < 4.00e-07
###########################
##### Computing alpha 11/11
###########################
Iter 0: primal 0.1227390467, gap 3.65e-03, 35 feats in subpb (13979 left)
Iter 1: primal 0.1222220762, gap 3.43e-03, 68 feats in subpb (13654 left)
Iter 2: primal 0.1221586102, gap 9.99e-04, 100 feats in subpb (9610 left)
Iter 3: primal 0.1221491781, gap 2.95e-04, 98 feats in subpb (8029 left)
Iter 4: primal 0.1221489635, gap 8.50e-05, 96 feats in subpb (7259 left)
Iter 5: primal 0.1221482704, gap 2.55e-05, 100 feats in subpb (6845 left)
Iter 6: primal 0.1221482150, gap 3.99e-06, 100 feats in subpb (6671 left)
Iter 7: primal 0.1221481890, gap 8.08e-07, 100 feats in subpb (6637 left)
Iter 8: primal 0.1221481800, gap 2.40e-07
Early exit, gap: 2.40e-07 < 4.00e-07
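
Each `Iter` line above reports the primal objective and the duality gap. As a rough guide to reading them, here is a minimal NumPy sketch of one standard way to compute a Lasso duality gap: rescale the residual so that it is dual feasible, then subtract the dual objective from the primal one. The function name `lasso_duality_gap` and the toy data are hypothetical, and this is the textbook construction, not necessarily the dual point Celer builds internally.

```python
import numpy as np


def lasso_duality_gap(X, y, w, alpha):
    """Return (primal, gap) for the objective ||y - Xw||^2 / (2n) + alpha * ||w||_1."""
    n = X.shape[0]
    r = y - X @ w
    primal = r @ r / (2 * n) + alpha * np.abs(w).sum()
    # rescale the residual so that ||X^T theta||_inf <= 1 (dual feasibility)
    scale = max(n * alpha, np.max(np.abs(X.T @ r)))
    theta = r / scale
    dual = (y @ y - np.sum((y - n * alpha * theta) ** 2)) / (2 * n)
    return primal, primal - dual


rng = np.random.default_rng(0)
X_toy = rng.standard_normal((30, 100))  # synthetic data, not the Finance dataset
y_toy = rng.standard_normal(30)
y_toy -= y_toy.mean()
alpha_max_toy = np.max(np.abs(X_toy.T @ y_toy)) / X_toy.shape[0]
w0 = np.zeros(X_toy.shape[1])

# at alpha_max the zero vector is optimal, so the gap vanishes up to rounding
primal_max, gap_max = lasso_duality_gap(X_toy, y_toy, w0, alpha_max_toy)
# for a smaller alpha the zero vector is no longer optimal and the gap is positive
primal_half, gap_half = lasso_duality_gap(X_toy, y_toy, w0, 0.5 * alpha_max_toy)
print(gap_max, gap_half)
```

This mirrors the first block of the log, where the gap at alpha 1/11 is zero up to floating-point error and the solver exits immediately.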

Measure the influence of regularization on the sparsity of the solutions:

fig, ax = plt.subplots(figsize=(8, 5), constrained_layout=True)
plt.bar(np.arange(n_alphas), (coefs != 0).sum(axis=0))
plt.title("Sparsity of solution along regularization path")
ax.set_ylabel(r"$||\hat w||_0$")
ax.set_xlabel(r"$\lambda / \lambda_{\mathrm{max}}$")
ax.set_yscale('log')
ax.set_xticks(np.arange(n_alphas)[::2])
ax.set_xticklabels(["%.2f" % a for a in alphas[::2] / alphas[0]])
plt.show(block=False)
Figure: sparsity of solution along regularization path.
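
The bar heights come from counting non-zero coefficients column by column. A small self-contained illustration follows, assuming (as the plotting code does) that `coefs` has shape `(n_features, n_alphas)`; the array `coefs_toy` is made up.

```python
import numpy as np

# hypothetical coefficient path: 4 features (rows), 3 values of alpha (columns)
coefs_toy = np.array([
    [0.0, 0.5, 0.7],
    [0.0, 0.0, -0.2],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.1],
])

# number of non-zero coefficients for each alpha (one count per column)
support_sizes = (coefs_toy != 0).sum(axis=0)
print(support_sizes)  # [0 1 3]
```

With a logarithmic y-axis, a count of zero (as at `alpha_max`, where the solution is zero) produces no visible bar.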

Check the convergence guarantee: the duality gap reached is below the tolerance:

df = pd.DataFrame(gaps.T, columns=["tol=%.0e" % tol for tol in tols])
df.index = ["%.2f" % a for a in alphas / alphas[0]]
ax = df.plot.bar(figsize=(7, 4))
ax.set_ylabel("duality gap reached")
ax.set_xlabel(r"$\lambda / \lambda_{\mathrm{max}}$")
ax.set_yscale('log')
ax.set_yticks(tols)
plt.tight_layout()
plt.show(block=False)
Figure: duality gap reached versus $\lambda / \lambda_{\mathrm{max}}$ for each tolerance (log scale).
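
When comparing the gaps with the tolerances, note that the thresholds printed in the log (`4.00e-03`, `4.00e-05`, `4.00e-07`) are not the raw values in `tols`. They are consistent with each tolerance being multiplied by `||y||^2 / n_samples`; this is an inference from the log, not a confirmed description of Celer's internals. The arithmetic below uses the primal value printed at `alpha_max`, where the solution is zero and the primal therefore equals `||y||^2 / (2 * n_samples)`.

```python
# primal objective at alpha_max, copied from the log
primal_at_zero = 0.1998879584
# at w = 0 the primal objective is ||y||^2 / (2 * n_samples)
y_sq_norm_over_n = 2 * primal_at_zero

tols = [1e-2, 1e-4, 1e-6]
# assumed rescaling; it reproduces the thresholds shown in the log
scaled = ["%.2e" % (tol * y_sq_norm_over_n) for tol in tols]
print(scaled)  # ['4.00e-03', '4.00e-05', '4.00e-07']
```

Under that assumption, the gaps reached are guaranteed to lie below roughly 0.4 times each tick placed at `tols` on the y-axis.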

Total running time of the script: (4 minutes 49.475 seconds)
