Syllabus of the “Large scale optimization for machine and deep learning” class
The class studies why optimization algorithms succeed on large-scale machine learning and deep learning problems, covering both recent theoretical results and practical Python implementations of popular algorithms.
Syllabus
- basics of convex analysis: convex sets, convex functions, strong convexity, smoothness, subdifferential, Fenchel-Legendre transform, infimal convolution, Moreau envelope
- gradient descent and subgradient descent, fixed point iterations, proximal point method (Lab 1; see the first sketch after this list)
- acceleration of first-order methods: Nesterov acceleration and momentum
- algorithms for Deep Learning: stochastic gradient descent, Adam, AdaGrad (Lab 2; see the second sketch after this list)
- automatic differentiation
- second order algorithms: Newton and quasi-Newton methods
- implicit regularization, duality, Bregman geometry, mirror descent
- recent results in non-convex optimization
- other algorithms: Frank-Wolfe, primal-dual algorithms, extragradient.
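To fix ideas, here is a minimal Python sketch (not official course material) of gradient descent and Nesterov's accelerated gradient on a toy strongly convex quadratic. The problem data, the 1/L step size, and the k/(k+3) momentum coefficient are illustrative assumptions; Lab 1 may organize the code differently.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((20, 20))
    A = A.T @ A + np.eye(20)            # positive definite, so f is smooth and strongly convex
    b = rng.standard_normal(20)

    def grad(x):
        # gradient of f(x) = 0.5 x^T A x - b^T x
        return A @ x - b

    L = np.linalg.eigvalsh(A).max()     # smoothness constant (largest eigenvalue of A)
    step = 1.0 / L                      # classical 1/L step size

    # plain gradient descent
    x = np.zeros(20)
    for _ in range(300):
        x = x - step * grad(x)

    # Nesterov's accelerated gradient: take the gradient step from an extrapolated point
    x_acc, x_prev = np.zeros(20), np.zeros(20)
    for k in range(300):
        y = x_acc + (k / (k + 3)) * (x_acc - x_prev)   # momentum / extrapolation
        x_prev = x_acc
        x_acc = y - step * grad(y)

    x_star = np.linalg.solve(A, b)      # exact minimizer, for comparison
    print(np.linalg.norm(x - x_star), np.linalg.norm(x_acc - x_star))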
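In the same spirit, here is a minimal sketch of stochastic gradient descent and an Adam-style update on a least-squares problem. The mini-batch size, learning rates, and the hyperparameters beta1 = 0.9, beta2 = 0.999 are common illustrative defaults, not the exact settings used in Lab 2.

    import numpy as np

    rng = np.random.default_rng(1)
    n, d = 1000, 10
    X = rng.standard_normal((n, d))
    w_true = rng.standard_normal(d)
    y = X @ w_true + 0.1 * rng.standard_normal(n)

    def stoch_grad(w, idx):
        # mini-batch gradient of the least-squares loss 0.5 * mean_i (x_i^T w - y_i)^2
        residual = X[idx] @ w - y[idx]
        return X[idx].T @ residual / len(idx)

    # stochastic gradient descent with a constant step size
    w_sgd = np.zeros(d)
    for _ in range(2000):
        batch = rng.integers(0, n, size=32)
        w_sgd -= 0.05 * stoch_grad(w_sgd, batch)

    # Adam-style update: moving averages of the gradient and its elementwise square
    w, m, v = np.zeros(d), np.zeros(d), np.zeros(d)
    lr, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8
    for t in range(1, 2001):
        g = stoch_grad(w, rng.integers(0, n, size=32))
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat, v_hat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)   # bias correction
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)

    print(np.linalg.norm(w_sgd - w_true), np.linalg.norm(w - w_true))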
Schedule
15 × 2 h sessions (lectures and labs), plus an oral presentation
Validation
2 labs, weekly homework assignments, and one oral exam
Resources
- Introductory Lectures on Convex Optimization: A Basic Course, Y. Nesterov, 2004. A reference book in optimization, updated in 2018 as Lectures on Convex Optimization.
- First-Order Methods in Optimization, A. Beck, 2017.
- Convex Optimization: Algorithms and Complexity, S. Bubeck, 2015. A short monograph (about 100 pages) covering the basic topics.
Prerequisites
- Differential calculus: gradient, Hessian
- Notions of convexity
- Linear algebra: eigenvalue decomposition, singular value decomposition