Towards understanding the role of noise in non-convex machine learning dynamics

13.04.2022 18:00 - 19:00

Loucas Pillaud-Vivien (EPFL)

Abstract: It has been empirically shown that the noise induced by the Stochastic Gradient Descent (SGD) algorithm when training neural networks generally enhances generalisation performance compared to full-batch training (gradient descent, GD). In this talk, we will try to understand how SGD-like noise biases the training dynamics towards specific prediction functions in regression tasks. More precisely, we will first show that the dynamics of SGD over diagonal linear networks converge towards a sparser linear estimator than the one retrieved by GD. Going further, we will also show that adding label noise biases the dynamics towards implicitly solving a Lasso program. Our findings highlight that structured noise can induce better generalisation, and they help explain the better performance of stochastic dynamics over deterministic ones observed in practice.
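The setting of the first result can be illustrated with a small numerical experiment. The sketch below (not from the talk; problem sizes, learning rate, and initialisation scale `alpha` are illustrative choices) trains a diagonal linear network, i.e. a linear predictor parametrised as beta = u * v, on a sparse regression problem with single-sample SGD and with full-batch GD, and then compares the effective support sizes of the two recovered estimators:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse regression problem: n samples, d features, k-sparse ground truth.
n, d, k = 40, 100, 5
X = rng.normal(size=(n, d))
beta_star = np.zeros(d)
beta_star[:k] = 1.0
y = X @ beta_star

def train(batch_size, steps=50_000, lr=5e-3, alpha=0.1):
    """Train a diagonal linear network f(x) = <u * v, x> with (S)GD.

    batch_size=n recovers full-batch gradient descent; batch_size=1 is SGD.
    alpha is the initialisation scale of the parameters u and v.
    """
    u = alpha * np.ones(d)
    v = alpha * np.ones(d)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        r = X[idx] @ (u * v) - y[idx]            # mini-batch residuals
        g = X[idx].T @ r / batch_size            # gradient w.r.t. beta = u * v
        u, v = u - lr * g * v, v - lr * g * u    # chain rule through beta = u * v
    return u * v

beta_gd  = train(batch_size=n)   # deterministic, full-batch GD
beta_sgd = train(batch_size=1)   # stochastic, single-sample SGD

# Effective sparsity: number of coordinates above a small threshold.
for name, b in [("GD", beta_gd), ("SGD", beta_sgd)]:
    print(name, "support size:", int(np.sum(np.abs(b) > 1e-3)))
```

Under the implicit-bias picture described in the abstract, one would expect the SGD estimator to have a smaller support than the GD one at the same initialisation scale; the exact numbers depend on the illustrative hyperparameters above.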

Organiser:
P. Petersen
Location:
Zoom Meeting