Abstract: The implicit bias of gradient descent, and specifically its margin maximization properties, has emerged as a promising explanation for the good generalization of deep networks. The purpose of this talk is to demonstrate the effectiveness of a dual problem to smoothed margin maximization. Concretely, the talk will develop this dual, along with a variety of its consequences in both linear and nonlinear settings. In the linear case, the dual perspective will first yield fast 1/t rates for margin maximization and implicit bias. This is faster than any prior first-order hard-margin SVM solver, which achieves at best a 1/sqrt(t) rate. Second, the dual analysis also allows a characterization of the implicit bias even outside the standard setting of exponentially-tailed losses; in this sense, it is gradient descent, and not a particular loss structure, that leads to the implicit bias. In the nonlinear case, duality will enable the proof of a gradient alignment property: asymptotically, the parameters and their gradients become collinear. Although abstract, this property in turn implies various existing and new margin maximization results.
Joint work with Matus Telgarsky.
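
For orientation, a minimal LaTeX sketch of the objects referred to above, under assumed standard definitions (a linear predictor w, data (x_i, y_i), and the exponential loss); the exact objectives, normalizations, and dual formulation used in the talk may differ.

% Illustrative setup only; notation is assumed, not necessarily the talk's.
\[
  \mathcal{L}(w) = \sum_{i=1}^{n} \exp\bigl(-y_i \langle w, x_i \rangle\bigr),
  \qquad
  \tilde{\gamma}(w) = \frac{-\ln \mathcal{L}(w)}{\lVert w \rVert}
  \quad \text{(smoothed margin, linear case)}.
\]
% Hard margin and a standard dual over the probability simplex \Delta_n:
\[
  \max_{\lVert u \rVert \le 1} \; \min_{i} \; y_i \langle u, x_i \rangle
  \;=\;
  \min_{q \in \Delta_n} \Bigl\lVert \sum_{i=1}^{n} q_i y_i x_i \Bigr\rVert .
\]
% One way to write the gradient alignment property along the iterates w_t:
\[
  \frac{\bigl\langle w_t, \, -\nabla \mathcal{L}(w_t) \bigr\rangle}
       {\lVert w_t \rVert \, \lVert \nabla \mathcal{L}(w_t) \rVert}
  \;\longrightarrow\; 1
  \quad \text{as } t \to \infty .
\]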