When are solutions connected in deep networks?
https://arxiv.org/abs/2102.09671
[Preprint]
Conference Paper | Published | English
Author
Nguyen, Quynh;
Bréchet, Pierre;
Mondelli, Marco (ISTA)
Corresponding author has ISTA affiliation
Abstract
The question of how and why the phenomenon of mode connectivity occurs in training deep neural networks has gained remarkable attention in the research community. From a theoretical perspective, two possible explanations have been proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions found by stochastic gradient descent are dropout stable. While these explanations provide insights into the phenomenon, their assumptions are not always satisfied in practice. In particular, the first approach requires the network to have one layer with order of N neurons (N being the number of training samples), while the second one requires the loss to be almost invariant after removing half of the neurons at each layer (up to some rescaling of the remaining ones). In this work, we improve both conditions by exploiting the quality of the features at every intermediate layer together with a milder over-parameterization condition. More specifically, we show that: (i) under generic assumptions on the features of intermediate layers, it suffices that the last two hidden layers have order of √N neurons, and (ii) if subsets of features at each layer are linearly separable, then no over-parameterization is needed to show the connectivity. Our experiments confirm that the proposed condition ensures the connectivity of solutions found by stochastic gradient descent, even in settings where the previous requirements do not hold.
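For intuition, mode connectivity is typically probed empirically by checking whether the loss stays low along a path in parameter space between two independently trained networks. Below is a minimal sketch of the simplest such probe, a straight-line interpolation, assuming PyTorch; loss_along_path is an illustrative helper, not code from the paper.

# A minimal sketch (assuming PyTorch; not the authors' code) of the
# simplest empirical probe for mode connectivity: evaluate the loss
# along the straight line theta(t) = (1 - t) * theta_a + t * theta_b
# between two solutions found by SGD. A uniformly low-loss path
# indicates the two solutions are connected.
import copy
import torch

def loss_along_path(model_a, model_b, criterion, x, y, steps=11):
    # Probe network that holds the interpolated weights; deepcopy
    # keeps the original models untouched.
    probe = copy.deepcopy(model_a)
    params_a = [p.detach() for p in model_a.parameters()]
    params_b = [p.detach() for p in model_b.parameters()]
    losses = []
    for i in range(steps):
        t = i / (steps - 1)
        with torch.no_grad():
            for p, pa, pb in zip(probe.parameters(), params_a, params_b):
                p.copy_((1 - t) * pa + t * pb)
            losses.append(criterion(probe(x), y).item())
    return losses

# Example usage (hypothetical): net_a and net_b are two trained copies
# of the same architecture, evaluated on held-out data (x_val, y_val).
# path_losses = loss_along_path(net_a, net_b,
#                               torch.nn.CrossEntropyLoss(), x_val, y_val)

Note that the connecting paths guaranteed by the theory need not be linear, so a high loss barrier on the straight line does not by itself rule out connectivity.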
Publishing Year
2021
Date Published
2021-12-01
Proceedings Title
35th Conference on Neural Information Processing Systems
Publisher
Neural Information Processing Systems Foundation
Acknowledgement
MM was partially supported by the 2019 Lopez-Loreta Prize. QN and PB acknowledge support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).
Volume
35
Conference
35th Conference on Neural Information Processing Systems
Conference Location
Virtual
Conference Date
2021-12-06 – 2021-12-14
Cite this
Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks? In: 35th Conference on Neural Information Processing Systems. Vol 35. Neural Information Processing Systems Foundation; 2021.
Nguyen, Q., Bréchet, P., & Mondelli, M. (2021). When are solutions connected in deep networks? In 35th Conference on Neural Information Processing Systems (Vol. 35). Virtual: Neural Information Processing Systems Foundation.
Nguyen, Quynh, Pierre Bréchet, and Marco Mondelli. “When Are Solutions Connected in Deep Networks?” In 35th Conference on Neural Information Processing Systems, Vol. 35. Neural Information Processing Systems Foundation, 2021.
Q. Nguyen, P. Bréchet, and M. Mondelli, “When are solutions connected in deep networks?,” in 35th Conference on Neural Information Processing Systems, Virtual, 2021, vol. 35.
Nguyen Q, Bréchet P, Mondelli M. 2021. When are solutions connected in deep networks? 35th Conference on Neural Information Processing Systems. 35th Conference on Neural Information Processing Systems vol. 35.
Nguyen, Quynh, et al. “When Are Solutions Connected in Deep Networks?” 35th Conference on Neural Information Processing Systems, vol. 35, Neural Information Processing Systems Foundation, 2021.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Link(s) to Main File(s)
Access Level
Open Access
Sources
arXiv 2102.09671