{"status":"public","day":"01","date_published":"2021-12-01T00:00:00Z","publication_identifier":{"issn":["1049-5258"],"isbn":["9781713845393"]},"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/2102.09671"}],"quality_controlled":"1","external_id":{"arxiv":["2102.09671"]},"publication":"35th Conference on Neural Information Processing Systems","abstract":[{"text":"The question of how and why the phenomenon of mode connectivity occurs in training deep neural networks has gained remarkable attention in the research community. From a theoretical perspective, two possible explanations have been proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions found by stochastic gradient descent are dropout stable. While these explanations provide insights into the phenomenon, their assumptions are not always satisfied in practice. In particular, the first approach requires the network to have one layer with order of N neurons (N being the number of training samples), while the second one requires the loss to be almost invariant after removing half of the neurons at each layer (up to some rescaling of the remaining ones). In this work, we improve both conditions by exploiting the quality of the features at every intermediate layer together with a milder over-parameterization condition. More specifically, we show that: (i) under generic assumptions on the features of intermediate layers, it suffices that the last two hidden layers have order of √N neurons, and (ii) if subsets of features at each layer are linearly separable, then no over-parameterization is needed to show the connectivity. 
Our experiments confirm that the proposed condition ensures the connectivity of solutions found by stochastic gradient descent, even in settings where the previous requirements do not hold.","lang":"eng"}],"publisher":"Neural Information Processing Systems Foundation","language":[{"iso":"eng"}],"article_processing_charge":"No","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ama":"Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks? In: 35th Conference on Neural Information Processing Systems. Vol 35. Neural Information Processing Systems Foundation; 2021.","ieee":"Q. Nguyen, P. Bréchet, and M. Mondelli, “When are solutions connected in deep networks?,” in 35th Conference on Neural Information Processing Systems, Virtual, 2021, vol. 35.","chicago":"Nguyen, Quynh, Pierre Bréchet, and Marco Mondelli. “When Are Solutions Connected in Deep Networks?” In 35th Conference on Neural Information Processing Systems, Vol. 35. Neural Information Processing Systems Foundation, 2021.","short":"Q. Nguyen, P. Bréchet, M. Mondelli, in:, 35th Conference on Neural Information Processing Systems, Neural Information Processing Systems Foundation, 2021.","apa":"Nguyen, Q., Bréchet, P., & Mondelli, M. (2021). When are solutions connected in deep networks? In 35th Conference on Neural Information Processing Systems (Vol. 35). Virtual: Neural Information Processing Systems Foundation.","mla":"Nguyen, Quynh, et al. “When Are Solutions Connected in Deep Networks?” 35th Conference on Neural Information Processing Systems, vol. 35, Neural Information Processing Systems Foundation, 2021.","ista":"Nguyen Q, Bréchet P, Mondelli M. 2021. When are solutions connected in deep networks? 35th Conference on Neural Information Processing Systems. 35th Conference on Neural Information Processing Systems vol. 
35."},"conference":{"location":"Virtual","start_date":"2021-12-06","end_date":"2021-12-14","name":"35th Conference on Neural Information Processing Systems"},"year":"2021","type":"conference","author":[{"first_name":"Quynh","full_name":"Nguyen, Quynh","last_name":"Nguyen"},{"last_name":"Bréchet","full_name":"Bréchet, Pierre","first_name":"Pierre"},{"orcid":"0000-0002-3242-7020","last_name":"Mondelli","full_name":"Mondelli, Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"}],"_id":"10594","title":"When are solutions connected in deep networks?","department":[{"_id":"MaMo"}],"date_created":"2022-01-03T10:56:20Z","volume":35,"month":"12","date_updated":"2023-10-17T11:48:40Z","intvolume":" 35","acknowledgement":"MM was partially supported by the 2019 Lopez-Loreta Prize. QN and PB acknowledge support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).","oa":1,"oa_version":"Preprint","publication_status":"published"}