Deep neural collapse is provably optimal for the deep unconstrained features model
Súkeník P, Mondelli M, Lampert C. 2023. Deep neural collapse is provably optimal for the deep unconstrained features model. 37th Annual Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, NeurIPS.
          
        
            
            
Conference Paper | Published | English
        Corresponding author has ISTA affiliation
    Series Title
    
    NeurIPS
Abstract
    Neural collapse (NC) refers to the surprising structure of the last layer of deep neural networks in the terminal phase of gradient descent training. Recently, an increasing amount of experimental evidence has pointed to the propagation of NC to earlier layers of neural networks. However, while the NC in the last layer is well studied theoretically, much less is known about its multi-layered counterpart - deep neural collapse (DNC). In particular, existing work focuses either on linear layers or only on the last two layers at the price of an extra assumption. Our paper fills this gap by generalizing the established analytical framework for NC - the unconstrained features model - to multiple non-linear layers. Our key technical contribution is to show that, in a deep unconstrained features model, the unique global optimum for binary classification exhibits all the properties typical of DNC. This explains the existing experimental evidence of DNC. We also empirically show that (i) by optimizing deep unconstrained features models via gradient descent, the resulting solution agrees well with our theory, and (ii) trained networks recover the unconstrained features suitable for the occurrence of DNC, thus supporting the validity of this modeling principle.
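    The gradient-descent experiment mentioned in point (i) of the abstract can be illustrated with a minimal sketch. The abstract does not state the exact objective, so everything below is an assumption for illustration: a squared loss against one-hot targets, ReLU non-linearities, a single weight-decay coefficient `lam` shared by the free features `H1` and all weight matrices, and hypothetical sizes, learning rate, and step count. The helper `within_class_variability` is one common way to quantify within-class collapse (within-class scatter divided by between-class scatter), not necessarily the metric used in the paper.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def forward(H1, weights):
    """Return the input to every layer and every pre-activation."""
    inputs = [H1]  # inputs[l] is the matrix fed into weights[l]
    pre = []       # pre[l] = weights[l] @ inputs[l]; pre[-1] is the output
    for l, W in enumerate(weights):
        z = W @ inputs[-1]
        pre.append(z)
        if l < len(weights) - 1:  # the last layer stays linear
            inputs.append(relu(z))
    return inputs, pre


def loss_and_grads(H1, weights, Y, lam):
    """Assumed objective: MSE/(2N) plus weight decay on H1 and all weights."""
    N = Y.shape[1]
    inputs, pre = forward(H1, weights)
    R = pre[-1] - Y
    reg = np.sum(H1 ** 2) + sum(np.sum(W ** 2) for W in weights)
    loss = 0.5 / N * np.sum(R ** 2) + 0.5 * lam * reg
    grads_W = [None] * len(weights)
    delta = R / N  # gradient with respect to the output pre-activation
    for l in reversed(range(len(weights))):
        grads_W[l] = delta @ inputs[l].T + lam * weights[l]
        back = weights[l].T @ delta  # gradient with respect to inputs[l]
        if l > 0:
            delta = back * (pre[l - 1] > 0)  # pass through the ReLU
    grad_H1 = back + lam * H1
    return loss, grad_H1, grads_W


def within_class_variability(H, labels):
    """Within-class scatter divided by between-class scatter (0 = collapsed)."""
    mu_g = H.mean(axis=1, keepdims=True)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        Hc = H[:, labels == c]
        mu_c = Hc.mean(axis=1, keepdims=True)
        within += np.sum((Hc - mu_c) ** 2)
        between += Hc.shape[1] * np.sum((mu_c - mu_g) ** 2)
    return within / between


def train_dufm(n_per_class=10, d=8, num_layers=3, lam=5e-3, lr=0.1,
               steps=3000, seed=0):
    """Plain gradient descent on the free features H1 and all weights."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_per_class)  # binary classification
    Y = np.eye(2)[:, labels]                 # one-hot targets, shape (2, N)
    H1 = 0.3 * rng.standard_normal((d, labels.size))
    weights = [0.3 * rng.standard_normal((d, d)) for _ in range(num_layers - 1)]
    weights.append(0.3 * rng.standard_normal((2, d)))
    losses = []
    for _ in range(steps):
        loss, g_H1, g_W = loss_and_grads(H1, weights, Y, lam)
        losses.append(loss)
        H1 = H1 - lr * g_H1
        weights = [W - lr * g for W, g in zip(weights, g_W)]
    inputs, _ = forward(H1, weights)
    return losses, inputs, labels


if __name__ == "__main__":
    losses, features, labels = train_dufm()
    print("initial loss:", losses[0], "final loss:", losses[-1])
    for l, H in enumerate(features, start=1):
        print("layer", l, "within-class variability:",
              within_class_variability(H, labels))
```

    In this sketch the first-layer features are optimized as free variables alongside the weights, which is the defining idea of an unconstrained features model; a within-class variability close to zero at every layer would be the signature of DNC. Whether and how fast that happens depends on the assumed hyperparameters.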
    
  Publishing Year
    2023
  Date Published
    2023-12-15
  Proceedings Title
    37th Annual Conference on Neural Information Processing Systems
  Acknowledgement
    M. M. is partially supported by the 2019 Lopez-Loreta Prize. The authors would like to thank Eugenia Iofinova, Bernd Prach and Simone Bombari for valuable feedback on the manuscript.
  Conference
    
      NeurIPS: Neural Information Processing Systems
    
  Conference Location
    
      New Orleans, LA, United States
    
  Conference Date
    
      2023-12-10 – 2023-12-16
    
  Cite this
    Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal for the deep unconstrained features model. In: 37th Annual Conference on Neural Information Processing Systems; 2023.
    Súkeník, P., Mondelli, M., & Lampert, C. (2023). Deep neural collapse is provably optimal for the deep unconstrained features model. In 37th Annual Conference on Neural Information Processing Systems. New Orleans, LA, United States.
    Súkeník, Peter, Marco Mondelli, and Christoph Lampert. “Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model.” In 37th Annual Conference on Neural Information Processing Systems, 2023.
    P. Súkeník, M. Mondelli, and C. Lampert, “Deep neural collapse is provably optimal for the deep unconstrained features model,” in 37th Annual Conference on Neural Information Processing Systems, New Orleans, LA, United States, 2023.
    Súkeník P, Mondelli M, Lampert C. 2023. Deep neural collapse is provably optimal for the deep unconstrained features model. 37th Annual Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, NeurIPS.
    Súkeník, Peter, et al. “Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model.” 37th Annual Conference on Neural Information Processing Systems, 2023.
  
  Copyright Statement
    This Item is protected by copyright and/or related rights. [...]
  Access Level
     Open Access
Sources
 arXiv 2305.13165