---
res:
  bibo_abstract:
  - "The rising footprint of machine learning has led to a focus on imposing model sparsity
    as a means of reducing computational and memory costs. For deep neural networks
    (DNNs), the state-of-the-art accuracy-vs-sparsity trade-off is achieved by heuristics inspired
    by the classical Optimal Brain Surgeon (OBS) framework [LeCun et al., 1989,
    Hassibi and Stork, 1992, Hassibi et al., 1993], which leverages loss curvature information
    to make better pruning decisions. Yet, these results still lack a solid theoretical
    understanding, and it is unclear whether they can be improved by leveraging
    connections to the wealth of work on sparse recovery algorithms. In this paper,
    we draw new connections between these two areas and present new sparse recovery
    algorithms inspired by the OBS framework that come with theoretical guarantees
    under reasonable assumptions and have strong practical performance. Specifically,
    our work starts from the observation that we can leverage curvature information
    in OBS-like fashion in the projection step of classic iterative sparse recovery
    algorithms such as IHT. We show for the first time that this leads
    to improved convergence bounds under standard assumptions. Furthermore, we present
    extensions of this approach to the practical task of obtaining accurate sparse DNNs,
    and validate it experimentally at scale for Transformer-based models on vision
    and language tasks.@eng"
  bibo_authorlist:
  - foaf_Person:
      foaf_givenName: Diyuan
      foaf_name: Wu, Diyuan
      foaf_surname: Wu
      foaf_workInfoHomepage: http://www.librecat.org/personId=1a5914c2-896a-11ed-bdf8-fb80621a0635
  - foaf_Person:
      foaf_givenName: Ionut-Vlad
      foaf_name: Modoranu, Ionut-Vlad
      foaf_surname: Modoranu
      foaf_workInfoHomepage: http://www.librecat.org/personId=449f7a18-f128-11eb-9611-9b430c0c6333
  - foaf_Person:
      foaf_givenName: Mher
      foaf_name: Safaryan, Mher
      foaf_surname: Safaryan
      foaf_workInfoHomepage: http://www.librecat.org/personId=dd546b39-0804-11ed-9c55-ef075c39778d
  - foaf_Person:
      foaf_givenName: Denis
      foaf_name: Kuznedelev, Denis
      foaf_surname: Kuznedelev
  - foaf_Person:
      foaf_givenName: Dan-Adrian
      foaf_name: Alistarh, Dan-Adrian
      foaf_surname: Alistarh
      foaf_workInfoHomepage: http://www.librecat.org/personId=4A899BFC-F248-11E8-B48F-1D18A9856A87
    orcid: 0000-0003-3650-940X
  bibo_volume: 37
  dct_date: 2024^xs_gYear
  dct_isPartOf:
  - http://id.crossref.org/issn/1049-5258
  dct_language: eng
  dct_publisher: Neural Information Processing Systems Foundation@
  dct_title: 'The iterative optimal brain surgeon: Faster sparse recovery by leveraging
    second-order information@'
...
