---
OA_place: publisher
_id: '10180'
abstract:
- lang: eng
  text: The growing energy and performance costs of deep learning have driven the
    community to reduce the size of neural networks by selectively pruning components.
    Similarly to their biological counterparts, sparse networks generalize just as
    well as, and sometimes even better than, the original dense networks. Sparsity promises
    to reduce the memory footprint of regular networks to fit mobile devices, as well
    as shorten training time for ever growing networks. In this paper, we survey prior
    work on sparsity in deep learning and provide an extensive tutorial of sparsification
    for both inference and training. We describe approaches to remove and add elements
    of neural networks, different training strategies to achieve model sparsity, and
    mechanisms to exploit sparsity in practice. Our work distills ideas from more
    than 300 research papers and provides guidance to practitioners who wish to utilize
    sparsity today, as well as to researchers whose goal is to push the frontier forward.
    We include the necessary background on mathematical methods in sparsification,
    describe phenomena such as early structure adaptation, the intricate relations
    between sparsity and the training process, and show techniques for achieving acceleration
    on real hardware. We also define a metric of pruned parameter efficiency that
    could serve as a baseline for comparison of different sparse networks. We close
    by speculating on how sparsity can improve future workloads and outline major
    open problems in the field.
acknowledgement: "We thank Doug Burger, Steve Scott, Marco Heddes, and the respective
  teams at Microsoft for inspiring discussions on the topic. We thank Angelika Steger
  for uplifting debates about the connections to biological brains, Sidak Pal Singh
  for his support regarding experimental results, and Utku Evci as well as Xin Wang
  for comments on previous versions of this work. Special thanks go to Bernhard
  Schölkopf, our JMLR editor Samy Bengio, and the three anonymous reviewers who provided
  excellent comprehensive, pointed, and deep review comments that improved the quality
  of our manuscript significantly."
article_processing_charge: No
article_type: original
arxiv: 1
author:
- first_name: Torsten
  full_name: Hoefler, Torsten
  last_name: Hoefler
- first_name: Dan-Adrian
  full_name: Alistarh, Dan-Adrian
  id: 4A899BFC-F248-11E8-B48F-1D18A9856A87
  last_name: Alistarh
  orcid: 0000-0003-3650-940X
- first_name: Tal
  full_name: Ben-Nun, Tal
  last_name: Ben-Nun
- first_name: Nikoli
  full_name: Dryden, Nikoli
  last_name: Dryden
- first_name: Elena-Alexandra
  full_name: Peste, Elena-Alexandra
  id: 32D78294-F248-11E8-B48F-1D18A9856A87
  last_name: Peste
citation:
  ama: 'Hoefler T, Alistarh D-A, Ben-Nun T, Dryden N, Krumes A. Sparsity in deep learning:
    Pruning and growth for efficient inference and training in neural networks. <i>Journal
    of Machine Learning Research</i>. 2021;22(241):1-124.'
  apa: 'Hoefler, T., Alistarh, D.-A., Ben-Nun, T., Dryden, N., &#38; Krumes, A. (2021).
    Sparsity in deep learning: Pruning and growth for efficient inference and training
    in neural networks. <i>Journal of Machine Learning Research</i>. ML Research Press.'
  chicago: 'Hoefler, Torsten, Dan-Adrian Alistarh, Tal Ben-Nun, Nikoli Dryden, and
    Alexandra Krumes. “Sparsity in Deep Learning: Pruning and Growth for Efficient
    Inference and Training in Neural Networks.” <i>Journal of Machine Learning Research</i>.
    ML Research Press, 2021.'
  ieee: 'T. Hoefler, D.-A. Alistarh, T. Ben-Nun, N. Dryden, and A. Krumes, “Sparsity
    in deep learning: Pruning and growth for efficient inference and training in neural
    networks,” <i>Journal of Machine Learning Research</i>, vol. 22, no. 241. ML Research
    Press, pp. 1–124, 2021.'
  ista: 'Hoefler T, Alistarh D-A, Ben-Nun T, Dryden N, Krumes A. 2021. Sparsity in
    deep learning: Pruning and growth for efficient inference and training in neural
    networks. Journal of Machine Learning Research. 22(241), 1–124.'
  mla: 'Hoefler, Torsten, et al. “Sparsity in Deep Learning: Pruning and Growth for
    Efficient Inference and Training in Neural Networks.” <i>Journal of Machine Learning
    Research</i>, vol. 22, no. 241, ML Research Press, 2021, pp. 1–124.'
  short: T. Hoefler, D.-A. Alistarh, T. Ben-Nun, N. Dryden, A. Krumes, Journal of
    Machine Learning Research 22 (2021) 1–124.
corr_author: '1'
date_created: 2021-10-24T22:01:34Z
date_published: 2021-09-01T00:00:00Z
date_updated: 2025-06-26T11:53:12Z
day: '01'
ddc:
- '000'
department:
- _id: DaAl
external_id:
  arxiv:
  - '2102.00554'
file:
- access_level: open_access
  checksum: 3389d9d01fc58f8fb4c1a53e14a8abbf
  content_type: application/pdf
  creator: cziletti
  date_created: 2021-10-27T15:34:18Z
  date_updated: 2021-10-27T15:34:18Z
  file_id: '10192'
  file_name: 2021_JMachLearnRes_Hoefler.pdf
  file_size: 3527521
  relation: main_file
  success: 1
file_date_updated: 2021-10-27T15:34:18Z
has_accepted_license: '1'
intvolume: '        22'
issue: '241'
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://www.jmlr.org/papers/v22/21-0366.html
month: '09'
oa: 1
oa_version: Published Version
page: 1-124
publication: Journal of Machine Learning Research
publication_identifier:
  eissn:
  - 1533-7928
  issn:
  - 1532-4435
publication_status: published
publisher: ML Research Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: 'Sparsity in deep learning: Pruning and growth for efficient inference and
  training in neural networks'
tmp:
  image: /images/cc_by.png
  legal_code_url: https://creativecommons.org/licenses/by/4.0/legalcode
  name: Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)
  short: CC BY (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 22
year: '2021'
...
