Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

Hoefler T, Alistarh D-A, Ben-Nun T, Dryden N, Peste E-A. 2021. Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. Journal of Machine Learning Research. 22(241), 1–124.

2021_JMachLearnRes_Hoefler.pdf, 3.53 MB [Published Version]
Journal Article | Published | English

Scopus indexed
Author
Hoefler, Torsten; Alistarh, Dan-Adrian (ISTA); Ben-Nun, Tal; Dryden, Nikoli; Peste, Elena-Alexandra (ISTA)
Department
Abstract
The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, sometimes even better than, the original dense networks. Sparsity promises to reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
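To make the survey's starting point concrete, the sketch below illustrates one-shot global magnitude pruning, the simplest of the sparsification approaches the paper covers: every weight whose absolute value falls below a globally chosen threshold is set to zero. This is a minimal NumPy sketch; the function name, thresholding scheme, and tensor sizes are illustrative assumptions and are not taken from the paper.

import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights.

    sparsity is the target fraction of zeroed parameters, in [0, 1).
    Illustrative sketch only; not the paper's reference implementation.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # Threshold = the k-th smallest absolute value over all parameters.
    flat = np.abs(weights).ravel()
    threshold = np.partition(flat, k - 1)[k - 1]
    # Keep only weights strictly above the threshold; the rest become zero.
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 90% of a random weight matrix (hypothetical sizes).
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))
W_sparse = magnitude_prune(W, 0.9)
print(f"achieved sparsity: {1 - np.count_nonzero(W_sparse) / W_sparse.size:.2%}")

In practice, the survey discusses many refinements of this basic idea, such as layer-wise versus global thresholds, gradual pruning schedules during training, and regrowth of pruned connections.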
Publishing Year
2021
Date Published
2021-09-01
Journal Title
Journal of Machine Learning Research
Publisher
Journal of Machine Learning Research
Acknowledgement
We thank Doug Burger, Steve Scott, Marco Heddes, and the respective teams at Microsoft for inspiring discussions on the topic. We thank Angelika Steger for uplifting debates about the connections to biological brains, Sidak Pal Singh for his support regarding experimental results, and Utku Evci as well as Xin Wang for comments on previous versions of this work. Special thanks go to Bernhard Schölkopf, our JMLR editor Samy Bengio, and the three anonymous reviewers who provided excellent, comprehensive, pointed, and deep review comments that improved the quality of our manuscript significantly.
Volume
22
Issue
241
Page
1-124
ISSN
eISSN
IST-REx-ID

Cite this

Hoefler T, Alistarh D-A, Ben-Nun T, Dryden N, Peste E-A. Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. Journal of Machine Learning Research. 2021;22(241):1-124.
Hoefler, T., Alistarh, D.-A., Ben-Nun, T., Dryden, N., & Peste, E.-A. (2021). Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. Journal of Machine Learning Research, 22(241), 1–124.
Hoefler, Torsten, Dan-Adrian Alistarh, Tal Ben-Nun, Nikoli Dryden, and Elena-Alexandra Peste. “Sparsity in Deep Learning: Pruning and Growth for Efficient Inference and Training in Neural Networks.” Journal of Machine Learning Research 22, no. 241 (2021): 1–124.
T. Hoefler, D.-A. Alistarh, T. Ben-Nun, N. Dryden, and E.-A. Peste, “Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks,” Journal of Machine Learning Research, vol. 22, no. 241. Journal of Machine Learning Research, pp. 1–124, 2021.
Hoefler T, Alistarh D-A, Ben-Nun T, Dryden N, Peste E-A. 2021. Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. Journal of Machine Learning Research. 22(241), 1–124.
Hoefler, Torsten, et al. “Sparsity in Deep Learning: Pruning and Growth for Efficient Inference and Training in Neural Networks.” Journal of Machine Learning Research, vol. 22, no. 241, Journal of Machine Learning Research, 2021, pp. 1–124.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0):
Main File(s)
Access Level
Open Access
Date Uploaded
2021-10-27
MD5 Checksum
3389d9d01fc58f8fb4c1a53e14a8abbf


Link(s) to Main File(s)
Access Level
Open Access

Sources

arXiv 2102.00554
