7 Publications


[7]
2024 | Published | Conference Paper | IST-REx-ID: 15011 | OA
Kurtic E, Hoefler T, Alistarh D-A. 2024. How to prune your language model: Recovering accuracy on the ‘Sparsity May Cry’ benchmark. Proceedings of Machine Learning Research. CPAL: Conference on Parsimony and Learning, PMLR, vol. 234, 542–553.
 
[6]
2024 | Published | Conference Paper | IST-REx-ID: 18975 | OA
Modoranu I-V, Kalinov A, Kurtic E, Frantar E, Alistarh D-A. 2024. Error feedback can accurately compress preconditioners. 41st International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 235, 35910–35933.
 
[5]
2024 | Published | Conference Paper | IST-REx-ID: 19510 | OA
Modoranu I-V, Safaryan M, Malinovsky G, Kurtic E, Robert T, Richtárik P, Alistarh D-A. 2024. MICROADAM: Accurate adaptive optimization with low space overhead and provable convergence. 38th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, Advances in Neural Information Processing Systems, vol. 37.
 
[4]
2023 | Published | Conference Paper | IST-REx-ID: 14460 | OA
Nikdan M, Pegolotti T, Iofinova EB, Kurtic E, Alistarh D-A. 2023. SparseProp: Efficient sparse backpropagation for faster training of neural networks at the edge. Proceedings of the 40th International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 202, 26215–26227.
 
[3]
2023 | Published | Conference Paper | IST-REx-ID: 13053 | OA
Krumes A, Vladu A, Kurtic E, Lampert C, Alistarh D-A. 2023. CrAM: A Compression-Aware Minimizer. 11th International Conference on Learning Representations. ICLR: International Conference on Learning Representations.
 
[2]
2022 | Published | Conference Paper | IST-REx-ID: 17088 | OA
Kurtic E, Campos D, Nguyen T, Frantar E, Kurtz M, Fineran B, Goin M, Alistarh D-A. 2022. The optimal BERT surgeon: Scalable and accurate second-order pruning for large language models. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. EMNLP: Conference on Empirical Methods in Natural Language Processing, 4163–4181.
 
[1]
2021 | Published | Conference Paper | IST-REx-ID: 11463 | OA
Frantar E, Kurtic E, Alistarh D-A. 2021. M-FAC: Efficient matrix-free approximations of second-order information. 35th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, Advances in Neural Information Processing Systems, vol. 34, 14873–14886.
 


Citation Style: ISTA Annual Report


