17 Publications

[17]
2025 | Published | Conference Paper | IST-REx-ID: 19877 | OA
E. Frantar, R. L. Castro, J. Chen, T. Hoefler, and D.-A. Alistarh, “MARLIN: Mixed-precision auto-regressive parallel inference on large language models,” in Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, Las Vegas, NV, United States, 2025, pp. 239–251.
[Published Version] View | Files available | DOI | arXiv
 
[16]
2024 | Published | Conference Paper | IST-REx-ID: 18113 | OA
V. Egiazarian, A. Panferov, D. Kuznedelev, E. Frantar, A. Babenko, and D.-A. Alistarh, “Extreme compression of large language models via additive quantization,” in Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 2024, vol. 235, pp. 12284–12303.
[Preprint] View | Download Preprint (ext.) | arXiv
 
[15]
2024 | Published | Conference Paper | IST-REx-ID: 18975 | OA
I.-V. Modoranu, A. Kalinov, E. Kurtic, E. Frantar, and D.-A. Alistarh, “Error feedback can accurately compress preconditioners,” in Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 2024, vol. 235, pp. 35910–35933.
[Preprint] View | Download Preprint (ext.) | arXiv
 
[14]
2024 | Published | Conference Paper | IST-REx-ID: 18977 | OA
T. Dettmers et al., “SpQR: A sparse-quantized representation for near-lossless LLM weight compression,” in 12th International Conference on Learning Representations, Vienna, Austria, 2024.
[Preprint] View | Download Preprint (ext.) | arXiv
 
[13]
2024 | Published | Thesis | IST-REx-ID: 17485 | OA
E. Frantar, “Compressing large neural networks: Algorithms, systems and scaling laws,” Institute of Science and Technology Austria, 2024.
[Published Version] View | Files available | DOI
 
[12]
2024 | Published | Conference Paper | IST-REx-ID: 18061 | OA
E. Frantar and D.-A. Alistarh, “QMoE: Sub-1-bit compression of trillion-parameter models,” in Proceedings of Machine Learning and Systems, Santa Clara, CA, United States, 2024, vol. 6.
[Published Version] View | Files available | Download Published Version (ext.)
 
[11]
2024 | Published | Conference Paper | IST-REx-ID: 18062 | OA
E. Frantar, C. R. Ruiz, N. Houlsby, D.-A. Alistarh, and U. Evci, “Scaling laws for sparsely-connected foundation models,” in 12th International Conference on Learning Representations, Vienna, Austria, 2024.
[Published Version] View | Files available | Download Published Version (ext.) | arXiv
 
[10]
2024 | Published | Conference Paper | IST-REx-ID: 18121 | OA
A. S. Moakhar, E. B. Iofinova, E. Frantar, and D.-A. Alistarh, “SPADE: Sparsity-guided debugging for deep neural networks,” in Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 2024, vol. 235, pp. 45955–45987.
[Preprint] View | Files available | Download Preprint (ext.) | arXiv
 
[9]
2024 | Published | Conference Paper | IST-REx-ID: 17456 | OA
I. Markov, K. Alimohammadi, E. Frantar, and D.-A. Alistarh, “L-GreCo: Layerwise-adaptive gradient compression for efficient data-parallel deep learning,” in Proceedings of Machine Learning and Systems, Athens, Greece, 2024, vol. 6.
[Published Version] View | Files available | Download Published Version (ext.) | arXiv
 
[8]
2024 | Research Data Reference | IST-REx-ID: 19884 | OA
E. Frantar, R. L. Castro, J. Chen, T. Hoefler, and D.-A. Alistarh, “MARLIN: Mixed-precision auto-regressive parallel inference on large language models.” Zenodo, 2024.
[Published Version] View | Files available | DOI | Download Published Version (ext.)
 
[7]
2023 | Published | Conference Paper | IST-REx-ID: 17378 | OA
E. Frantar, S. Ashkboos, T. Hoefler, and D.-A. Alistarh, “OPTQ: Accurate post-training quantization for generative pre-trained transformers,” in 11th International Conference on Learning Representations, Kigali, Rwanda, 2023.
[Published Version] View | Files available
 
[6]
2023 | Published | Conference Paper | IST-REx-ID: 14458 | OA
E. Frantar and D.-A. Alistarh, “SparseGPT: Massive language models can be accurately pruned in one-shot,” in Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, United States, 2023, vol. 202, pp. 10323–10337.
[Preprint] View | Files available | Download Preprint (ext.) | arXiv
 
[5]
2022 | Published | Conference Paper | IST-REx-ID: 17088 | OA
E. Kurtic et al., “The optimal BERT surgeon: Scalable and accurate second-order pruning for large language models,” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 2022, pp. 4163–4181.
[Published Version] View | Files available | DOI | arXiv
 
[4]
2022 | Published | Conference Paper | IST-REx-ID: 17087 | OA
E. Frantar, S. P. Singh, and D.-A. Alistarh, “Optimal brain compression: A framework for accurate post-training quantization and pruning,” in 36th Conference on Neural Information Processing Systems, New Orleans, LA, United States, 2022, vol. 35.
[Submitted Version] View | Files available | arXiv
 
[3]
2022 | Published | Conference Paper | IST-REx-ID: 17059 | OA
E. Frantar and D.-A. Alistarh, “SPDY: Accurate pruning with speedup guarantees,” in Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, United States, 2022, vol. 162, pp. 6726–6743.
[Published Version] View | Files available | WoS
 
[2]
2021 | Published | Conference Paper | IST-REx-ID: 11463 | OA
E. Frantar, E. Kurtic, and D.-A. Alistarh, “M-FAC: Efficient matrix-free approximations of second-order information,” in 35th Conference on Neural Information Processing Systems, Virtual, Online, 2021, vol. 34, pp. 14873–14886.
[Published Version] View | Download Published Version (ext.) | arXiv
 
[1]
2020 | Published | Conference Paper | IST-REx-ID: 8724 | OA
N. H. Konstantinov, E. Frantar, D.-A. Alistarh, and C. Lampert, “On the sample complexity of adversarial multi-source PAC learning,” in Proceedings of the 37th International Conference on Machine Learning, Online, 2020, vol. 119, pp. 5416–5425.
[Published Version] View | Files available | arXiv
 
