Elias Frantar
Graduate School
Alistarh Group
15 Publications
2024 | Published | Conference Paper | IST-REx-ID: 18975 |

Error feedback can accurately compress preconditioners
I.-V. Modoranu, A. Kalinov, E. Kurtic, E. Frantar, D.-A. Alistarh, in:, 41st International Conference on Machine Learning, ML Research Press, 2024, pp. 35910–35933.
[Preprint]
2024 | Published | Conference Paper | IST-REx-ID: 18977 |

SpQR: A sparse-quantized representation for near-lossless LLM weight compression
T. Dettmers, R.A. Svirschevski, V. Egiazarian, D. Kuznedelev, E. Frantar, S. Ashkboos, A. Borzunov, T. Hoefler, D.-A. Alistarh, in:, 12th International Conference on Learning Representations, OpenReview, 2024.
[Preprint]
2024 | Published | Conference Paper | IST-REx-ID: 17456 |

L-GreCo: Layerwise-adaptive gradient compression for efficient data-parallel deep learning
I. Markov, K. Alimohammadi, E. Frantar, D.-A. Alistarh, in:, P. Gibbons, G. Pekhimenko, C. De Sa (Eds.), Proceedings of Machine Learning and Systems, Association for Computing Machinery, 2024.
[Published Version]
2024 | Published | Thesis | IST-REx-ID: 17485 |

Compressing large neural networks: Algorithms, systems and scaling laws
E. Frantar, Compressing Large Neural Networks: Algorithms, Systems and Scaling Laws, Institute of Science and Technology Austria, 2024.
[Published Version]
2024 | Published | Conference Paper | IST-REx-ID: 18061 |

QMoE: Sub-1-bit compression of trillion parameter models
E. Frantar, D.-A. Alistarh, in:, P. Gibbons, G. Pekhimenko, C. De Sa (Eds.), Proceedings of Machine Learning and Systems, 2024.
[Published Version]
2024 | Published | Conference Paper | IST-REx-ID: 18062 |

Scaling laws for sparsely-connected foundation models
E. Frantar, C.R. Ruiz, N. Houlsby, D.-A. Alistarh, U. Evci, in:, The Twelfth International Conference on Learning Representations, 2024.
[Published Version]
2024 | Published | Conference Paper | IST-REx-ID: 18113 |

Extreme compression of large language models via additive quantization
V. Egiazarian, A. Panferov, D. Kuznedelev, E. Frantar, A. Babenko, D.-A. Alistarh, in:, Proceedings of the 41st International Conference on Machine Learning, ML Research Press, 2024, pp. 12284–12303.
[Preprint]
2024 | Published | Conference Paper | IST-REx-ID: 18121 |

SPADE: Sparsity-guided debugging for deep neural networks
A.S. Moakhar, E.B. Iofinova, E. Frantar, D.-A. Alistarh, in:, Proceedings of the 41st International Conference on Machine Learning, ML Research Press, 2024, pp. 45955–45987.
[Preprint]
2023 | Published | Conference Paper | IST-REx-ID: 14458 |

SparseGPT: Massive language models can be accurately pruned in one-shot
E. Frantar, D.-A. Alistarh, in:, Proceedings of the 40th International Conference on Machine Learning, ML Research Press, 2023, pp. 10323–10337.
[Preprint]
2023 | Published | Conference Paper | IST-REx-ID: 17378 |

OPTQ: Accurate post-training quantization for generative pre-trained transformers
E. Frantar, S. Ashkboos, T. Hoefler, D.-A. Alistarh, in:, 11th International Conference on Learning Representations, International Conference on Learning Representations, 2023.
[Published Version]
2022 | Published | Conference Paper | IST-REx-ID: 17059 |

SPDY: Accurate pruning with speedup guarantees
E. Frantar, D.-A. Alistarh, in:, 39th International Conference on Machine Learning, ML Research Press, 2022, pp. 6726–6743.
[Published Version]
2022 | Published | Conference Paper | IST-REx-ID: 17087 |

Optimal brain compression: A framework for accurate post-training quantization and pruning
E. Frantar, S.P. Singh, D.-A. Alistarh, in:, 36th Conference on Neural Information Processing Systems, ML Research Press, 2022.
[Submitted Version]
2022 | Published | Conference Paper | IST-REx-ID: 17088 |

The optimal BERT surgeon: Scalable and accurate second-order pruning for large language models
E. Kurtic, D. Campos, T. Nguyen, E. Frantar, M. Kurtz, B. Fineran, M. Goin, D.-A. Alistarh, in:, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2022, pp. 4163–4181.
[Published Version]
2021 | Published | Conference Paper | IST-REx-ID: 11463 |

M-FAC: Efficient matrix-free approximations of second-order information
E. Frantar, E. Kurtic, D.-A. Alistarh, in:, 35th Conference on Neural Information Processing Systems, Curran Associates, 2021, pp. 14873–14886.
[Published Version]
2020 | Published | Conference Paper | IST-REx-ID: 8724 |

On the sample complexity of adversarial multi-source PAC learning
N.H. Konstantinov, E. Frantar, D.-A. Alistarh, C. Lampert, in:, Proceedings of the 37th International Conference on Machine Learning, ML Research Press, 2020, pp. 5416–5425.
[Published Version]