17 Publications

[17]
2025 | Published | Conference Paper | IST-REx-ID: 19877 | OA
Frantar, Elias, Roberto L. Castro, Jiale Chen, Torsten Hoefler, and Dan-Adrian Alistarh. “MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models.” In Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 239–51. Association for Computing Machinery, 2025. https://doi.org/10.1145/3710848.3710871.
[Published Version] View | Files available | DOI | arXiv
 
[16]
2024 | Published | Conference Paper | IST-REx-ID: 18113 | OA
Egiazarian, Vage, Andrei Panferov, Denis Kuznedelev, Elias Frantar, Artem Babenko, and Dan-Adrian Alistarh. “Extreme Compression of Large Language Models via Additive Quantization.” In Proceedings of the 41st International Conference on Machine Learning, 235:12284–303. ML Research Press, 2024.
[Preprint] View | Download Preprint (ext.) | arXiv
 
[15]
2024 | Published | Conference Paper | IST-REx-ID: 18975 | OA
Modoranu, Ionut-Vlad, Aleksei Kalinov, Eldar Kurtic, Elias Frantar, and Dan-Adrian Alistarh. “Error Feedback Can Accurately Compress Preconditioners.” In 41st International Conference on Machine Learning, 235:35910–33. ML Research Press, 2024.
[Preprint] View | Download Preprint (ext.) | arXiv
 
[14]
2024 | Published | Conference Paper | IST-REx-ID: 18977 | OA
Dettmers, Tim, Ruslan A. Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, and Dan-Adrian Alistarh. “SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression.” In 12th International Conference on Learning Representations. OpenReview, 2024.
[Preprint] View | Download Preprint (ext.) | arXiv
 
[13]
2024 | Published | Thesis | IST-REx-ID: 17485 | OA
Frantar, Elias. “Compressing Large Neural Networks: Algorithms, Systems and Scaling Laws.” Institute of Science and Technology Austria, 2024. https://doi.org/10.15479/at:ista:17485.
[Published Version] View | Files available | DOI
 
[12]
2024 | Published | Conference Paper | IST-REx-ID: 18061 | OA
Frantar, Elias, and Dan-Adrian Alistarh. “QMoE: Sub-1-Bit Compression of Trillion Parameter Models.” In Proceedings of Machine Learning and Systems, edited by P. Gibbons, G. Pekhimenko, and C. De Sa, Vol. 6, 2024.
[Published Version] View | Files available | Download Published Version (ext.)
 
[11]
2024 | Published | Conference Paper | IST-REx-ID: 18062 | OA
Frantar, Elias, Carlos Riquelme Ruiz, Neil Houlsby, Dan-Adrian Alistarh, and Utku Evci. “Scaling Laws for Sparsely-Connected Foundation Models.” In The Twelfth International Conference on Learning Representations, 2024.
[Published Version] View | Files available | Download Published Version (ext.) | arXiv
 
[10]
2024 | Published | Conference Paper | IST-REx-ID: 18121 | OA
Moakhar, Arshia Soltani, Eugenia B. Iofinova, Elias Frantar, and Dan-Adrian Alistarh. “SPADE: Sparsity-Guided Debugging for Deep Neural Networks.” In Proceedings of the 41st International Conference on Machine Learning, 235:45955–87. ML Research Press, 2024.
[Preprint] View | Files available | Download Preprint (ext.) | arXiv
 
[9]
2024 | Published | Conference Paper | IST-REx-ID: 17456 | OA
Markov, Ilia, Kaveh Alimohammadi, Elias Frantar, and Dan-Adrian Alistarh. “L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient Data-Parallel Deep Learning.” In Proceedings of Machine Learning and Systems, edited by P. Gibbons, G. Pekhimenko, and C. De Sa, Vol. 6. Association for Computing Machinery, 2024.
[Published Version] View | Files available | Download Published Version (ext.) | arXiv
 
[8]
2024 | Research Data Reference | IST-REx-ID: 19884 | OA
Frantar, Elias, Roberto Castro, Jiale Chen, Torsten Hoefler, and Dan-Adrian Alistarh. “MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models.” Zenodo, 2024. https://doi.org/10.5281/ZENODO.14213091.
[Published Version] View | Files available | DOI | Download Published Version (ext.)
 
[7]
2023 | Published | Conference Paper | IST-REx-ID: 17378 | OA
Frantar, Elias, Saleh Ashkboos, Torsten Hoefler, and Dan-Adrian Alistarh. “OPTQ: Accurate Post-Training Quantization for Generative Pre-Trained Transformers.” In 11th International Conference on Learning Representations. International Conference on Learning Representations, 2023.
[Published Version] View | Files available
 
[6]
2023 | Published | Conference Paper | IST-REx-ID: 14458 | OA
Frantar, Elias, and Dan-Adrian Alistarh. “SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot.” In Proceedings of the 40th International Conference on Machine Learning, 202:10323–37. ML Research Press, 2023.
[Preprint] View | Files available | Download Preprint (ext.) | arXiv
 
[5]
2022 | Published | Conference Paper | IST-REx-ID: 17088 | OA
Kurtic, Eldar, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, and Dan-Adrian Alistarh. “The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models.” In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 4163–81. Association for Computational Linguistics, 2022. https://doi.org/10.18653/v1/2022.emnlp-main.279.
[Published Version] View | Files available | DOI | arXiv
 
[4]
2022 | Published | Conference Paper | IST-REx-ID: 17087 | OA
Frantar, Elias, Sidak Pal Singh, and Dan-Adrian Alistarh. “Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning.” In 36th Conference on Neural Information Processing Systems, Vol. 35. ML Research Press, 2022.
[Submitted Version] View | Files available | arXiv
 
[3]
2022 | Published | Conference Paper | IST-REx-ID: 17059 | OA
Frantar, Elias, and Dan-Adrian Alistarh. “SPDY: Accurate Pruning with Speedup Guarantees.” In 39th International Conference on Machine Learning, 162:6726–43. ML Research Press, 2022.
[Published Version] View | Files available | WoS
 
[2]
2021 | Published | Conference Paper | IST-REx-ID: 11463 | OA
Frantar, Elias, Eldar Kurtic, and Dan-Adrian Alistarh. “M-FAC: Efficient Matrix-Free Approximations of Second-Order Information.” In 35th Conference on Neural Information Processing Systems, 34:14873–86. Neural Information Processing Systems Foundation, 2021.
[Published Version] View | Download Published Version (ext.) | arXiv
 
[1]
2020 | Published | Conference Paper | IST-REx-ID: 8724 | OA
Konstantinov, Nikola H., Elias Frantar, Dan-Adrian Alistarh, and Christoph Lampert. “On the Sample Complexity of Adversarial Multi-Source PAC Learning.” In Proceedings of the 37th International Conference on Machine Learning, 119:5416–25. ML Research Press, 2020.
[Published Version] View | Files available | arXiv
 
