ISTA Research Explorer

Elias Frantar

17 Publications

[17]

2025 | Published | Conference Paper | IST-REx-ID: 19877 |

Frantar, Elias, et al. “MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models.” Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, Association for Computing Machinery, 2025, pp. 239–51, doi:10.1145/3710848.3710871.

[Published Version] View | Files available | DOI | WoS | arXiv

[16]

2024 | Published | Conference Paper | IST-REx-ID: 18113 |

Egiazarian, Vage, et al. “Extreme Compression of Large Language Models via Additive Quantization.” Proceedings of the 41st International Conference on Machine Learning, vol. 235, ML Research Press, 2024, pp. 12284–303.

[Preprint] View | Download Preprint (ext.) | arXiv

[15]

2024 | Published | Conference Paper | IST-REx-ID: 18975 |

Modoranu, Ionut-Vlad, et al. “Error Feedback Can Accurately Compress Preconditioners.” 41st International Conference on Machine Learning, vol. 235, ML Research Press, 2024, pp. 35910–33.

[Preprint] View | Download Preprint (ext.) | arXiv

[14]

2024 | Published | Conference Paper | IST-REx-ID: 18977 |

Dettmers, Tim, et al. “SpQR: A Sparse-Quantized Representation for near-Lossless LLM Weight Compression.” 12th International Conference on Learning Representations, OpenReview, 2024.

[Preprint] View | Download Preprint (ext.) | arXiv

[13]

2024 | Published | Thesis | IST-REx-ID: 17485 |

Frantar, Elias. Compressing Large Neural Networks: Algorithms, Systems and Scaling Laws. Institute of Science and Technology Austria, 2024, doi:10.15479/at:ista:17485.

[Published Version] View | Files available | DOI

[12]

2024 | Published | Conference Paper | IST-REx-ID: 18061 |

Frantar, Elias, and Dan-Adrian Alistarh. “QMoE: Sub-1-Bit Compression of Trillion Parameter Models.” Proceedings of Machine Learning and Systems, edited by P. Gibbons et al., vol. 6, 2024.

[Published Version] View | Files available | Download Published Version (ext.)

[11]

2024 | Published | Conference Paper | IST-REx-ID: 18062 |

Frantar, Elias, et al. “Scaling Laws for Sparsely-Connected Foundation Models.” The Twelfth International Conference on Learning Representations, 2024.

[Published Version] View | Files available | Download Published Version (ext.) | arXiv

[10]

2024 | Published | Conference Paper | IST-REx-ID: 18121 |

Moakhar, Arshia Soltani, et al. “SPADE: Sparsity-Guided Debugging for Deep Neural Networks.” Proceedings of the 41st International Conference on Machine Learning, vol. 235, ML Research Press, 2024, pp. 45955–87.

[Preprint] View | Files available | Download Preprint (ext.) | arXiv

[9]

2024 | Published | Conference Paper | IST-REx-ID: 17456 |

Markov, Ilia, et al. “L-GreCo: Layerwise-Adaptive Gradient Compression for Efficient Data-Parallel Deep Learning.” Proceedings of Machine Learning and Systems, edited by P. Gibbons et al., vol. 6, Association for Computing Machinery, 2024.

[Published Version] View | Files available | Download Published Version (ext.) | arXiv

[8]

2024 | Research Data Reference | IST-REx-ID: 19884 |

Frantar, Elias, et al. MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models. Zenodo, 2024, doi:10.5281/ZENODO.14213091.

[Published Version] View | Files available | DOI | Download Published Version (ext.)

[7]

2023 | Published | Conference Paper | IST-REx-ID: 17378 |

Frantar, Elias, et al. “OPTQ: Accurate Post-Training Quantization for Generative Pre-Trained Transformers.” 11th International Conference on Learning Representations, International Conference on Learning Representations, 2023.

[Published Version] View | Files available

[6]

2023 | Published | Conference Paper | IST-REx-ID: 14458 |

Frantar, Elias, and Dan-Adrian Alistarh. “SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot.” Proceedings of the 40th International Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 10323–37.

[Preprint] View | Files available | Download Preprint (ext.) | arXiv

[5]

2022 | Published | Conference Paper | IST-REx-ID: 17088 |

Kurtic, Eldar, et al. “The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models.” Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2022, pp. 4163–81, doi:10.18653/v1/2022.emnlp-main.279.

[Published Version] View | Files available | DOI | arXiv

[4]

2022 | Published | Conference Paper | IST-REx-ID: 17087 |

Frantar, Elias, et al. “Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning.” 36th Conference on Neural Information Processing Systems, vol. 35, ML Research Press, 2022.

[Submitted Version] View | Files available | arXiv

[3]

2022 | Published | Conference Paper | IST-REx-ID: 17059 |

Frantar, Elias, and Dan-Adrian Alistarh. “SPDY: Accurate Pruning with Speedup Guarantees.” 39th International Conference on Machine Learning, vol. 162, ML Research Press, 2022, pp. 6726–43.

[Published Version] View | Files available | WoS

[2]

2021 | Published | Conference Paper | IST-REx-ID: 11463 |

Frantar, Elias, et al. “M-FAC: Efficient Matrix-Free Approximations of Second-Order Information.” 35th Conference on Neural Information Processing Systems, vol. 34, Neural Information Processing Systems Foundation, 2021, pp. 14873–86.

[Published Version] View | Download Published Version (ext.) | arXiv

[1]

2020 | Published | Conference Paper | IST-REx-ID: 8724 |

Konstantinov, Nikola H., et al. “On the Sample Complexity of Adversarial Multi-Source PAC Learning.” Proceedings of the 37th International Conference on Machine Learning, vol. 119, ML Research Press, 2020, pp. 5416–25.

[Published Version] View | Files available | arXiv
