3 Publications

[3]
2025 | Published | Conference Paper | IST-REx-ID: 19877 | OA
Frantar, E., Castro, R. L., Chen, J., Hoefler, T., & Alistarh, D.-A. (2025). MARLIN: Mixed-precision auto-regressive parallel inference on Large Language Models. In Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (pp. 239–251). Las Vegas, NV, United States: Association for Computing Machinery. https://doi.org/10.1145/3710848.3710871
 
[2]
2025 | Published | Conference Paper | IST-REx-ID: 20032 | OA
Chen, J., Yao, D., Pervez, A. A., Alistarh, D.-A., & Locatello, F. (2025). Scalable mechanistic neural networks. In 13th International Conference on Learning Representations (pp. 63716–63737). Singapore, Singapore: OpenReview.
 
[1]
2024 | Research Data Reference | IST-REx-ID: 19884 | OA
Frantar, E., Castro, R., Chen, J., Hoefler, T., & Alistarh, D.-A. (2024). MARLIN: Mixed-precision auto-regressive parallel inference on Large Language Models. Zenodo. https://doi.org/10.5281/ZENODO.14213091
 
