1 Publication


2025 | Published | Conference Paper | IST-REx-ID: 19877 | OA
Frantar, E., Castro, R. L., Chen, J., Hoefler, T., & Alistarh, D.-A. (2025). MARLIN: Mixed-precision auto-regressive parallel inference on Large Language Models. In Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (pp. 239–251). Las Vegas, NV, United States: Association for Computing Machinery. https://doi.org/10.1145/3710848.3710871

Filters and Search Terms

isbn=9798400714436
