6 Publications


[6]
2025 | Published | Conference Paper | IST-REx-ID: 20034 | OA
Robert, T., Safaryan, M., Modoranu, I.-V., & Alistarh, D.-A. (2025). LDAdam: Adaptive optimization from low-dimensional gradient statistics. In 13th International Conference on Learning Representations (pp. 101877–101913). Singapore, Singapore: ICLR.
 
[5]
2024 | Published | Conference Paper | IST-REx-ID: 18976 | OA
Islamov, R., Safaryan, M., & Alistarh, D.-A. (2024). AsGrad: A sharp unified analysis of asynchronous-SGD algorithms. In Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (Vol. 238, pp. 649–657). Valencia, Spain: ML Research Press.
 
[4]
2024 | Published | Conference Paper | IST-REx-ID: 19510 | OA
Modoranu, I.-V., Safaryan, M., Malinovsky, G., Kurtic, E., Robert, T., Richtárik, P., & Alistarh, D.-A. (2024). MicroAdam: Accurate adaptive optimization with low space overhead and provable convergence. In 38th Conference on Neural Information Processing Systems (Vol. 37). Vancouver, Canada: Neural Information Processing Systems Foundation.
 
[3]
2024 | Published | Conference Paper | IST-REx-ID: 19518 | OA
Wu, D., Modoranu, I.-V., Safaryan, M., Kuznedelev, D., & Alistarh, D.-A. (2024). The iterative optimal brain surgeon: Faster sparse recovery by leveraging second-order information. In 38th Conference on Neural Information Processing Systems (Vol. 37). Vancouver, Canada: Neural Information Processing Systems Foundation.
 
[2]
2023 | Published | Journal Article | IST-REx-ID: 14815 | OA
Beznosikov, A., Horvath, S., Richtárik, P., & Safaryan, M. (2023). On biased compression for distributed learning. Journal of Machine Learning Research.
 
[1]
2023 | Published | Conference Paper | IST-REx-ID: 15363 | OA
Safaryan, M., Krumes, A., & Alistarh, D.-A. (2023). Knowledge distillation performs partial variance reduction. In 37th Conference on Neural Information Processing Systems (Vol. 36). New Orleans, LA, United States.
 


