Mher Safaryan
6 Publications
2025 | Published | Conference Paper | IST-REx-ID: 20034
Robert, Thomas, Mher Safaryan, Ionut-Vlad Modoranu, and Dan-Adrian Alistarh. “LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics.” In 13th International Conference on Learning Representations, 101877–913. ICLR, 2025.
[Published Version]
2024 | Published | Conference Paper | IST-REx-ID: 18976
Islamov, Rustem, Mher Safaryan, and Dan-Adrian Alistarh. “AsGrad: A Sharp Unified Analysis of Asynchronous-SGD Algorithms.” In Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, 238:649–57. ML Research Press, 2024.
[Preprint]
2024 | Published | Conference Paper | IST-REx-ID: 19510
Modoranu, Ionut-Vlad, Mher Safaryan, Grigory Malinovsky, Eldar Kurtic, Thomas Robert, Peter Richtárik, and Dan-Adrian Alistarh. “MICROADAM: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence.” In 38th Conference on Neural Information Processing Systems, Vol. 37. Neural Information Processing Systems Foundation, 2024.
[Preprint]
2024 | Published | Conference Paper | IST-REx-ID: 19518
Wu, Diyuan, Ionut-Vlad Modoranu, Mher Safaryan, Denis Kuznedelev, and Dan-Adrian Alistarh. “The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information.” In 38th Conference on Neural Information Processing Systems, Vol. 37. Neural Information Processing Systems Foundation, 2024.
[Preprint]
2023 | Published | Journal Article | IST-REx-ID: 14815
Beznosikov, Aleksandr, Samuel Horváth, Peter Richtárik, and Mher Safaryan. “On Biased Compression for Distributed Learning.” Journal of Machine Learning Research. Journal of Machine Learning Research, 2023.
[Published Version]
2023 | Published | Conference Paper | IST-REx-ID: 15363
Safaryan, Mher, Alexandra Krumes, and Dan-Adrian Alistarh. “Knowledge Distillation Performs Partial Variance Reduction.” In 37th Conference on Neural Information Processing Systems, Vol. 36, 2023.
[Published Version]