6 Publications


[6]
2025 | Published | Conference Paper | IST-REx-ID: 20034 | OA
Robert T, Safaryan M, Modoranu I-V, Alistarh D-A. LDAdam: Adaptive optimization from low-dimensional gradient statistics. In: 13th International Conference on Learning Representations. ICLR; 2025:101877-101913.
 
[5]
2024 | Published | Conference Paper | IST-REx-ID: 18976 | OA
Islamov R, Safaryan M, Alistarh D-A. AsGrad: A sharp unified analysis of asynchronous-SGD algorithms. In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics. Vol 238. ML Research Press; 2024:649-657.
 
[4]
2024 | Published | Conference Paper | IST-REx-ID: 19518 | OA
Wu D, Modoranu I-V, Safaryan M, Kuznedelev D, Alistarh D-A. The iterative optimal brain surgeon: Faster sparse recovery by leveraging second-order information. In: 38th Conference on Neural Information Processing Systems. Vol 37. Neural Information Processing Systems Foundation; 2024.
 
[3]
2024 | Published | Conference Paper | IST-REx-ID: 19510 | OA
Modoranu I-V, Safaryan M, Malinovsky G, et al. MicroAdam: Accurate adaptive optimization with low space overhead and provable convergence. In: 38th Conference on Neural Information Processing Systems. Vol 37. Neural Information Processing Systems Foundation; 2024.
 
[2]
2023 | Published | Journal Article | IST-REx-ID: 14815 | OA
Beznosikov A, Horvath S, Richtarik P, Safaryan M. On biased compression for distributed learning. Journal of Machine Learning Research. 2023;24:1-50.
 
[1]
2023 | Published | Conference Paper | IST-REx-ID: 15363 | OA
Safaryan M, Krumes A, Alistarh D-A. Knowledge distillation performs partial variance reduction. In: 36th Conference on Neural Information Processing Systems. Vol 36. 2023.
 


