ISTA Research Explorer

Mher Safaryan

6 Publications

[6]

2025 | Published | Conference Paper | IST-REx-ID: 20034

Robert T, Safaryan M, Modoranu I-V, Alistarh D-A. LDAdam: Adaptive optimization from low-dimensional gradient statistics. In: 13th International Conference on Learning Representations. ICLR; 2025:101877-101913.

[Published Version] View | Files available | arXiv

[5]

2024 | Published | Conference Paper | IST-REx-ID: 18976

Islamov R, Safaryan M, Alistarh D-A. AsGrad: A sharp unified analysis of asynchronous-SGD algorithms. In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics. Vol 238. ML Research Press; 2024:649-657.

[Preprint] View | Download Preprint (ext.) | arXiv

[4]

2024 | Published | Conference Paper | IST-REx-ID: 19510

Modoranu I-V, Safaryan M, Malinovsky G, et al. MICROADAM: Accurate adaptive optimization with low space overhead and provable convergence. In: 38th Conference on Neural Information Processing Systems. Vol 37. Neural Information Processing Systems Foundation; 2024.

[Preprint] View | Files available | Download Preprint (ext.) | arXiv

[3]

2024 | Published | Conference Paper | IST-REx-ID: 19518

Wu D, Modoranu I-V, Safaryan M, Kuznedelev D, Alistarh D-A. The iterative optimal brain surgeon: Faster sparse recovery by leveraging second-order information. In: 38th Conference on Neural Information Processing Systems. Vol 37. Neural Information Processing Systems Foundation; 2024.

[Preprint] View | Download Preprint (ext.) | arXiv

[2]

2023 | Published | Journal Article | IST-REx-ID: 14815

Beznosikov A, Horvath S, Richtarik P, Safaryan M. On biased compression for distributed learning. Journal of Machine Learning Research. 2023;24:1-50.

[Published Version] View | Files available | WoS | arXiv

[1]

2023 | Published | Conference Paper | IST-REx-ID: 15363

Safaryan M, Krumes A, Alistarh D-A. Knowledge distillation performs partial variance reduction. In: 36th Conference on Neural Information Processing Systems. Vol 36. 2023.

[Published Version] View | Files available | arXiv

