7 Publications
2025 | Published | Conference Paper | IST-REx-ID: 20038
Jin, Tian, Ahmed Imtiaz Humayun, Utku Evci, Suvinay Subramanian, Amir Yazdanbakhsh, Dan-Adrian Alistarh, and Gintare Karolina Dziugaite. “The Journey Matters: Average Parameter Count over Pre-Training Unifies Sparse and Dense Scaling Laws.” In 13th International Conference on Learning Representations, 85165–81. ICLR, 2025.
2025 | Published | Conference Paper | IST-REx-ID: 20033
Ildiz, M. Emrullah, Halil Alperen Gozeten, Ege Onur Taga, Marco Mondelli, and Samet Oymak. “High-Dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws.” In 13th International Conference on Learning Representations, 2967–3006. ICLR, 2025.
2025 | Published | Conference Paper | IST-REx-ID: 20037
Sawmya, Shashata, Linghao Kong, Ilia Markov, Dan-Adrian Alistarh, and Nir Shavit. “Wasserstein Distances, Neuronal Entanglement, and Sparsity.” In 13th International Conference on Learning Representations, 26244–74. ICLR, 2025.
2025 | Published | Conference Paper | IST-REx-ID: 20036
Pariza, Valentinos, Mohammadreza Salehi, Gertjan Burghouts, Francesco Locatello, and Yuki M. Asano. “Near, Far: Patch-Ordering Enhances Vision Foundation Models’ Scene Understanding.” In 13th International Conference on Learning Representations, 72303–30. ICLR, 2025.
2025 | Published | Conference Paper | IST-REx-ID: 20032
Chen, Jiale, Dingling Yao, Adeel A Pervez, Dan-Adrian Alistarh, and Francesco Locatello. “Scalable Mechanistic Neural Networks.” In 13th International Conference on Learning Representations, 63716–37. ICLR, 2025.
2025 | Published | Conference Paper | IST-REx-ID: 20035
Jacot, Arthur, Peter Súkeník, Zihan Wang, and Marco Mondelli. “Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse.” In 13th International Conference on Learning Representations, 1905–31. ICLR, 2025.
2025 | Published | Conference Paper | IST-REx-ID: 20034
Robert, Thomas, Mher Safaryan, Ionut-Vlad Modoranu, and Dan-Adrian Alistarh. “LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics.” In 13th International Conference on Learning Representations, 101877–913. ICLR, 2025.