7 Publications


2025 | Published | Conference Paper | IST-REx-ID: 20038 | OA
Jin, Tian, Ahmed Imtiaz Humayun, Utku Evci, Suvinay Subramanian, Amir Yazdanbakhsh, Dan-Adrian Alistarh, and Gintare Karolina Dziugaite. “The Journey Matters: Average Parameter Count over Pre-Training Unifies Sparse and Dense Scaling Laws.” In 13th International Conference on Learning Representations, 85165–81. ICLR, 2025.

2025 | Published | Conference Paper | IST-REx-ID: 20033 | OA
Ildiz, M. Emrullah, Halil Alperen Gozeten, Ege Onur Taga, Marco Mondelli, and Samet Oymak. “High-Dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws.” In 13th International Conference on Learning Representations, 2967–3006. ICLR, 2025.

2025 | Published | Conference Paper | IST-REx-ID: 20037 | OA
Sawmya, Shashata, Linghao Kong, Ilia Markov, Dan-Adrian Alistarh, and Nir Shavit. “Wasserstein Distances, Neuronal Entanglement, and Sparsity.” In 13th International Conference on Learning Representations, 26244–74. ICLR, 2025.

2025 | Published | Conference Paper | IST-REx-ID: 20036 | OA
Pariza, Valentinos, Mohammadreza Salehi, Gertjan Burghouts, Francesco Locatello, and Yuki M. Asano. “Near, Far: Patch-Ordering Enhances Vision Foundation Models’ Scene Understanding.” In 13th International Conference on Learning Representations, 72303–30. ICLR, 2025.

2025 | Published | Conference Paper | IST-REx-ID: 20032 | OA
Chen, Jiale, Dingling Yao, Adeel A Pervez, Dan-Adrian Alistarh, and Francesco Locatello. “Scalable Mechanistic Neural Networks.” In 13th International Conference on Learning Representations, 63716–37. ICLR, 2025.

2025 | Published | Conference Paper | IST-REx-ID: 20035 | OA
Jacot, Arthur, Peter Súkeník, Zihan Wang, and Marco Mondelli. “Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse.” In 13th International Conference on Learning Representations, 1905–31. ICLR, 2025.

2025 | Published | Conference Paper | IST-REx-ID: 20034 | OA
Robert, Thomas, Mher Safaryan, Ionut-Vlad Modoranu, and Dan-Adrian Alistarh. “LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics.” In 13th International Conference on Learning Representations, 101877–913. ICLR, 2025.

Filters and Search Terms

isbn=9798331320850
