7 Publications


2025 | Published | Conference Paper | IST-REx-ID: 20038 | OA
Jin T, Humayun AI, Evci U, Subramanian S, Yazdanbakhsh A, Alistarh D-A, Dziugaite GK. 2025. The journey matters: Average parameter count over pre-training unifies sparse and dense scaling laws. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 85165–85181.

2025 | Published | Conference Paper | IST-REx-ID: 20033 | OA
Emrullah Ildiz M, Gozeten HA, Taga EO, Mondelli M, Oymak S. 2025. High-dimensional analysis of knowledge distillation: Weak-to-Strong generalization and scaling laws. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 2967–3006.

2025 | Published | Conference Paper | IST-REx-ID: 20037 | OA
Sawmya S, Kong L, Markov I, Alistarh D-A, Shavit N. 2025. Wasserstein distances, neuronal entanglement, and sparsity. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 26244–26274.

2025 | Published | Conference Paper | IST-REx-ID: 20036 | OA
Pariza V, Salehi M, Burghouts G, Locatello F, Asano YM. 2025. Near, far: Patch-ordering enhances vision foundation models’ scene understanding. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 72303–72330.

2025 | Published | Conference Paper | IST-REx-ID: 20032 | OA
Chen J, Yao D, Pervez AA, Alistarh D-A, Locatello F. 2025. Scalable mechanistic neural networks. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 63716–63737.

2025 | Published | Conference Paper | IST-REx-ID: 20035 | OA
Jacot A, Súkeník P, Wang Z, Mondelli M. 2025. Wide neural networks trained with weight decay provably exhibit neural collapse. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 1905–1931.

2025 | Published | Conference Paper | IST-REx-ID: 20034 | OA
Robert T, Safaryan M, Modoranu I-V, Alistarh D-A. 2025. LDAdam: Adaptive optimization from low-dimensional gradient statistics. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 101877–101913.

Filters and Search Terms

isbn=9798331320850
