ISTA Research Explorer

7 Publications

2025 | Published | Conference Paper | IST-REx-ID: 20032 |

Chen, J., Yao, D., Pervez, A. A., Alistarh, D.-A., & Locatello, F. (2025). Scalable mechanistic neural networks. In 13th International Conference on Learning Representations (pp. 63716–63737). Singapore, Singapore: ICLR.

2025 | Published | Conference Paper | IST-REx-ID: 20033 |

Ildiz, M. E., Gozeten, H. A., Taga, E. O., Mondelli, M., & Oymak, S. (2025). High-dimensional analysis of knowledge distillation: Weak-to-strong generalization and scaling laws. In 13th International Conference on Learning Representations (pp. 2967–3006). Singapore, Singapore: ICLR.

2025 | Published | Conference Paper | IST-REx-ID: 20034 |

Robert, T., Safaryan, M., Modoranu, I.-V., & Alistarh, D.-A. (2025). LDAdam: Adaptive optimization from low-dimensional gradient statistics. In 13th International Conference on Learning Representations (pp. 101877–101913). Singapore, Singapore: ICLR.

2025 | Published | Conference Paper | IST-REx-ID: 20035 |

Jacot, A., Súkeník, P., Wang, Z., & Mondelli, M. (2025). Wide neural networks trained with weight decay provably exhibit neural collapse. In 13th International Conference on Learning Representations (pp. 1905–1931). Singapore, Singapore: ICLR.

2025 | Published | Conference Paper | IST-REx-ID: 20036 |

Pariza, V., Salehi, M., Burghouts, G., Locatello, F., & Asano, Y. M. (2025). Near, far: Patch-ordering enhances vision foundation models’ scene understanding. In 13th International Conference on Learning Representations (pp. 72303–72330). Singapore, Singapore: ICLR.

2025 | Published | Conference Paper | IST-REx-ID: 20037 |

Sawmya, S., Kong, L., Markov, I., Alistarh, D.-A., & Shavit, N. (2025). Wasserstein distances, neuronal entanglement, and sparsity. In 13th International Conference on Learning Representations (pp. 26244–26274). Singapore, Singapore: ICLR.

2025 | Published | Conference Paper | IST-REx-ID: 20038 |

Jin, T., Humayun, A. I., Evci, U., Subramanian, S., Yazdanbakhsh, A., Alistarh, D.-A., & Dziugaite, G. K. (2025). The journey matters: Average parameter count over pre-training unifies sparse and dense scaling laws. In 13th International Conference on Learning Representations (pp. 85165–85181). Singapore, Singapore: ICLR.
