Test-time training provably improves transformers as in-context learners
Conference Paper | Published | English
Author
Gozeten, Halil Alperen;
Ildiz, Muhammed Emrullah;
Zhang, Xuechen;
Soltanolkotabi, Mahdi;
Mondelli, Marco (ISTA);
Oymak, Samet
Series Title
PMLR
Abstract
Test-time training (TTT) methods explicitly update the weights of a model to adapt to the specific test instance, and they have found success in a variety of settings, including most recently language modeling and reasoning. To demystify this success, we investigate a gradient-based TTT algorithm for in-context learning, where we train a transformer model on the in-context demonstrations provided in the test prompt. Specifically, we provide a comprehensive theoretical characterization of linear transformers when the update rule is a single gradient step. Our theory (i) delineates the role of alignment between pretraining distribution and target task, (ii) demystifies how TTT can alleviate distribution shift, and (iii) quantifies the sample complexity of TTT, including how it can significantly reduce the eventual sample size required for in-context learning. As our empirical contribution, we study the benefits of TTT for TabPFN, a tabular foundation model. In line with our theory, we demonstrate that TTT significantly reduces the required sample size for tabular classification (3 to 5 times fewer), unlocking substantial inference efficiency at negligible training cost.
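The update rule analyzed in the paper is a single gradient step on the demonstrations contained in the test prompt. The PyTorch sketch below illustrates that generic procedure; the function name `ttt_predict`, the SGD learning rate, and the loss function are illustrative assumptions, not the authors' implementation (their theory concerns linear transformers).

```python
# A minimal sketch of single-gradient-step test-time training (TTT) on
# in-context demonstrations. Illustrative only; not the authors' code.
import copy
import torch

def ttt_predict(model, demos_x, demos_y, query_x, loss_fn, lr=1e-2):
    """Adapt a copy of `model` with ONE gradient step on the test prompt's
    demonstrations, then predict the query with the adapted weights."""
    adapted = copy.deepcopy(model)      # leave the pretrained weights intact
    adapted.train()
    preds = adapted(demos_x)            # forward pass on the demonstrations
    loss = loss_fn(preds, demos_y)
    loss.backward()                     # gradient of the demonstration loss
    with torch.no_grad():
        for p in adapted.parameters():  # single SGD update: w <- w - lr * grad
            if p.grad is not None:
                p -= lr * p.grad
    adapted.eval()
    with torch.no_grad():
        return adapted(query_x)         # prediction for the test query
```

Because the base model is deep-copied, the adaptation is per-instance and discarded afterward, matching the TTT setting in which each test prompt induces its own one-step update.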
Publishing Year
2025
Date Published
2025-11-30
Proceedings Title
Proceedings of the 42nd International Conference on Machine Learning
Publisher
ML Research Press
Acknowledgement
H.A.G., M.E.I., X.Z., and S.O. were supported in part by the NSF grants CCF-2046816, CCF-2403075, CCF-2008020, and the Office of Naval Research grant N000142412289. M.M. is funded by the European Union (ERC, INF2, project number 101161364). Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. M.S. is supported by the Packard Fellowship in Science and Engineering, a Sloan Research Fellowship in Mathematics, an NSF-CAREER award #1846369, the DARPA FastNICS program, NSF-CIF awards #1813877 and #2008443, and NIH DP2LM014564-01. The authors also acknowledge further support from Open Philanthropy, OpenAI, Amazon Research, Google Research, and Microsoft Research.
Volume
267
Page
20266–20295
Conference
ICML: International Conference on Machine Learning
Conference Location
Vancouver, Canada
Conference Date
2025-07-13 – 2025-07-19
Cite this
Gozeten HA, Ildiz ME, Zhang X, Soltanolkotabi M, Mondelli M, Oymak S. Test-time training provably improves transformers as in-context learners. In: Proceedings of the 42nd International Conference on Machine Learning. Vol 267. ML Research Press; 2025:20266-20295.
Gozeten, H. A., Ildiz, M. E., Zhang, X., Soltanolkotabi, M., Mondelli, M., & Oymak, S. (2025). Test-time training provably improves transformers as in-context learners. In Proceedings of the 42nd International Conference on Machine Learning (Vol. 267, pp. 20266–20295). Vancouver, Canada: ML Research Press.
Gozeten, Halil Alperen, Muhammed Emrullah Ildiz, Xuechen Zhang, Mahdi Soltanolkotabi, Marco Mondelli, and Samet Oymak. “Test-Time Training Provably Improves Transformers as In-Context Learners.” In Proceedings of the 42nd International Conference on Machine Learning, 267:20266–95. ML Research Press, 2025.
H. A. Gozeten, M. E. Ildiz, X. Zhang, M. Soltanolkotabi, M. Mondelli, and S. Oymak, “Test-time training provably improves transformers as in-context learners,” in Proceedings of the 42nd International Conference on Machine Learning, Vancouver, Canada, 2025, vol. 267, pp. 20266–20295.
Gozeten HA, Ildiz ME, Zhang X, Soltanolkotabi M, Mondelli M, Oymak S. 2025. Test-time training provably improves transformers as in-context learners. Proceedings of the 42nd International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 267, 20266–20295.
Gozeten, Halil Alperen, et al. “Test-Time Training Provably Improves Transformers as In-Context Learners.” Proceedings of the 42nd International Conference on Machine Learning, vol. 267, ML Research Press, 2025, pp. 20266–95.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0):
Main File(s)
File Name
2025_ICML_Gozeten.pdf
471.18 KB
Access Level
Open Access
Date Uploaded
2026-02-19
MD5 Checksum
f774f8619a0d72f3975d9cb23942a1e9
Sources
PMID: 41321376
