[{"corr_author":"1","oa":1,"related_material":{"link":[{"description":"News on ISTA website","relation":"press_release","url":"https://ista.ac.at/en/news/big-data-and-human-height/"}]},"date_updated":"2026-04-28T12:08:37Z","DOAJ_listed":"1","department":[{"_id":"MaMo"},{"_id":"MaRo"}],"publication":"Cell Genomics","_id":"21488","abstract":[{"text":"Human height is a model for the genetic analysis of complex traits, and recent studies suggest the presence of thousands of common genetic variant associations and hundreds of low-frequency/rare variants. Here, we develop a new algorithmic paradigm based on approximate message passing (genomic vector approximate message passing [gVAMP]) for identifying DNA sequence variants associated with complex traits and common diseases in large-scale whole-genome sequencing (WGS) data. We show that gVAMP accurately localizes associations to variants with the correct frequency and position in the DNA, outperforming existing fine-mapping methods in selecting the appropriate genetic variants within WGS data. We then apply gVAMP to jointly model the relationship of tens of millions of WGS variants with human height in hundreds of thousands of UK Biobank individuals. We identify 59 rare variants and gene burden scores alongside many hundreds of DNA regions containing common variant associations and show that understanding the genetic basis of complex traits will require the joint analysis of hundreds of millions of variables measured on millions of people. The polygenic risk scores obtained from gVAMP have high accuracy (including a prediction accuracy of ∼46% for human height) and outperform current methods for downstream tasks such as mixed linear model association testing across 13 UK Biobank traits. 
In conclusion, gVAMP offers a scalable foundation for a wider range of analyses in WGS data.","lang":"eng"}],"month":"02","article_type":"original","OA_type":"gold","quality_controlled":"1","citation":{"ama":"Depope A, Bajzik J, Mondelli M, Robinson MR. Joint modeling of whole-genome sequencing data for human height via approximate message passing. <i>Cell Genomics</i>. 2026. doi:<a href=\"https://doi.org/10.1016/j.xgen.2026.101162\">10.1016/j.xgen.2026.101162</a>","short":"A. Depope, J. Bajzik, M. Mondelli, M.R. Robinson, Cell Genomics (2026).","ieee":"A. Depope, J. Bajzik, M. Mondelli, and M. R. Robinson, “Joint modeling of whole-genome sequencing data for human height via approximate message passing,” <i>Cell Genomics</i>. Elsevier, 2026.","apa":"Depope, A., Bajzik, J., Mondelli, M., &#38; Robinson, M. R. (2026). Joint modeling of whole-genome sequencing data for human height via approximate message passing. <i>Cell Genomics</i>. Elsevier. <a href=\"https://doi.org/10.1016/j.xgen.2026.101162\">https://doi.org/10.1016/j.xgen.2026.101162</a>","ista":"Depope A, Bajzik J, Mondelli M, Robinson MR. 2026. Joint modeling of whole-genome sequencing data for human height via approximate message passing. Cell Genomics., 101162.","mla":"Depope, Al, et al. “Joint Modeling of Whole-Genome Sequencing Data for Human Height via Approximate Message Passing.” <i>Cell Genomics</i>, 101162, Elsevier, 2026, doi:<a href=\"https://doi.org/10.1016/j.xgen.2026.101162\">10.1016/j.xgen.2026.101162</a>.","chicago":"Depope, Al, Jakub Bajzik, Marco Mondelli, and Matthew Richard Robinson. “Joint Modeling of Whole-Genome Sequencing Data for Human Height via Approximate Message Passing.” <i>Cell Genomics</i>. Elsevier, 2026. 
<a href=\"https://doi.org/10.1016/j.xgen.2026.101162\">https://doi.org/10.1016/j.xgen.2026.101162</a>."},"article_number":"101162","user_id":"ba8df636-2132-11f1-aed0-ed93e2281fdd","main_file_link":[{"open_access":"1","url":"https://doi.org/10.1016/j.xgen.2026.101162"}],"publication_identifier":{"eissn":["2666-979X"]},"doi":"10.1016/j.xgen.2026.101162","has_accepted_license":"1","tmp":{"image":"/images/cc_by_nc_nd.png","name":"Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)","legal_code_url":"https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode","short":"CC BY-NC-ND (4.0)"},"date_published":"2026-02-18T00:00:00Z","language":[{"iso":"eng"}],"type":"journal_article","ddc":["000","570"],"day":"18","oa_version":"Published Version","title":"Joint modeling of whole-genome sequencing data for human height via approximate message passing","article_processing_charge":"Yes","year":"2026","author":[{"full_name":"Depope, Al","first_name":"Al","last_name":"Depope","id":"0b77531d-dbcd-11ea-9d1d-a8eee0bf3830"},{"id":"b995e25b-8c4b-11ed-a6d8-f71b7bcd6122","last_name":"Bajzik","first_name":"Jakub","full_name":"Bajzik, Jakub"},{"first_name":"Marco","full_name":"Mondelli, Marco","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","orcid":"0000-0002-3242-7020"},{"orcid":"0000-0001-8982-8813","id":"E5D42276-F5DA-11E9-8E24-6303E6697425","last_name":"Robinson","full_name":"Robinson, Matthew Richard","first_name":"Matthew Richard"}],"OA_place":"publisher","date_created":"2026-03-23T15:10:03Z","status":"public","acknowledgement":"We thank Malgorzata Borczyk for creating the gene burden scores. We thank Robin Beaumont, Amedeo Roberto Esposito, Gareth Hawkes, Philip Schniter, Matthew Stephens, Pragya Sur, Peter Visscher, Michael Weedon, and Harry Wright for providing valuable suggestions and comments on earlier versions of the work. This project was funded by a Lopez-Loreta Prize to M.M., an SNSF Eccellenza Grant to M.R.R. 
(PCEGP3-181181), an ERC Starting Grant to M.M. (INF2, project number 101161364), and core funding from ISTA. High-performance computing was supported by the Scientific Service Units (SSU) of ISTA through resources provided by Scientific Computing (SciComp). We would like to acknowledge the participants and investigators of the UK Biobank study. We gratefully acknowledge the All of Us participants for their contributions, without whom this research would not have been possible. We also thank the National Institutes of Health All of Us Research Program for making available the participant data (and/or samples and/or cohort) examined in this study.","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"},{"_id":"911e6d1f-16d5-11f0-9cad-c5c68c6a1cdf","grant_number":"101161364","name":"Inference in High Dimensions: Light-speed Algorithms and Information Limits"},{"grant_number":"PCEGP3_181181","name":"Improving estimation and prediction of common complex disease risk","_id":"9B8D11D6-BA93-11EA-9121-9846C619BF3A"}],"publication_status":"epub_ahead","publisher":"Elsevier"},{"month":"04","file":[{"date_updated":"2025-08-04T08:32:38Z","access_level":"open_access","relation":"main_file","file_name":"2025_ICLR_Ildiz.pdf","checksum":"5a38b093ebb4ee4eb662ea142621a5ca","creator":"dernst","content_type":"application/pdf","file_size":528171,"date_created":"2025-08-04T08:32:38Z","success":1,"file_id":"20112"}],"abstract":[{"lang":"eng","text":"A growing number of machine learning scenarios rely on knowledge distillation where one uses the output of a surrogate model as labels to supervise the training of a target model. 
In this work, we provide a sharp characterization of this process for ridgeless, high-dimensional regression, under two settings: (i) model shift, where the surrogate model is arbitrary, and (ii) distribution shift, where the surrogate model is the solution of empirical risk minimization with out-of-distribution data. In both cases, we characterize the precise risk of the target model through non-asymptotic bounds in terms of sample size and data distribution under mild conditions. As a consequence, we identify the form of the optimal surrogate model, which reveals the benefits and limitations of discarding weak features in a data-dependent fashion. In the context of weak-to-strong (W2S) generalization, this has the interpretation that (i) W2S training, with the surrogate as the weak model, can provably outperform training with strong labels under the same data budget, but (ii) it is unable to improve the data scaling law. We validate our results on numerical experiments both on ridgeless regression and on neural network architectures."}],"_id":"20033","publication":"13th International Conference on Learning Representations","department":[{"_id":"MaMo"}],"date_updated":"2025-08-04T08:33:58Z","external_id":{"arxiv":["2410.18837"]},"oa":1,"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"has_accepted_license":"1","publication_identifier":{"isbn":["9798331320850"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","file_date_updated":"2025-08-04T08:32:38Z","citation":{"short":"M. Emrullah Ildiz, H.A. Gozeten, E.O. Taga, M. Mondelli, S. Oymak, in:, 13th International Conference on Learning Representations, ICLR, 2025, pp. 2967–3006.","ama":"Emrullah Ildiz M, Gozeten HA, Taga EO, Mondelli M, Oymak S. High-dimensional analysis of knowledge distillation: Weak-to-Strong generalization and scaling laws. 
In: <i>13th International Conference on Learning Representations</i>. ICLR; 2025:2967-3006.","chicago":"Emrullah Ildiz, M., Halil Alperen Gozeten, Ege Onur Taga, Marco Mondelli, and Samet Oymak. “High-Dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws.” In <i>13th International Conference on Learning Representations</i>, 2967–3006. ICLR, 2025.","ista":"Emrullah Ildiz M, Gozeten HA, Taga EO, Mondelli M, Oymak S. 2025. High-dimensional analysis of knowledge distillation: Weak-to-Strong generalization and scaling laws. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 2967–3006.","apa":"Emrullah Ildiz, M., Gozeten, H. A., Taga, E. O., Mondelli, M., &#38; Oymak, S. (2025). High-dimensional analysis of knowledge distillation: Weak-to-Strong generalization and scaling laws. In <i>13th International Conference on Learning Representations</i> (pp. 2967–3006). Singapore, Singapore: ICLR.","mla":"Emrullah Ildiz, M., et al. “High-Dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws.” <i>13th International Conference on Learning Representations</i>, ICLR, 2025, pp. 2967–3006.","ieee":"M. Emrullah Ildiz, H. A. Gozeten, E. O. Taga, M. Mondelli, and S. Oymak, “High-dimensional analysis of knowledge distillation: Weak-to-Strong generalization and scaling laws,” in <i>13th International Conference on Learning Representations</i>, Singapore, Singapore, 2025, pp. 
2967–3006."},"quality_controlled":"1","conference":{"end_date":"2025-04-28","name":"ICLR: International Conference on Learning Representations","start_date":"2025-04-24","location":"Singapore, Singapore"},"OA_type":"diamond","page":"2967-3006","day":"01","ddc":["000"],"type":"conference","scopus_import":"1","language":[{"iso":"eng"}],"date_published":"2025-04-01T00:00:00Z","publisher":"ICLR","publication_status":"published","project":[{"name":"Inference in High Dimensions: Light-speed Algorithms and Information Limits","grant_number":"101161364","_id":"911e6d1f-16d5-11f0-9cad-c5c68c6a1cdf"}],"acknowledgement":"M.E.I., H.A.G., E.O.T., S.O. are supported by the NSF grants CCF-2046816, CCF-2403075, the Office of Naval Research grant N000142412289, an OpenAI Agentic AI Systems grant, and gifts by Open Philanthropy and Google Research. M. M. is funded by the European Union (ERC, INF2, project number 101161364). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. 
Neither the European Union nor the granting authority can be held responsible for them.","date_created":"2025-07-20T22:02:02Z","status":"public","OA_place":"publisher","author":[{"full_name":"Emrullah Ildiz, M.","first_name":"M.","last_name":"Emrullah Ildiz"},{"first_name":"Halil Alperen","full_name":"Gozeten, Halil Alperen","last_name":"Gozeten"},{"last_name":"Taga","full_name":"Taga, Ege Onur","first_name":"Ege Onur"},{"orcid":"0000-0002-3242-7020","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","full_name":"Mondelli, Marco","first_name":"Marco"},{"last_name":"Oymak","full_name":"Oymak, Samet","first_name":"Samet"}],"year":"2025","title":"High-dimensional analysis of knowledge distillation: Weak-to-Strong generalization and scaling laws","article_processing_charge":"No","oa_version":"Published Version","arxiv":1},{"external_id":{"arxiv":["2410.04887"]},"date_updated":"2025-08-04T08:47:00Z","oa":1,"corr_author":"1","abstract":[{"lang":"eng","text":"Deep neural networks (DNNs) at convergence consistently represent the training data in the last layer via a geometric structure referred to as neural collapse. This empirical evidence has spurred a line of theoretical research aimed at proving the emergence of neural collapse, mostly focusing on the unconstrained features model. Here, the features of the penultimate layer are free variables, which makes the model data-agnostic and puts into question its ability to capture DNN training. Our work addresses the issue, moving away from unconstrained features and studying DNNs that end with at least two linear layers. We first prove generic guarantees on neural collapse that assume (i) low training error and balancedness of linear layers (for within-class variability collapse), and (ii) bounded conditioning of the features before the linear part (for orthogonality of class-means, and their alignment with weight matrices). 
The balancedness refers to the fact that W_{ℓ+1}^⊤ W_{ℓ+1} ≈ W_ℓ W_ℓ^⊤ for any pair of consecutive weight matrices of the linear part, and the bounded conditioning requires a well-behaved ratio between largest and smallest non-zero singular values of the features. We then show that such assumptions hold for gradient descent training with weight decay: (i) for networks with a wide first layer, we prove low training error and balancedness, and (ii) for solutions that are either nearly optimal or stable under large learning rates, we additionally prove the bounded conditioning. Taken together, our results are the first to show neural collapse in the end-to-end training of DNNs."}],"_id":"20035","file":[{"creator":"dernst","checksum":"59c48c173887139647cc9839c0801136","file_id":"20114","content_type":"application/pdf","date_created":"2025-08-04T08:45:43Z","success":1,"file_size":1337236,"relation":"main_file","date_updated":"2025-08-04T08:45:43Z","access_level":"open_access","file_name":"2025_ICLR_Jacot.pdf"}],"month":"04","department":[{"_id":"MaMo"}],"publication":"13th International Conference on Learning Representations","citation":{"ieee":"A. Jacot, P. Súkeník, Z. Wang, and M. Mondelli, “Wide neural networks trained with weight decay provably exhibit neural collapse,” in <i>13th International Conference on Learning Representations</i>, Singapore, Singapore, 2025, pp. 1905–1931.","chicago":"Jacot, Arthur, Peter Súkeník, Zihan Wang, and Marco Mondelli. “Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse.” In <i>13th International Conference on Learning Representations</i>, 1905–31. ICLR, 2025.","mla":"Jacot, Arthur, et al. “Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse.” <i>13th International Conference on Learning Representations</i>, ICLR, 2025, pp. 1905–31.","apa":"Jacot, A., Súkeník, P., Wang, Z., &#38; Mondelli, M. (2025). Wide neural networks trained with weight decay provably exhibit neural collapse. 
In <i>13th International Conference on Learning Representations</i> (pp. 1905–1931). Singapore, Singapore: ICLR.","ista":"Jacot A, Súkeník P, Wang Z, Mondelli M. 2025. Wide neural networks trained with weight decay provably exhibit neural collapse. 13th International Conference on Learning Representations. ICLR: International Conference on Learning Representations, 1905–1931.","ama":"Jacot A, Súkeník P, Wang Z, Mondelli M. Wide neural networks trained with weight decay provably exhibit neural collapse. In: <i>13th International Conference on Learning Representations</i>. ICLR; 2025:1905-1931.","short":"A. Jacot, P. Súkeník, Z. Wang, M. Mondelli, in:, 13th International Conference on Learning Representations, ICLR, 2025, pp. 1905–1931."},"file_date_updated":"2025-08-04T08:45:43Z","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","conference":{"end_date":"2025-04-28","name":"ICLR: International Conference on Learning Representations","start_date":"2025-04-24","location":"Singapore, Singapore"},"OA_type":"diamond","quality_controlled":"1","has_accepted_license":"1","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"publication_identifier":{"isbn":["9798331320850"]},"date_published":"2025-04-01T00:00:00Z","type":"conference","day":"01","ddc":["000"],"page":"1905-1931","language":[{"iso":"eng"}],"scopus_import":"1","year":"2025","author":[{"first_name":"Arthur","full_name":"Jacot, Arthur","last_name":"Jacot"},{"full_name":"Súkeník, Peter","first_name":"Peter","last_name":"Súkeník","id":"d64d6a8d-eb8e-11eb-b029-96fd216dec3c"},{"full_name":"Wang, Zihan","first_name":"Zihan","last_name":"Wang"},{"orcid":"0000-0002-3242-7020","first_name":"Marco","full_name":"Mondelli, Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli"}],"OA_place":"publisher","arxiv":1,"oa_version":"Published 
Version","article_processing_charge":"No","title":"Wide neural networks trained with weight decay provably exhibit neural collapse","publication_status":"published","publisher":"ICLR","date_created":"2025-07-20T22:02:02Z","status":"public","acknowledgement":"M. M. and P. S. are funded by the European Union (ERC, INF2, project number 101161364). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.","project":[{"name":"Inference in High Dimensions: Light-speed Algorithms and Information Limits","grant_number":"101161364","_id":"911e6d1f-16d5-11f0-9cad-c5c68c6a1cdf"}]},{"doi":"10.1109/TIT.2025.3587340","publication_identifier":{"issn":["0018-9448"],"eissn":["1557-9654"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2405.08352","open_access":"1"}],"citation":{"short":"A.R. Esposito, M. Gastpar, I. Issa, IEEE Transactions on Information Theory (2025).","ama":"Esposito AR, Gastpar M, Issa I. Sibson α-mutual information and its variational representations. <i>IEEE Transactions on Information Theory</i>. 2025. doi:<a href=\"https://doi.org/10.1109/TIT.2025.3587340\">10.1109/TIT.2025.3587340</a>","chicago":"Esposito, Amedeo Roberto, Michael Gastpar, and Ibrahim Issa. “Sibson α-Mutual Information and Its Variational Representations.” <i>IEEE Transactions on Information Theory</i>. IEEE, 2025. <a href=\"https://doi.org/10.1109/TIT.2025.3587340\">https://doi.org/10.1109/TIT.2025.3587340</a>.","apa":"Esposito, A. R., Gastpar, M., &#38; Issa, I. (2025). Sibson α-mutual information and its variational representations. <i>IEEE Transactions on Information Theory</i>. IEEE. <a href=\"https://doi.org/10.1109/TIT.2025.3587340\">https://doi.org/10.1109/TIT.2025.3587340</a>","ista":"Esposito AR, Gastpar M, Issa I. 
2025. Sibson α-mutual information and its variational representations. IEEE Transactions on Information Theory.","mla":"Esposito, Amedeo Roberto, et al. “Sibson α-Mutual Information and Its Variational Representations.” <i>IEEE Transactions on Information Theory</i>, IEEE, 2025, doi:<a href=\"https://doi.org/10.1109/TIT.2025.3587340\">10.1109/TIT.2025.3587340</a>.","ieee":"A. R. Esposito, M. Gastpar, and I. Issa, “Sibson α-mutual information and its variational representations,” <i>IEEE Transactions on Information Theory</i>. IEEE, 2025."},"quality_controlled":"1","OA_type":"green","article_type":"original","month":"07","abstract":[{"text":"Information measures can be constructed from Rényi divergences much like mutual information from Kullback-Leibler divergence. One such information measure is known as Sibson α-mutual information and has received renewed attention recently in several contexts: concentration of measure under dependence, statistical learning, hypothesis testing, and estimation theory. In this paper, we survey and extend the state of the art. In particular, we introduce variational representations for Sibson α-mutual information and employ them in each described context to derive novel results. Namely, we produce generalized Transportation-Cost inequalities and Fano-type inequalities. 
We also present an overview of known applications, spanning from learning theory and Bayesian risk to universal prediction.","lang":"eng"}],"_id":"20081","department":[{"_id":"MaMo"}],"publication":"IEEE Transactions on Information Theory","external_id":{"arxiv":["2405.08352"]},"date_updated":"2026-02-16T11:49:40Z","oa":1,"publisher":"IEEE","publication_status":"epub_ahead","status":"public","date_created":"2025-07-27T22:01:26Z","OA_place":"repository","year":"2025","author":[{"id":"9583e921-e1ad-11ec-9862-cef099626dc9","last_name":"Esposito","full_name":"Esposito, Amedeo Roberto","first_name":"Amedeo Roberto"},{"full_name":"Gastpar, Michael","first_name":"Michael","last_name":"Gastpar"},{"last_name":"Issa","first_name":"Ibrahim","full_name":"Issa, Ibrahim"}],"oa_version":"Preprint","title":"Sibson α-mutual information and its variational representations","article_processing_charge":"No","arxiv":1,"type":"journal_article","day":"11","scopus_import":"1","language":[{"iso":"eng"}],"date_published":"2025-07-11T00:00:00Z"},{"publisher":"ML Research Press","publication_status":"published","status":"public","date_created":"2025-09-07T22:01:35Z","acknowledgement":"We thank Junhyung Park for valuable feedback on the manuscript. AT was supported by a PhD fellowship from the Swiss Data Science Center. TW was supported by the SNF Grant 204439. 
This work was done in part while TW and FY were visiting the Simons Institute for the Theory of Computing.","OA_place":"repository","year":"2025","author":[{"first_name":"Tobias","full_name":"Wegel, Tobias","last_name":"Wegel"},{"id":"d0258e7b-50b8-11ef-ad56-8b9f537b6b1b","last_name":"Kovačević","full_name":"Kovačević, Filip","first_name":"Filip"},{"first_name":"Alexandru","full_name":"Ţifrea, Alexandru","last_name":"Ţifrea"},{"last_name":"Yang","full_name":"Yang, Fanny","first_name":"Fanny"}],"oa_version":"Preprint","article_processing_charge":"No","title":"Learning Pareto manifolds in high dimensions: How can regularization help?","arxiv":1,"type":"conference","page":"4591-4599","day":"01","scopus_import":"1","language":[{"iso":"eng"}],"date_published":"2025-05-01T00:00:00Z","publication_identifier":{"eissn":["2640-3498"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2503.08849","open_access":"1"}],"citation":{"ieee":"T. Wegel, F. Kovačević, A. Ţifrea, and F. Yang, “Learning Pareto manifolds in high dimensions: How can regularization help?,” in <i>The 28th International Conference on Artificial Intelligence and Statistics</i>, Mai Khao, Thailand, 2025, vol. 258, pp. 4591–4599.","apa":"Wegel, T., Kovačević, F., Ţifrea, A., &#38; Yang, F. (2025). Learning Pareto manifolds in high dimensions: How can regularization help? In <i>The 28th International Conference on Artificial Intelligence and Statistics</i> (Vol. 258, pp. 4591–4599). Mai Khao, Thailand: ML Research Press.","mla":"Wegel, Tobias, et al. “Learning Pareto Manifolds in High Dimensions: How Can Regularization Help?” <i>The 28th International Conference on Artificial Intelligence and Statistics</i>, vol. 258, ML Research Press, 2025, pp. 4591–99.","ista":"Wegel T, Kovačević F, Ţifrea A, Yang F. 2025. Learning Pareto manifolds in high dimensions: How can regularization help? 
The 28th International Conference on Artificial Intelligence and Statistics. AISTATS: Conference on Artificial Intelligence and Statistics, PMLR, vol. 258, 4591–4599.","chicago":"Wegel, Tobias, Filip Kovačević, Alexandru Ţifrea, and Fanny Yang. “Learning Pareto Manifolds in High Dimensions: How Can Regularization Help?” In <i>The 28th International Conference on Artificial Intelligence and Statistics</i>, 258:4591–99. ML Research Press, 2025.","ama":"Wegel T, Kovačević F, Ţifrea A, Yang F. Learning Pareto manifolds in high dimensions: How can regularization help? In: <i>The 28th International Conference on Artificial Intelligence and Statistics</i>. Vol 258. ML Research Press; 2025:4591-4599.","short":"T. Wegel, F. Kovačević, A. Ţifrea, F. Yang, in:, The 28th International Conference on Artificial Intelligence and Statistics, ML Research Press, 2025, pp. 4591–4599."},"quality_controlled":"1","conference":{"name":"AISTATS: Conference on Artificial Intelligence and Statistics","end_date":"2025-05-05","start_date":"2025-05-03","location":"Mai Khao, Thailand"},"OA_type":"green","month":"05","intvolume":"       258","_id":"20300","abstract":[{"lang":"eng","text":"Simultaneously addressing multiple objectives is becoming increasingly important in modern machine learning. At the same time, data is often high-dimensional and costly to label. For a single objective such as prediction risk, conventional regularization techniques are known to improve generalization when the data exhibits low-dimensional structure like sparsity. However, it is largely unexplored how to leverage this structure in the context of multi-objective learning (MOL) with multiple competing objectives. In this work, we discuss how the application of vanilla regularization approaches can fail, and propose a two-stage MOL framework that can successfully leverage low-dimensional structure. 
We demonstrate its effectiveness experimentally for multi-distribution learning and fairness-risk trade-offs."}],"department":[{"_id":"MaMo"}],"publication":"The 28th International Conference on Artificial Intelligence and Statistics","volume":258,"external_id":{"arxiv":["2503.08849"]},"date_updated":"2025-09-09T07:00:34Z","alternative_title":["PMLR"],"oa":1},{"_id":"20667","abstract":[{"text":"We explore the problem of mean estimation for a high-dimensional binary symmetric Gaussian mixture model, where the label (sign) follows a time-inhomogeneous Markov chain. We propose a spectral estimator based on a partition of a subset of the samples to blocks. We develop a computationally efficient algorithm to find the optimal blocks, and derive minimax lower bounds on the estimation loss of any estimator, which establish the effectiveness of our proposed estimator. The resulting minimax rate illuminates the interplay between the sample size, dimension, signal strength, and the memory on the loss.","lang":"eng"}],"type":"conference","day":"20","month":"10","language":[{"iso":"eng"}],"department":[{"_id":"MaMo"}],"publication":"2025 IEEE International Symposium on Information Theory Proceedings","scopus_import":"1","date_updated":"2025-11-24T08:53:34Z","date_published":"2025-10-20T00:00:00Z","doi":"10.1109/ISIT63088.2025.11195426","publication_status":"published","publisher":"IEEE","status":"public","date_created":"2025-11-23T23:01:39Z","publication_identifier":{"isbn":["9798331543990"],"issn":["2157-8095"]},"acknowledgement":"The research of A.K. and N.W. was supported by the Israel Science Foundation (ISF), grant no. 1782/22.","year":"2025","citation":{"chicago":"El Latif Kadry, Abd, Yihan Zhang, and Nir Weinberger. “Mean Estimation in High-Dimensional Binary Timeinhomogeneous Markov Gaussian Mixture Models.” In <i>2025 IEEE International Symposium on Information Theory Proceedings</i>. IEEE, 2025. 
<a href=\"https://doi.org/10.1109/ISIT63088.2025.11195426\">https://doi.org/10.1109/ISIT63088.2025.11195426</a>.","mla":"El Latif Kadry, Abd, et al. “Mean Estimation in High-Dimensional Binary Timeinhomogeneous Markov Gaussian Mixture Models.” <i>2025 IEEE International Symposium on Information Theory Proceedings</i>, IEEE, 2025, doi:<a href=\"https://doi.org/10.1109/ISIT63088.2025.11195426\">10.1109/ISIT63088.2025.11195426</a>.","ista":"El Latif Kadry A, Zhang Y, Weinberger N. 2025. Mean estimation in high-dimensional binary timeinhomogeneous Markov Gaussian mixture models. 2025 IEEE International Symposium on Information Theory Proceedings. ISIT: International Symposium on Information Theory.","apa":"El Latif Kadry, A., Zhang, Y., &#38; Weinberger, N. (2025). Mean estimation in high-dimensional binary timeinhomogeneous Markov Gaussian mixture models. In <i>2025 IEEE International Symposium on Information Theory Proceedings</i>. Ann Arbor, MI, United States: IEEE. <a href=\"https://doi.org/10.1109/ISIT63088.2025.11195426\">https://doi.org/10.1109/ISIT63088.2025.11195426</a>","ieee":"A. El Latif Kadry, Y. Zhang, and N. Weinberger, “Mean estimation in high-dimensional binary timeinhomogeneous Markov Gaussian mixture models,” in <i>2025 IEEE International Symposium on Information Theory Proceedings</i>, Ann Arbor, MI, United States, 2025.","short":"A. El Latif Kadry, Y. Zhang, N. Weinberger, in:, 2025 IEEE International Symposium on Information Theory Proceedings, IEEE, 2025.","ama":"El Latif Kadry A, Zhang Y, Weinberger N. Mean estimation in high-dimensional binary timeinhomogeneous Markov Gaussian mixture models. In: <i>2025 IEEE International Symposium on Information Theory Proceedings</i>. IEEE; 2025. 
doi:<a href=\"https://doi.org/10.1109/ISIT63088.2025.11195426\">10.1109/ISIT63088.2025.11195426</a>"},"author":[{"full_name":"El Latif Kadry, Abd","first_name":"Abd","last_name":"El Latif Kadry"},{"orcid":"0000-0002-6465-6258","first_name":"Yihan","full_name":"Zhang, Yihan","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c"},{"last_name":"Weinberger","full_name":"Weinberger, Nir","first_name":"Nir"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","OA_type":"closed access","conference":{"end_date":"2025-06-27","name":"ISIT: International Symposium on Information Theory","start_date":"2025-06-22","location":"Ann Arbor, MI, United States"},"oa_version":"None","title":"Mean estimation in high-dimensional binary timeinhomogeneous Markov Gaussian mixture models","article_processing_charge":"No","quality_controlled":"1"},{"issue":"3-4","date_published":"2025-09-02T00:00:00Z","scopus_import":"1","language":[{"iso":"eng"}],"type":"journal_article","ddc":["000"],"page":"193-304","day":"02","oa_version":"Published Version","title":"Spectral estimators for structured generalized linear models via approximate message passing","article_processing_charge":"No","OA_place":"publisher","year":"2025","author":[{"orcid":"0000-0002-6465-6258","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan","full_name":"Zhang, Yihan"},{"last_name":"Ji","first_name":"Hong Chang","full_name":"Ji, Hong Chang"},{"last_name":"Venkataramanan","full_name":"Venkataramanan, Ramji","first_name":"Ramji"},{"full_name":"Mondelli, Marco","first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","orcid":"0000-0002-3242-7020"}],"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"date_created":"2025-12-07T23:02:02Z","status":"public","acknowledgement":"This work was done when Y. Z. and H. C. J. were at the Institute of Science and Technology Austria. Y. Z. 
thanks Hugo Latourelle-Vigeant for bringing [53] to the authors’ attention.\r\nY. Z. and M. M. are partially supported by the 2019 Lopez-Loreta Prize and by the Interdisciplinary Projects Committee (IPC) at ISTA. H. C. J. is supported by the ERC Advanced Grant “RMTBeyond” No. 101020331.","publisher":"EMS Press","publication_status":"published","oa":1,"corr_author":"1","date_updated":"2025-12-09T13:53:31Z","department":[{"_id":"MaMo"}],"publication":"Mathematical Statistics and Learning","volume":8,"article_type":"original","month":"09","_id":"20734","intvolume":"         8","abstract":[{"text":"We consider the problem of parameter estimation in a high-dimensional generalized linear model. Spectral methods obtained via the principal eigenvector of a suitable data-dependent matrix provide a simple yet surprisingly effective solution. However, despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructured (i.i.d. Gaussian and Haar orthogonal) designs. In contrast, real-world data matrices are highly structured and exhibit non-trivial correlations. To address the problem, we consider correlated Gaussian designs capturing the anisotropic nature of the features via a covariance matrix Σ. Our main result is a precise asymptotic characterization of the performance of spectral estimators. This allows us to identify the optimal preprocessing that minimizes the number of samples needed for parameter estimation. Surprisingly, such preprocessing is universal across a broad set of designs, which partly addresses a conjecture on optimal spectral estimators for rotationally invariant models. Our principled approach vastly improves upon previous heuristic methods, including for designs common in computational imaging and genetics. 
The proposed methodology, based on approximate message passing, is broadly applicable and opens the way to the precise characterization of spiked matrices and of the corresponding spectral methods in a variety of settings.","lang":"eng"}],"file":[{"date_updated":"2025-12-09T13:50:03Z","access_level":"open_access","relation":"main_file","file_name":"2025_MathStatLearning_Zhang.pdf","checksum":"55a1bd9c1b6b0198c42504fb94f4ad4c","creator":"dernst","content_type":"application/pdf","date_created":"2025-12-09T13:50:03Z","file_size":1379626,"success":1,"file_id":"20752"}],"quality_controlled":"1","PlanS_conform":"1","OA_type":"diamond","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","file_date_updated":"2025-12-09T13:50:03Z","citation":{"short":"Y. Zhang, H.C. Ji, R. Venkataramanan, M. Mondelli, Mathematical Statistics and Learning 8 (2025) 193–304.","ama":"Zhang Y, Ji HC, Venkataramanan R, Mondelli M. Spectral estimators for structured generalized linear models via approximate message passing. <i>Mathematical Statistics and Learning</i>. 2025;8(3-4):193-304. doi:<a href=\"https://doi.org/10.4171/MSL/52\">10.4171/MSL/52</a>","mla":"Zhang, Yihan, et al. “Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing.” <i>Mathematical Statistics and Learning</i>, vol. 8, no. 3–4, EMS Press, 2025, pp. 193–304, doi:<a href=\"https://doi.org/10.4171/MSL/52\">10.4171/MSL/52</a>.","apa":"Zhang, Y., Ji, H. C., Venkataramanan, R., &#38; Mondelli, M. (2025). Spectral estimators for structured generalized linear models via approximate message passing. <i>Mathematical Statistics and Learning</i>. EMS Press. <a href=\"https://doi.org/10.4171/MSL/52\">https://doi.org/10.4171/MSL/52</a>","ista":"Zhang Y, Ji HC, Venkataramanan R, Mondelli M. 2025. Spectral estimators for structured generalized linear models via approximate message passing. Mathematical Statistics and Learning. 
8(3–4), 193–304.","chicago":"Zhang, Yihan, Hong Chang Ji, Ramji Venkataramanan, and Marco Mondelli. “Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing.” <i>Mathematical Statistics and Learning</i>. EMS Press, 2025. <a href=\"https://doi.org/10.4171/MSL/52\">https://doi.org/10.4171/MSL/52</a>.","ieee":"Y. Zhang, H. C. Ji, R. Venkataramanan, and M. Mondelli, “Spectral estimators for structured generalized linear models via approximate message passing,” <i>Mathematical Statistics and Learning</i>, vol. 8, no. 3–4. EMS Press, pp. 193–304, 2025."},"publication_identifier":{"issn":["2520-2316"],"eissn":["2520-2324"]},"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"doi":"10.4171/MSL/52","has_accepted_license":"1"},{"PlanS_conform":"1","OA_type":"hybrid","quality_controlled":"1","file_date_updated":"2025-08-05T12:22:04Z","citation":{"short":"M. Fornasier, T. Klock, M. Mondelli, M. Rauchensteiner, Applied and Computational Harmonic Analysis 77 (2025).","ama":"Fornasier M, Klock T, Mondelli M, Rauchensteiner M. Efficient identification of wide shallow neural networks with biases. <i>Applied and Computational Harmonic Analysis</i>. 2025;77. doi:<a href=\"https://doi.org/10.1016/j.acha.2025.101749\">10.1016/j.acha.2025.101749</a>","apa":"Fornasier, M., Klock, T., Mondelli, M., &#38; Rauchensteiner, M. (2025). Efficient identification of wide shallow neural networks with biases. <i>Applied and Computational Harmonic Analysis</i>. Elsevier. <a href=\"https://doi.org/10.1016/j.acha.2025.101749\">https://doi.org/10.1016/j.acha.2025.101749</a>","mla":"Fornasier, Massimo, et al. “Efficient Identification of Wide Shallow Neural Networks with Biases.” <i>Applied and Computational Harmonic Analysis</i>, vol. 
77, 101749, Elsevier, 2025, doi:<a href=\"https://doi.org/10.1016/j.acha.2025.101749\">10.1016/j.acha.2025.101749</a>.","ista":"Fornasier M, Klock T, Mondelli M, Rauchensteiner M. 2025. Efficient identification of wide shallow neural networks with biases. Applied and Computational Harmonic Analysis. 77, 101749.","chicago":"Fornasier, Massimo, Timo Klock, Marco Mondelli, and Michael Rauchensteiner. “Efficient Identification of Wide Shallow Neural Networks with Biases.” <i>Applied and Computational Harmonic Analysis</i>. Elsevier, 2025. <a href=\"https://doi.org/10.1016/j.acha.2025.101749\">https://doi.org/10.1016/j.acha.2025.101749</a>.","ieee":"M. Fornasier, T. Klock, M. Mondelli, and M. Rauchensteiner, “Efficient identification of wide shallow neural networks with biases,” <i>Applied and Computational Harmonic Analysis</i>, vol. 77. Elsevier, 2025."},"article_number":"101749","user_id":"317138e5-6ab7-11ef-aa6d-ffef3953e345","publication_identifier":{"eissn":["1096-603X"],"issn":["1063-5203"]},"doi":"10.1016/j.acha.2025.101749","has_accepted_license":"1","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"oa":1,"corr_author":"1","external_id":{"isi":["001430202700001"]},"date_updated":"2025-09-30T10:35:09Z","volume":77,"department":[{"_id":"MaMo"}],"publication":"Applied and Computational Harmonic Analysis","_id":"19065","intvolume":"        77","abstract":[{"lang":"eng","text":"The identification of the parameters of a neural network from finite samples of input-output pairs is often referred to as the teacher-student model, and this model has represented a popular framework for understanding training and generalization. 
Even if the problem is NP-complete in the worst case, a rapidly growing literature – after adding suitable distributional assumptions – has established finite sample identification of two-layer networks with a number of neurons (math. formula), D being the input dimension. For the range (math. formula) the problem becomes harder, and truly little is known for networks parametrized by biases as well. This paper fills the gap by providing efficient algorithms and rigorous theoretical guarantees of finite sample identification for such wider shallow networks with biases. Our approach is based on a two-step pipeline: first, we recover the direction of the weights, by exploiting second order information; next, we identify the signs by suitable algebraic evaluations, and we recover the biases by empirical risk minimization via gradient descent. Numerical results demonstrate the effectiveness of our approach."}],"file":[{"file_name":"2025_ApplCompAnalysis_Fornasier.pdf","date_updated":"2025-08-05T12:22:04Z","access_level":"open_access","relation":"main_file","file_size":2223350,"date_created":"2025-08-05T12:22:04Z","success":1,"content_type":"application/pdf","file_id":"20131","checksum":"657f258af0f7ca135e69959fd13e2d63","creator":"dernst"}],"month":"06","article_type":"original","oa_version":"Published Version","article_processing_charge":"No","title":"Efficient identification of wide shallow neural networks with biases","year":"2025","author":[{"last_name":"Fornasier","first_name":"Massimo","full_name":"Fornasier, Massimo"},{"full_name":"Klock, Timo","first_name":"Timo","last_name":"Klock"},{"orcid":"0000-0002-3242-7020","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, Marco","first_name":"Marco"},{"first_name":"Michael","full_name":"Rauchensteiner, 
Michael","last_name":"Rauchensteiner"}],"OA_place":"publisher","date_created":"2025-02-23T23:01:54Z","status":"public","publication_status":"published","publisher":"Elsevier","date_published":"2025-06-01T00:00:00Z","language":[{"iso":"eng"}],"isi":1,"scopus_import":"1","type":"journal_article","ddc":["000"],"day":"01"},{"publication_identifier":{"issn":["1868-8969"],"isbn":["9783959773614"]},"has_accepted_license":"1","doi":"10.4230/LIPIcs.ITCS.2025.82","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"conference":{"location":"New York, NY, United States","start_date":"2025-01-07","name":"ITCS: Innovations in Theoretical Computer Science","end_date":"2025-01-10"},"OA_type":"gold","quality_controlled":"1","article_number":"82","file_date_updated":"2025-03-04T09:35:57Z","citation":{"ama":"Resch N, Yuan C, Zhang Y. Tight bounds on list-decodable and list-recoverable zero-rate codes. In: <i>16th Innovations in Theoretical Computer Science Conference</i>. Vol 325. Schloss Dagstuhl - Leibniz-Zentrum für Informatik; 2025. doi:<a href=\"https://doi.org/10.4230/LIPIcs.ITCS.2025.82\">10.4230/LIPIcs.ITCS.2025.82</a>","short":"N. Resch, C. Yuan, Y. Zhang, in:, 16th Innovations in Theoretical Computer Science Conference, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2025.","ieee":"N. Resch, C. Yuan, and Y. Zhang, “Tight bounds on list-decodable and list-recoverable zero-rate codes,” in <i>16th Innovations in Theoretical Computer Science Conference</i>, New York, NY, United States, 2025, vol. 325.","chicago":"Resch, Nicolas, Chen Yuan, and Yihan Zhang. “Tight Bounds on List-Decodable and List-Recoverable Zero-Rate Codes.” In <i>16th Innovations in Theoretical Computer Science Conference</i>, Vol. 325. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2025. 
<a href=\"https://doi.org/10.4230/LIPIcs.ITCS.2025.82\">https://doi.org/10.4230/LIPIcs.ITCS.2025.82</a>.","ista":"Resch N, Yuan C, Zhang Y. 2025. Tight bounds on list-decodable and list-recoverable zero-rate codes. 16th Innovations in Theoretical Computer Science Conference. ITCS: Innovations in Theoretical Computer Science, LIPIcs, vol. 325, 82.","apa":"Resch, N., Yuan, C., &#38; Zhang, Y. (2025). Tight bounds on list-decodable and list-recoverable zero-rate codes. In <i>16th Innovations in Theoretical Computer Science Conference</i> (Vol. 325). New York, NY, United States: Schloss Dagstuhl - Leibniz-Zentrum für Informatik. <a href=\"https://doi.org/10.4230/LIPIcs.ITCS.2025.82\">https://doi.org/10.4230/LIPIcs.ITCS.2025.82</a>","mla":"Resch, Nicolas, et al. “Tight Bounds on List-Decodable and List-Recoverable Zero-Rate Codes.” <i>16th Innovations in Theoretical Computer Science Conference</i>, vol. 325, 82, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2025, doi:<a href=\"https://doi.org/10.4230/LIPIcs.ITCS.2025.82\">10.4230/LIPIcs.ITCS.2025.82</a>."},"user_id":"317138e5-6ab7-11ef-aa6d-ffef3953e345","volume":325,"publication":"16th Innovations in Theoretical Computer Science Conference","department":[{"_id":"MaMo"}],"file":[{"date_created":"2025-03-04T09:35:57Z","file_size":898601,"success":1,"content_type":"application/pdf","file_id":"19286","checksum":"df3921ddf1b360b07f43d427fea51242","creator":"dernst","file_name":"2025_LIPIcs_Resch.pdf","access_level":"open_access","date_updated":"2025-03-04T09:35:57Z","relation":"main_file"}],"_id":"19281","intvolume":"       325","abstract":[{"lang":"eng","text":"In this work, we consider the list-decodability and list-recoverability of codes in the zero-rate regime. 
Briefly, a code 𝒞 ⊆ [q]ⁿ is (p,𝓁,L)-list-recoverable if for all tuples of input lists (Y₁,… ,Y_n) with each Y_i ⊆ [q] and |Y_i| = 𝓁, the number of codewords c ∈ 𝒞 such that c_i ∉ Y_i for at most pn choices of i ∈ [n] is less than L; list-decoding is the special case of 𝓁 = 1. In recent work by Resch, Yuan and Zhang (ICALP 2023) the zero-rate threshold for list-recovery was determined for all parameters: that is, the work explicitly computes p_*: = p_*(q,𝓁,L) with the property that for all ε > 0 (a) there exist positive-rate (p_*-ε,𝓁,L)-list-recoverable codes, and (b) any (p_*+ε,𝓁,L)-list-recoverable code has rate 0. In fact, in the latter case the code has constant size, independent on n. However, the constant size in their work is quite large in 1/ε, at least |𝒞| ≥ (1/(ε))^O(q^L).\r\nOur contribution in this work is to show that for all choices of q,𝓁 and L with q ≥ 3, any (p_*+ε,𝓁,L)-list-recoverable code must have size O_{q,𝓁,L}(1/ε), and furthermore this upper bound is complemented by a matching lower bound Ω_{q,𝓁,L}(1/ε). This greatly generalizes work by Alon, Bukh and Polyanskiy (IEEE Trans. Inf. Theory 2018) which focused only on the case of binary alphabet (and thus necessarily only list-decoding). We remark that we can in fact recover the same result for q = 2 and even L, as obtained by Alon, Bukh and Polyanskiy: we thus strictly generalize their work. \r\nOur main technical contribution is to (a) properly define a linear programming relaxation of the list-recovery condition over large alphabets; and (b) to demonstrate that a certain function defined on a q-ary probability simplex is maximized by the uniform distribution. This represents the core challenge in generalizing to larger q (as a binary simplex can be naturally identified with a one-dimensional interval). 
We can subsequently re-utilize certain Schur convexity and convexity properties established for a related function by Resch, Yuan and Zhang along with ideas of Alon, Bukh and Polyanskiy."}],"month":"02","corr_author":"1","oa":1,"date_updated":"2025-09-30T10:42:35Z","alternative_title":["LIPIcs"],"external_id":{"arxiv":["2309.01800"],"isi":["001532717300082"]},"acknowledgement":"The research of C. Yuan was supported in part by the National Key R&D Program of China under Grant 2023YFE0123900 and Natural Science Foundation of Shanghai under the 2024 Shanghai Action Plan for Science, Technology and Innovation Grant 24BC3200700. The research of N. Resch is supported in part by an NWO (Dutch Research Council) grant with number C.2324.0590, and this work was done in part while he was visiting the Simons Institute for the Theory of Computing, supported by DOE grant #DE-SC0024124.","date_created":"2025-03-02T23:01:53Z","status":"public","publication_status":"published","publisher":"Schloss Dagstuhl - Leibniz-Zentrum für Informatik","arxiv":1,"article_processing_charge":"Yes","title":"Tight bounds on list-decodable and list-recoverable zero-rate codes","oa_version":"Published Version","author":[{"last_name":"Resch","first_name":"Nicolas","full_name":"Resch, Nicolas"},{"last_name":"Yuan","full_name":"Yuan, Chen","first_name":"Chen"},{"full_name":"Zhang, Yihan","first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","last_name":"Zhang","orcid":"0000-0002-6465-6258"}],"year":"2025","OA_place":"publisher","isi":1,"language":[{"iso":"eng"}],"scopus_import":"1","day":"11","ddc":["510","000"],"type":"conference","date_published":"2025-02-11T00:00:00Z"},{"year":"2025","author":[{"last_name":"Bombari","id":"ca726dda-de17-11ea-bc14-f9da834f63aa","first_name":"Simone","full_name":"Bombari, Simone"},{"orcid":"0000-0002-3242-7020","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, 
Marco"}],"OA_place":"publisher","arxiv":1,"oa_version":"Published Version","article_processing_charge":"No","title":"Spurious correlations in high dimensional regression: The roles of regularization, simplicity bias and over-parameterization","publication_status":"published","publisher":"ML Research Press","date_created":"2026-02-18T11:58:00Z","status":"public","acknowledgement":"Marco Mondelli is funded by the European Union (ERC, INF2, project number 101161364). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. Simone Bombari is supported by a Google PhD fellowship. The authors would like to thank GuanWen Qiu for helpful discussions.","project":[{"_id":"911e6d1f-16d5-11f0-9cad-c5c68c6a1cdf","grant_number":"101161364","name":"Inference in High Dimensions: Light-speed Algorithms and Information Limits"},{"name":"Trustworthy Deep Learning Theory: Private Over-Parameterized Models and Robust LLMs","_id":"92099302-16d5-11f0-9cad-f9a785f54fbd"}],"date_published":"2025-07-30T00:00:00Z","type":"conference","page":"4839-4873","day":"30","ddc":["000"],"language":[{"iso":"eng"}],"file_date_updated":"2026-02-19T08:04:38Z","citation":{"chicago":"Bombari, Simone, and Marco Mondelli. “Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and over-Parameterization.” In <i>Proceedings of the 42nd International Conference on Machine Learning</i>, 267:4839–73. ML Research Press, 2025.","mla":"Bombari, Simone, and Marco Mondelli. “Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and over-Parameterization.” <i>Proceedings of the 42nd International Conference on Machine Learning</i>, vol. 267, ML Research Press, 2025, pp. 4839–73.","apa":"Bombari, S., &#38; Mondelli, M. (2025). 
Spurious correlations in high dimensional regression: The roles of regularization, simplicity bias and over-parameterization. In <i>Proceedings of the 42nd International Conference on Machine Learning</i> (Vol. 267, pp. 4839–4873). Vancouver, Canada: ML Research Press.","ista":"Bombari S, Mondelli M. 2025. Spurious correlations in high dimensional regression: The roles of regularization, simplicity bias and over-parameterization. Proceedings of the 42nd International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 267, 4839–4873.","ieee":"S. Bombari and M. Mondelli, “Spurious correlations in high dimensional regression: The roles of regularization, simplicity bias and over-parameterization,” in <i>Proceedings of the 42nd International Conference on Machine Learning</i>, Vancouver, Canada, 2025, vol. 267, pp. 4839–4873.","short":"S. Bombari, M. Mondelli, in:, Proceedings of the 42nd International Conference on Machine Learning, ML Research Press, 2025, pp. 4839–4873.","ama":"Bombari S, Mondelli M. Spurious correlations in high dimensional regression: The roles of regularization, simplicity bias and over-parameterization. In: <i>Proceedings of the 42nd International Conference on Machine Learning</i>. Vol 267. 
ML Research Press; 2025:4839-4873."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","OA_type":"gold","conference":{"end_date":"2025-07-19","name":"ICML: International Conference on Machine Learning","start_date":"2025-07-13","location":"Vancouver, Canada"},"quality_controlled":"1","has_accepted_license":"1","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"publication_identifier":{"eissn":["2640-3498"]},"external_id":{"arxiv":["2502.01347"]},"alternative_title":["PMLR"],"date_updated":"2026-02-19T08:08:55Z","oa":1,"corr_author":"1","_id":"21324","abstract":[{"lang":"eng","text":"Learning models have been shown to rely on spurious correlations between non-predictive features and the associated labels in the training data, with negative implications on robustness, bias and fairness. In this work, we provide a statistical characterization of this phenomenon for high-dimensional regression, when the data contains a predictive core feature x and a spurious feature y. Specifically, we quantify the amount of spurious correlations C learned via linear regression, in terms of the data covariance and the strength λ of the ridge regularization. As a consequence, we first capture the simplicity of y through the spectrum of its covariance, and its correlation with x through the Schur complement of the full data covariance. Next, we prove a trade-off between C and the in-distribution test loss L, by showing that the value of λ that minimizes L lies in an interval where C is increasing. Finally, we investigate the effects of over-parameterization via the random features model, by showing its equivalence to regularized linear regression. 
Our theoretical results are supported by numerical experiments on Gaussian, Color-MNIST, and CIFAR-10 datasets."}],"intvolume":"       267","file":[{"file_name":"2025_ICML_Bombari.pdf","access_level":"open_access","date_updated":"2026-02-19T08:04:38Z","relation":"main_file","content_type":"application/pdf","date_created":"2026-02-19T08:04:38Z","success":1,"file_size":887526,"file_id":"21335","checksum":"d4ba4f7717b362ca38878f45e57bd643","creator":"dernst"}],"month":"07","volume":267,"department":[{"_id":"MaMo"}],"publication":"Proceedings of the 42nd International Conference on Machine Learning"},{"publication_identifier":{"eissn":["2640-3498"]},"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"has_accepted_license":"1","quality_controlled":"1","OA_type":"gold","conference":{"start_date":"2025-07-13","end_date":"2025-07-19","name":"ICML: International Conference on Machine Learning","location":"Vancouver, Canada"},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","file_date_updated":"2026-02-19T08:15:48Z","citation":{"apa":"Gozeten, H. A., Ildiz, M. E., Zhang, X., Soltanolkotabi, M., Mondelli, M., &#38; Oymak, S. (2025). Test-time training provably improves transformers as in-context learners. In <i>Proceedings of the 42nd International Conference on Machine Learning</i> (Vol. 267, pp. 20266–20295). Vancouver, Canada: ML Research Press.","mla":"Gozeten, Halil Alperen, et al. “Test-Time Training Provably Improves Transformers as in-Context Learners.” <i>Proceedings of the 42nd International Conference on Machine Learning</i>, vol. 267, ML Research Press, 2025, pp. 20266–95.","ista":"Gozeten HA, Ildiz ME, Zhang X, Soltanolkotabi M, Mondelli M, Oymak S. 2025. Test-time training provably improves transformers as in-context learners. Proceedings of the 42nd International Conference on Machine Learning. 
ICML: International Conference on Machine Learning, PMLR, vol. 267, 20266–20295.","chicago":"Gozeten, Halil Alperen, Muhammed Emrullah Ildiz, Xuechen Zhang, Mahdi Soltanolkotabi, Marco Mondelli, and Samet Oymak. “Test-Time Training Provably Improves Transformers as in-Context Learners.” In <i>Proceedings of the 42nd International Conference on Machine Learning</i>, 267:20266–95. ML Research Press, 2025.","ieee":"H. A. Gozeten, M. E. Ildiz, X. Zhang, M. Soltanolkotabi, M. Mondelli, and S. Oymak, “Test-time training provably improves transformers as in-context learners,” in <i>Proceedings of the 42nd International Conference on Machine Learning</i>, Vancouver, Canada, 2025, vol. 267, pp. 20266–20295.","short":"H.A. Gozeten, M.E. Ildiz, X. Zhang, M. Soltanolkotabi, M. Mondelli, S. Oymak, in:, Proceedings of the 42nd International Conference on Machine Learning, ML Research Press, 2025, pp. 20266–20295.","ama":"Gozeten HA, Ildiz ME, Zhang X, Soltanolkotabi M, Mondelli M, Oymak S. Test-time training provably improves transformers as in-context learners. In: <i>Proceedings of the 42nd International Conference on Machine Learning</i>. Vol 267. ML Research Press; 2025:20266-20295."},"department":[{"_id":"MaMo"}],"publication":"Proceedings of the 42nd International Conference on Machine Learning","volume":267,"month":"11","_id":"21325","abstract":[{"text":"Test-time training (TTT) methods explicitly update the weights of a model to adapt to the specific test instance, and they have found success in a variety of settings, including most recently language modeling and reasoning. To demystify this success, we investigate a gradient-based TTT algorithm for in-context learning, where we train a transformer model on the in-context demonstrations provided in the test prompt. Specifically, we provide a comprehensive theoretical characterization of linear transformers when the update rule is a single gradient step. 
Our theory (i) delineates the role of alignment between pretraining distribution and target task, (ii) demystifies how TTT can alleviate distribution shift, and (iii) quantifies the sample complexity of TTT including how it can significantly reduce the eventual sample size required for in-context learning. As our empirical contribution, we study the benefits of TTT for TabPFN, a tabular foundation model. In line with our theory, we demonstrate that TTT significantly reduces the required sample size for tabular classification (3 to 5 times fewer) unlocking substantial inference efficiency with a negligible training cost.","lang":"eng"}],"intvolume":"       267","file":[{"file_name":"2025_ICML_Gozeten.pdf","relation":"main_file","access_level":"open_access","date_updated":"2026-02-19T08:15:48Z","file_id":"21336","content_type":"application/pdf","date_created":"2026-02-19T08:15:48Z","success":1,"file_size":471176,"creator":"dernst","checksum":"f774f8619a0d72f3975d9cb23942a1e9"}],"oa":1,"external_id":{"pmid":["41321376"]},"date_updated":"2026-02-19T08:18:24Z","alternative_title":["PMLR"],"project":[{"_id":"911e6d1f-16d5-11f0-9cad-c5c68c6a1cdf","grant_number":"101161364","name":"Inference in High Dimensions: Light-speed Algorithms and Information Limits"}],"status":"public","date_created":"2026-02-18T12:00:44Z","acknowledgement":"H.A.G., M.E.I., X.Z., and S.O. were supported in part by the NSF grants CCF-2046816, CCF-2403075, CCF-2008020, and the Office of Naval Research grant N000142412289.\r\nM. M. is funded by the European Union (ERC, INF2, project number 101161364). Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. M.S. 
is supported by the Packard Fellowship in Science and Engineering, a Sloan Research Fellowship in Mathematics, an NSF-CAREER under award #1846369, DARPA FastNICS program, and NSF-CIF awards #1813877 and #2008443, and NIH DP2LM014564-01. The authors also acknowledge further support from Open Philanthropy, OpenAI, Amazon Research, Google Research, and Microsoft Research.","publisher":"ML Research Press","publication_status":"published","oa_version":"Published Version","article_processing_charge":"No","title":"Test-time training provably improves transformers as in-context learners","pmid":1,"OA_place":"publisher","year":"2025","author":[{"first_name":"Halil Alperen","full_name":"Gozeten, Halil Alperen","last_name":"Gozeten"},{"full_name":"Ildiz, Muhammed Emrullah","first_name":"Muhammed Emrullah","last_name":"Ildiz"},{"full_name":"Zhang, Xuechen","first_name":"Xuechen","last_name":"Zhang"},{"first_name":"Mahdi","full_name":"Soltanolkotabi, Mahdi","last_name":"Soltanolkotabi"},{"first_name":"Marco","full_name":"Mondelli, Marco","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","orcid":"0000-0002-3242-7020"},{"last_name":"Oymak","full_name":"Oymak, Samet","first_name":"Samet"}],"language":[{"iso":"eng"}],"type":"conference","page":"20266-20295","ddc":["000"],"day":"30","date_published":"2025-11-30T00:00:00Z"},{"date_published":"2025-07-30T00:00:00Z","language":[{"iso":"eng"}],"page":"67499-67536","day":"30","ddc":["000"],"type":"conference","arxiv":1,"title":"Neural collapse beyond the unconstrained features model: Landscape, dynamics, and generalization in the mean-field regime","article_processing_charge":"No","oa_version":"Published Version","author":[{"first_name":"Diyuan","full_name":"Wu, Diyuan","last_name":"Wu","id":"1a5914c2-896a-11ed-bdf8-fb80621a0635"},{"last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, 
Marco","first_name":"Marco","orcid":"0000-0002-3242-7020"}],"year":"2025","OA_place":"publisher","acknowledgement":"This research was funded in whole or in part by the Austrian Science Fund (FWF) 10.55776/COE12. For the purpose of open access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. The authors would like to thank Peter Súkeník for general helpful discussions and for pointing out that all the stationary points are approximately proportional in the case without entropic regularization.","date_created":"2026-02-18T12:02:45Z","status":"public","publication_status":"published","publisher":"ML Research Press","corr_author":"1","oa":1,"alternative_title":["PMLR"],"date_updated":"2026-02-19T08:30:42Z","external_id":{"arxiv":["2501.19104"]},"volume":267,"publication":"Proceedings of the 42nd International Conference on Machine Learning","department":[{"_id":"MaMo"}],"file":[{"access_level":"open_access","date_updated":"2026-02-19T08:28:22Z","relation":"main_file","file_name":"2025_ICML_Wu.pdf","checksum":"c5ce8b1c83e33dc3a11122f4910deb67","creator":"dernst","content_type":"application/pdf","date_created":"2026-02-19T08:28:22Z","success":1,"file_size":3994385,"file_id":"21337"}],"abstract":[{"text":"Neural Collapse is a phenomenon where the last-layer representations of a well-trained neural network converge to a highly structured geometry. In this paper, we focus on its first (and most basic) property, known as NC1: the within-class variability vanishes. While prior theoretical studies establish the occurrence of NC1 via the data-agnostic unconstrained features model, our work adopts a data-specific perspective, analyzing NC1 in a three-layer neural network, with the first two layers operating in the mean-field regime and followed by a linear layer. 
In particular, we establish a fundamental connection between NC1 and the loss landscape: we prove that points with small empirical loss and gradient norm (thus, close to being stationary) approximately satisfy NC1, and the closeness to NC1 is controlled by the residual loss and gradient norm. We then show that (i) gradient flow on the mean squared error converges to NC1 solutions with small empirical loss, and (ii) for well-separated data distributions, both NC1 and vanishing test loss are achieved simultaneously. This aligns with the empirical observation that NC1 emerges during training while models attain near-zero test error. Overall, our results demonstrate that NC1 arises from gradient training due to the properties of the loss landscape, and they show the co-occurrence of NC1 and small test error for certain data distributions.","lang":"eng"}],"_id":"21326","intvolume":"       267","month":"07","OA_type":"gold","conference":{"name":"ICML: International Conference on Machine Learning","end_date":"2025-07-19","start_date":"2025-07-13","location":"Vancouver, Canada"},"quality_controlled":"1","citation":{"short":"D. Wu, M. Mondelli, in:, Proceedings of the 42nd International Conference on Machine Learning, ML Research Press, 2025, pp. 67499–67536.","ama":"Wu D, Mondelli M. Neural collapse beyond the unconstrained features model: Landscape, dynamics, and generalization in the mean-field regime. In: <i>Proceedings of the 42nd International Conference on Machine Learning</i>. Vol 267. ML Research Press; 2025:67499-67536.","chicago":"Wu, Diyuan, and Marco Mondelli. “Neural Collapse beyond the Unconstrained Features Model: Landscape, Dynamics, and Generalization in the Mean-Field Regime.” In <i>Proceedings of the 42nd International Conference on Machine Learning</i>, 267:67499–536. ML Research Press, 2025.","apa":"Wu, D., &#38; Mondelli, M. (2025). 
Neural collapse beyond the unconstrained features model: Landscape, dynamics, and generalization in the mean-field regime. In <i>Proceedings of the 42nd International Conference on Machine Learning</i> (Vol. 267, pp. 67499–67536). Vancouver, Canada: ML Research Press.","mla":"Wu, Diyuan, and Marco Mondelli. “Neural Collapse beyond the Unconstrained Features Model: Landscape, Dynamics, and Generalization in the Mean-Field Regime.” <i>Proceedings of the 42nd International Conference on Machine Learning</i>, vol. 267, ML Research Press, 2025, pp. 67499–536.","ista":"Wu D, Mondelli M. 2025. Neural collapse beyond the unconstrained features model: Landscape, dynamics, and generalization in the mean-field regime. Proceedings of the 42nd International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 267, 67499–67536.","ieee":"D. Wu and M. Mondelli, “Neural collapse beyond the unconstrained features model: Landscape, dynamics, and generalization in the mean-field regime,” in <i>Proceedings of the 42nd International Conference on Machine Learning</i>, Vancouver, Canada, 2025, vol. 267, pp. 
67499–67536."},"file_date_updated":"2026-02-19T08:28:22Z","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","publication_identifier":{"eissn":["2640-3498"]},"has_accepted_license":"1","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"}},{"date_published":"2025-07-01T00:00:00Z","type":"conference","day":"01","page":"3354-3404","ddc":["000"],"language":[{"iso":"eng"}],"scopus_import":"1","year":"2025","author":[{"last_name":"Kovačević","id":"d0258e7b-50b8-11ef-ad56-8b9f537b6b1b","full_name":"Kovačević, Filip","first_name":"Filip"},{"last_name":"Yihan","full_name":"Yihan, Zhang","first_name":"Zhang"},{"last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"}],"OA_place":"publisher","arxiv":1,"oa_version":"Published Version","article_processing_charge":"No","title":"Spectral estimators for multi-index models: Precise asymptotics and optimal weak recovery","publication_status":"published","publisher":"ML Research Press","status":"public","date_created":"2026-02-18T12:12:47Z","acknowledgement":"This work was done when Y. Z. was at the Institute of Science and Technology Austria. Y. Z. and M. M. are funded by the European Union (ERC, INF2, project number 101161364). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. 
The authors would like to acknowledge (in alphabetical order) discussions with Yatin Dandi, Leonardo Defilippis and Bruno Loureiro concerning their parallel work (Defilippis et al., 2025).","project":[{"name":"Inference in High Dimensions: Light-speed Algorithms and Information Limits","grant_number":"101161364","_id":"911e6d1f-16d5-11f0-9cad-c5c68c6a1cdf"}],"external_id":{"arxiv":["2502.01583"]},"alternative_title":["PMLR"],"date_updated":"2026-02-19T09:03:53Z","oa":1,"corr_author":"1","_id":"21328","intvolume":"       291","abstract":[{"text":"Multi-index models provide a popular framework to investigate the learnability of functions with low-dimensional structure and, also due to their connections with neural networks, they have been object of recent intensive study. In this paper, we focus on recovering the subspace spanned by the signals via spectral estimators – a family of methods routinely used in practice, often as a warm-start for iterative algorithms. Our main technical contribution is a precise asymptotic characterization of the performance of spectral methods, when sample size and input dimension grow proportionally and the dimension p of the space to recover is fixed. Specifically, we locate the top-p eigenvalues of the spectral matrix and establish the overlaps between the corresponding eigenvectors (which give the spectral estimators) and a basis of the signal subspace. Our analysis unveils a phase transition phenomenon in which, as the sample complexity grows, eigenvalues escape from the bulk of the spectrum and, when that happens, eigenvectors recover directions of the desired subspace. 
The precise characterization we put forward enables the optimization of the data preprocessing, thus allowing to identify the spectral estimator that requires the minimal sample size for weak recovery.","lang":"eng"}],"file":[{"file_id":"21339","content_type":"application/pdf","date_created":"2026-02-19T09:03:43Z","file_size":844611,"success":1,"creator":"dernst","checksum":"19aa70ab4f57fb9067b6ebb99a5fd6f0","file_name":"2025_LearningTheory_Kovacevic.pdf","relation":"main_file","date_updated":"2026-02-19T09:03:43Z","access_level":"open_access"}],"month":"07","volume":291,"department":[{"_id":"MaMo"}],"publication":"Proceedings of 38th Conference on Learning Theory","citation":{"short":"F. Kovačević, Z. Yihan, M. Mondelli, in:, Proceedings of 38th Conference on Learning Theory, ML Research Press, 2025, pp. 3354–3404.","ama":"Kovačević F, Yihan Z, Mondelli M. Spectral estimators for multi-index models: Precise asymptotics and optimal weak recovery. In: <i>Proceedings of 38th Conference on Learning Theory</i>. Vol 291. ML Research Press; 2025:3354-3404.","chicago":"Kovačević, Filip, Zhang Yihan, and Marco Mondelli. “Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery.” In <i>Proceedings of 38th Conference on Learning Theory</i>, 291:3354–3404. ML Research Press, 2025.","apa":"Kovačević, F., Yihan, Z., &#38; Mondelli, M. (2025). Spectral estimators for multi-index models: Precise asymptotics and optimal weak recovery. In <i>Proceedings of 38th Conference on Learning Theory</i> (Vol. 291, pp. 3354–3404). Lyon, France: ML Research Press.","mla":"Kovačević, Filip, et al. “Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery.” <i>Proceedings of 38th Conference on Learning Theory</i>, vol. 291, ML Research Press, 2025, pp. 3354–404.","ista":"Kovačević F, Yihan Z, Mondelli M. 2025. Spectral estimators for multi-index models: Precise asymptotics and optimal weak recovery. 
Proceedings of 38th Conference on Learning Theory. COLT: Conference on Learning Theory, PMLR, vol. 291, 3354–3404.","ieee":"F. Kovačević, Z. Yihan, and M. Mondelli, “Spectral estimators for multi-index models: Precise asymptotics and optimal weak recovery,” in <i>Proceedings of 38th Conference on Learning Theory</i>, Lyon, France, 2025, vol. 291, pp. 3354–3404."},"file_date_updated":"2026-02-19T09:03:43Z","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","OA_type":"gold","conference":{"location":"Lyon, France","end_date":"2025-07-04","name":"COLT: Conference on Learning Theory","start_date":"2025-06-30"},"quality_controlled":"1","has_accepted_license":"1","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"publication_identifier":{"eissn":["2640-3498"]}},{"month":"01","article_type":"original","_id":"18986","abstract":[{"text":"We consider a prototypical problem of Bayesian inference for a structured spiked model: a low-rank signal is corrupted by additive noise. While both information-theoretic and algorithmic limits are well understood when the noise is a Gaussian Wigner matrix, the more realistic case of structured noise still remains challenging. To capture the structure while maintaining mathematical tractability, a line of work has focused on rotationally invariant noise. However, existing studies either provide suboptimal algorithms or are limited to a special class of noise ensembles. In this paper, using tools from statistical physics (replica method) and random matrix theory (generalized spherical integrals) we establish the characterization of the information-theoretic limits for a noise matrix drawn from a general trace ensemble. Remarkably, our analysis unveils the asymptotic equivalence between the rotationally invariant model and a surrogate Gaussian one. 
Finally, we show how to saturate the predicted statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer (TAP) equations.","lang":"eng"}],"intvolume":"         7","file":[{"file_name":"2025_PhysReviewResearch_Barbier.pdf","date_updated":"2025-02-03T08:27:59Z","access_level":"open_access","relation":"main_file","content_type":"application/pdf","file_size":702543,"success":1,"date_created":"2025-02-03T08:27:59Z","file_id":"18988","checksum":"52c5f72d80ffc928542469114fcdb62b","creator":"dernst"}],"department":[{"_id":"MaMo"}],"publication":"Physical Review Research","volume":7,"DOAJ_listed":"1","external_id":{"arxiv":["2405.20993"]},"date_updated":"2026-05-06T12:57:36Z","related_material":{"link":[{"relation":"software","url":"https://github.com/xu-yz19/spiked-matrix-models-with-structured-noise"}]},"oa":1,"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"doi":"10.1103/PhysRevResearch.7.013081","has_accepted_license":"1","publication_identifier":{"issn":["2643-1564"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","file_date_updated":"2025-02-03T08:27:59Z","citation":{"chicago":"Barbier, Jean, Francesco Camilli, Yizhou Xu, and Marco Mondelli. “Information Limits and Thouless-Anderson-Palmer Equations for Spiked Matrix Models with Structured Noise.” <i>Physical Review Research</i>. American Physical Society, 2025. <a href=\"https://doi.org/10.1103/PhysRevResearch.7.013081\">https://doi.org/10.1103/PhysRevResearch.7.013081</a>.","apa":"Barbier, J., Camilli, F., Xu, Y., &#38; Mondelli, M. (2025). Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise. <i>Physical Review Research</i>. American Physical Society. 
<a href=\"https://doi.org/10.1103/PhysRevResearch.7.013081\">https://doi.org/10.1103/PhysRevResearch.7.013081</a>","mla":"Barbier, Jean, et al. “Information Limits and Thouless-Anderson-Palmer Equations for Spiked Matrix Models with Structured Noise.” <i>Physical Review Research</i>, vol. 7, 013081, American Physical Society, 2025, doi:<a href=\"https://doi.org/10.1103/PhysRevResearch.7.013081\">10.1103/PhysRevResearch.7.013081</a>.","ista":"Barbier J, Camilli F, Xu Y, Mondelli M. 2025. Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise. Physical Review Research. 7, 013081.","ieee":"J. Barbier, F. Camilli, Y. Xu, and M. Mondelli, “Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise,” <i>Physical Review Research</i>, vol. 7. American Physical Society, 2025.","short":"J. Barbier, F. Camilli, Y. Xu, M. Mondelli, Physical Review Research 7 (2025).","ama":"Barbier J, Camilli F, Xu Y, Mondelli M. Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise. <i>Physical Review Research</i>. 2025;7. doi:<a href=\"https://doi.org/10.1103/PhysRevResearch.7.013081\">10.1103/PhysRevResearch.7.013081</a>"},"article_number":"013081","quality_controlled":"1","OA_type":"gold","type":"journal_article","ddc":["530"],"day":"22","scopus_import":"1","language":[{"iso":"eng"}],"date_published":"2025-01-22T00:00:00Z","APC_amount":"3272,21 EUR","publisher":"American Physical Society","publication_status":"published","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"date_created":"2025-02-02T23:01:54Z","status":"public","acknowledgement":"J.B., F.C., and Y.X. were funded by the European Union (ERC, CHORAL, Project No. 101039794). 
Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. M.M. was supported by the 2019 Lopez-Loreta Prize. J.B. acknowledges discussions with TianQi Hou at the initial stage of the project, as well as with Antoine Bodin.","OA_place":"publisher","year":"2025","author":[{"last_name":"Barbier","full_name":"Barbier, Jean","first_name":"Jean"},{"last_name":"Camilli","first_name":"Francesco","full_name":"Camilli, Francesco"},{"last_name":"Xu","full_name":"Xu, Yizhou","first_name":"Yizhou"},{"last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, Marco","first_name":"Marco","orcid":"0000-0002-3242-7020"}],"oa_version":"Published Version","article_processing_charge":"Yes","title":"Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise","arxiv":1},{"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","short":"CC BY (4.0)","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)"},"has_accepted_license":"1","doi":"10.1073/pnas.2423072122","publication_identifier":{"issn":["0027-8424"],"eissn":["1091-6490"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","article_number":"e2423072122","citation":{"ista":"Bombari S, Mondelli M. 2025. Privacy for free in the overparameterized regime. Proceedings of the National Academy of Sciences. 122(15), e2423072122.","apa":"Bombari, S., &#38; Mondelli, M. (2025). Privacy for free in the overparameterized regime. <i>Proceedings of the National Academy of Sciences</i>. National Academy of Sciences. <a href=\"https://doi.org/10.1073/pnas.2423072122\">https://doi.org/10.1073/pnas.2423072122</a>","mla":"Bombari, Simone, and Marco Mondelli. 
“Privacy for Free in the Overparameterized Regime.” <i>Proceedings of the National Academy of Sciences</i>, vol. 122, no. 15, e2423072122, National Academy of Sciences, 2025, doi:<a href=\"https://doi.org/10.1073/pnas.2423072122\">10.1073/pnas.2423072122</a>.","chicago":"Bombari, Simone, and Marco Mondelli. “Privacy for Free in the Overparameterized Regime.” <i>Proceedings of the National Academy of Sciences</i>. National Academy of Sciences, 2025. <a href=\"https://doi.org/10.1073/pnas.2423072122\">https://doi.org/10.1073/pnas.2423072122</a>.","ieee":"S. Bombari and M. Mondelli, “Privacy for free in the overparameterized regime,” <i>Proceedings of the National Academy of Sciences</i>, vol. 122, no. 15. National Academy of Sciences, 2025.","short":"S. Bombari, M. Mondelli, Proceedings of the National Academy of Sciences 122 (2025).","ama":"Bombari S, Mondelli M. Privacy for free in the overparameterized regime. <i>Proceedings of the National Academy of Sciences</i>. 2025;122(15). doi:<a href=\"https://doi.org/10.1073/pnas.2423072122\">10.1073/pnas.2423072122</a>"},"file_date_updated":"2025-05-05T07:27:54Z","quality_controlled":"1","OA_type":"hybrid","article_type":"original","month":"04","file":[{"file_name":"2025_PNAS_Bombari.pdf","date_updated":"2025-05-05T07:27:54Z","access_level":"open_access","relation":"main_file","success":1,"date_created":"2025-05-05T07:27:54Z","file_size":2328320,"content_type":"application/pdf","file_id":"19648","checksum":"1ac6f78e368d35a0cafb4d2d9bd63443","creator":"dernst"}],"_id":"19627","intvolume":"       122","abstract":[{"lang":"eng","text":"Differentially private gradient descent (DP-GD) is a popular algorithm to train deep learning models with provable guarantees on the privacy of the training data. 
In the last decade, the problem of understanding its performance cost with respect to standard GD has received remarkable attention from the research community, which formally derived upper bounds on the excess population risk RP in different learning settings. However, existing bounds typically degrade with over-parameterization, i.e., as the number of parameters p gets larger than the number of training samples n -- a regime which is ubiquitous in current deep-learning practice. As a result, the lack of theoretical insights leaves practitioners without clear guidance, leading some to reduce the effective number of trainable parameters to improve performance, while others use larger models to achieve better results through scale. In this work, we show that in the popular random features model with quadratic loss, for any sufficiently large p, privacy can be obtained for free, i.e., |RP|=o(1), not only when the privacy parameter ε has constant order, but also in the strongly private setting ε=o(1). This challenges the common wisdom that over-parameterization inherently hinders performance in private learning."}],"publication":"Proceedings of the National Academy of Sciences","department":[{"_id":"MaMo"}],"volume":122,"date_updated":"2026-05-20T08:23:19Z","external_id":{"arxiv":["2410.14787"],"pmid":["40215275"],"isi":["001471214000001"]},"corr_author":"1","oa":1,"publisher":"National Academy of Sciences","publication_status":"published","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"},{"_id":"92099302-16d5-11f0-9cad-f9a785f54fbd","name":"Trustworthy Deep Learning Theory: Private Over-Parameterized Models and Robust LLMs"}],"acknowledgement":"This research was funded in whole, or in part, by the Austrian Science Fund (FWF) Grant number COE 12. 
For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. The authors were also supported by the 2019 Lopez-Loreta prize, and Simone Bombari was supported by a Google PhD fellowship. We thank Diyuan Wu, Edwige Cyffers, Francesco Pedrotti, Inbar Seroussi, Nikita P. Kalinin, Pietro Pelliconi, Roodabeh Safavi, Yizhe Zhu, and Zhichao Wang for helpful discussions.","status":"public","date_created":"2025-04-27T22:02:13Z","OA_place":"publisher","author":[{"first_name":"Simone","full_name":"Bombari, Simone","last_name":"Bombari","id":"ca726dda-de17-11ea-bc14-f9da834f63aa"},{"last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, Marco","first_name":"Marco","orcid":"0000-0002-3242-7020"}],"year":"2025","article_processing_charge":"Yes (in subscription journal)","title":"Privacy for free in the overparameterized regime","oa_version":"Published Version","pmid":1,"arxiv":1,"day":"15","ddc":["000"],"type":"journal_article","scopus_import":"1","isi":1,"language":[{"iso":"eng"}],"APC_amount":"2754,32 EUR","date_published":"2025-04-15T00:00:00Z","issue":"15"},{"isi":1,"language":[{"iso":"eng"}],"scopus_import":"1","page":"1008-1039","day":"01","type":"journal_article","issue":"2","date_published":"2024-02-01T00:00:00Z","acknowledgement":"The work of Yihan Zhang was supported by the European Union’s Horizon 2020 Research and Innovation Programme under Grant 682203-ERC-[Inf-Speed-Tradeoff]. The work of Shashank Vatedka was supported in part by the Core Research Grant from the Science and Engineering Research Board, India, under Grant CRG/2022/004464; and in part by the Department of Science and Technology (DST), India, under Grant DST/INT/RUS/RSF/P-41/2020 (TPN No. 
65025).","date_created":"2023-12-10T23:01:00Z","status":"public","publication_status":"published","publisher":"IEEE","arxiv":1,"article_processing_charge":"No","title":"Multiple packing: Lower bounds via error exponents","oa_version":"Preprint","author":[{"orcid":"0000-0002-6465-6258","full_name":"Zhang, Yihan","first_name":"Yihan","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c"},{"last_name":"Vatedka","first_name":"Shashank","full_name":"Vatedka, Shashank"}],"year":"2024","volume":70,"publication":"IEEE Transactions on Information Theory","department":[{"_id":"MaMo"}],"_id":"14665","intvolume":"        70","abstract":[{"lang":"eng","text":"We derive lower bounds on the maximal rates for multiple packings in high-dimensional Euclidean spaces. For any N > 0 and L ∈ ℤ_{≥2}, a multiple packing is a set C of points in ℝ^n such that any point in ℝ^n lies in the intersection of at most L - 1 balls of radius √(nN) around points in C. This is a natural generalization of the sphere packing problem. We study the multiple packing problem for both bounded point sets whose points have norm at most √(nP) for some constant P > 0, and unbounded point sets whose points are allowed to be anywhere in ℝ^n. Given a well-known connection with coding theory, multiple packings can be viewed as the Euclidean analog of list-decodable codes, which are well-studied over finite fields. We derive the best known lower bounds on the optimal multiple packing density. This is accomplished by establishing an inequality which relates the list-decoding error exponent for additive white Gaussian noise channels, a quantity of average-case nature, to the list-decoding radius, a quantity of worst-case nature. 
We also derive novel bounds on the list-decoding error exponent for infinite constellations and closed-form expressions for the list-decoding error exponents for the power-constrained AWGN channel, which may be of independent interest beyond multiple packing."}],"article_type":"original","month":"02","oa":1,"corr_author":"1","date_updated":"2025-09-04T11:32:49Z","external_id":{"isi":["001166812100008"],"arxiv":["2211.04408"]},"publication_identifier":{"eissn":["1557-9654"],"issn":["0018-9448"]},"doi":"10.1109/TIT.2023.3334032","quality_controlled":"1","citation":{"short":"Y. Zhang, S. Vatedka, IEEE Transactions on Information Theory 70 (2024) 1008–1039.","ama":"Zhang Y, Vatedka S. Multiple packing: Lower bounds via error exponents. <i>IEEE Transactions on Information Theory</i>. 2024;70(2):1008-1039. doi:<a href=\"https://doi.org/10.1109/TIT.2023.3334032\">10.1109/TIT.2023.3334032</a>","chicago":"Zhang, Yihan, and Shashank Vatedka. “Multiple Packing: Lower Bounds via Error Exponents.” <i>IEEE Transactions on Information Theory</i>. IEEE, 2024. <a href=\"https://doi.org/10.1109/TIT.2023.3334032\">https://doi.org/10.1109/TIT.2023.3334032</a>.","mla":"Zhang, Yihan, and Shashank Vatedka. “Multiple Packing: Lower Bounds via Error Exponents.” <i>IEEE Transactions on Information Theory</i>, vol. 70, no. 2, IEEE, 2024, pp. 1008–39, doi:<a href=\"https://doi.org/10.1109/TIT.2023.3334032\">10.1109/TIT.2023.3334032</a>.","apa":"Zhang, Y., &#38; Vatedka, S. (2024). Multiple packing: Lower bounds via error exponents. <i>IEEE Transactions on Information Theory</i>. IEEE. <a href=\"https://doi.org/10.1109/TIT.2023.3334032\">https://doi.org/10.1109/TIT.2023.3334032</a>","ista":"Zhang Y, Vatedka S. 2024. Multiple packing: Lower bounds via error exponents. IEEE Transactions on Information Theory. 70(2), 1008–1039.","ieee":"Y. Zhang and S. Vatedka, “Multiple packing: Lower bounds via error exponents,” <i>IEEE Transactions on Information Theory</i>, vol. 70, no. 2. IEEE, pp. 
1008–1039, 2024."},"main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2211.04408"}],"user_id":"317138e5-6ab7-11ef-aa6d-ffef3953e345"},{"oa":1,"corr_author":"1","date_updated":"2025-09-08T09:18:00Z","external_id":{"arxiv":["2403.10656"],"isi":["001304426903055"]},"publication":"Proceedings of the 2024 IEEE International Symposium on Information Theory","department":[{"_id":"MaMo"}],"_id":"17893","abstract":[{"text":"Strong data processing inequalities (SDPI) are an important object of study in Information Theory and have been well studied for f-divergences. Universal upper and lower bounds have been provided along with several applications, connecting them to impossibility (converse) results, concentration of measure, hypercontractivity, and so on. In this paper, we study Rényi divergence and the corresponding SDPI constant whose behavior seems to deviate from that of ordinary f-divergences. In particular, one can find examples showing that the universal upper bound relating its SDPI constant to the one of Total Variation does not hold in general. In this work, we prove, however, that the universal lower bound involving the SDPI constant of the Chi-square divergence does indeed hold. Furthermore, we also provide a characterization of the distribution that achieves the supremum when α is equal to 2 and consequently compute the SDPI constant for Rényi divergence of the general binary channel.","lang":"eng"}],"month":"08","conference":{"location":"Athens, Greece","start_date":"2024-07-07","name":"ISIT: International Symposium on Information Theory","end_date":"2024-07-12"},"quality_controlled":"1","citation":{"ieee":"L. Jin, A. R. Esposito, and M. Gastpar, “Properties of the strong data processing constant for Rényi divergence,” in <i>Proceedings of the 2024 IEEE International Symposium on Information Theory</i>, Athens, Greece, 2024, pp. 3178–3183.","chicago":"Jin, Lifu, Amedeo Roberto Esposito, and Michael Gastpar. 
“Properties of the Strong Data Processing Constant for Rényi Divergence.” In <i>Proceedings of the 2024 IEEE International Symposium on Information Theory</i>, 3178–83. Institute of Electrical and Electronics Engineers, 2024. <a href=\"https://doi.org/10.1109/ISIT57864.2024.10619367\">https://doi.org/10.1109/ISIT57864.2024.10619367</a>.","mla":"Jin, Lifu, et al. “Properties of the Strong Data Processing Constant for Rényi Divergence.” <i>Proceedings of the 2024 IEEE International Symposium on Information Theory</i>, Institute of Electrical and Electronics Engineers, 2024, pp. 3178–83, doi:<a href=\"https://doi.org/10.1109/ISIT57864.2024.10619367\">10.1109/ISIT57864.2024.10619367</a>.","apa":"Jin, L., Esposito, A. R., &#38; Gastpar, M. (2024). Properties of the strong data processing constant for Rényi divergence. In <i>Proceedings of the 2024 IEEE International Symposium on Information Theory</i> (pp. 3178–3183). Athens, Greece: Institute of Electrical and Electronics Engineers. <a href=\"https://doi.org/10.1109/ISIT57864.2024.10619367\">https://doi.org/10.1109/ISIT57864.2024.10619367</a>","ista":"Jin L, Esposito AR, Gastpar M. 2024. Properties of the strong data processing constant for Rényi divergence. Proceedings of the 2024 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory, 3178–3183.","ama":"Jin L, Esposito AR, Gastpar M. Properties of the strong data processing constant for Rényi divergence. In: <i>Proceedings of the 2024 IEEE International Symposium on Information Theory</i>. Institute of Electrical and Electronics Engineers; 2024:3178-3183. doi:<a href=\"https://doi.org/10.1109/ISIT57864.2024.10619367\">10.1109/ISIT57864.2024.10619367</a>","short":"L. Jin, A.R. Esposito, M. Gastpar, in:, Proceedings of the 2024 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2024, pp. 
3178–3183."},"main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2403.10656","open_access":"1"}],"user_id":"317138e5-6ab7-11ef-aa6d-ffef3953e345","publication_identifier":{"isbn":["9798350382846"],"issn":["2157-8095"]},"doi":"10.1109/ISIT57864.2024.10619367","date_published":"2024-08-19T00:00:00Z","isi":1,"language":[{"iso":"eng"}],"scopus_import":"1","day":"19","page":"3178-3183","type":"conference","arxiv":1,"article_processing_charge":"No","title":"Properties of the strong data processing constant for Rényi divergence","oa_version":"Preprint","author":[{"last_name":"Jin","first_name":"Lifu","full_name":"Jin, Lifu"},{"first_name":"Amedeo Roberto","full_name":"Esposito, Amedeo Roberto","last_name":"Esposito","id":"9583e921-e1ad-11ec-9862-cef099626dc9"},{"last_name":"Gastpar","first_name":"Michael","full_name":"Gastpar, Michael"}],"year":"2024","acknowledgement":"The work in this paper was supported in part by the Swiss National Science Foundation under Grant 200364.","date_created":"2024-09-08T22:01:12Z","status":"public","publication_status":"published","publisher":"Institute of Electrical and Electronics Engineers"},{"corr_author":"1","date_updated":"2025-09-08T09:18:44Z","external_id":{"isi":["001304426902023"]},"publication":"Proceedings of the 2024 IEEE International Symposium on Information Theory","department":[{"_id":"MaMo"}],"_id":"17894","abstract":[{"lang":"eng","text":"Sibson's α-mutual information has received renewed attention recently in several contexts: concentration of measure under dependence, statistical learning, hypothesis testing, and estimation theory. In this work, we introduce several variational representations of Sibson's α-mutual information: 1) as a supremum over joint distributions of (a combination of) KL divergences; and 2) as a supremum over functions of opportune expected values. 
Leveraging them, we produce a variety of novel and known results, including a generalization of transportation-cost inequalities and Fano's inequality."}],"month":"08","conference":{"end_date":"2024-07-12","name":"ISIT: International Symposium on Information Theory","start_date":"2024-07-07","location":"Athens, Greece"},"quality_controlled":"1","citation":{"ieee":"A. R. Esposito, M. Gastpar, and I. Issa, “Variational characterizations of Sibson’s α-mutual information,” in <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>, Athens, Greece, 2024, pp. 2110–2115.","chicago":"Esposito, Amedeo Roberto, Michael Gastpar, and Ibrahim Issa. “Variational Characterizations of Sibson’s α-Mutual Information.” In <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>, 2110–15. Institute of Electrical and Electronics Engineers, 2024. <a href=\"https://doi.org/10.1109/ISIT57864.2024.10619378\">https://doi.org/10.1109/ISIT57864.2024.10619378</a>.","apa":"Esposito, A. R., Gastpar, M., &#38; Issa, I. (2024). Variational characterizations of Sibson’s α-mutual information. In <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i> (pp. 2110–2115). Athens, Greece: Institute of Electrical and Electronics Engineers. <a href=\"https://doi.org/10.1109/ISIT57864.2024.10619378\">https://doi.org/10.1109/ISIT57864.2024.10619378</a>","ista":"Esposito AR, Gastpar M, Issa I. 2024. Variational characterizations of Sibson’s α-mutual information. Proceedings of the 2024 IEEE International Symposium on Information Theory . ISIT: International Symposium on Information Theory, 2110–2115.","mla":"Esposito, Amedeo Roberto, et al. “Variational Characterizations of Sibson’s α-Mutual Information.” <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>, Institute of Electrical and Electronics Engineers, 2024, pp. 
2110–15, doi:<a href=\"https://doi.org/10.1109/ISIT57864.2024.10619378\">10.1109/ISIT57864.2024.10619378</a>.","ama":"Esposito AR, Gastpar M, Issa I. Variational characterizations of Sibson’s α-mutual information. In: <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>. Institute of Electrical and Electronics Engineers; 2024:2110-2115. doi:<a href=\"https://doi.org/10.1109/ISIT57864.2024.10619378\">10.1109/ISIT57864.2024.10619378</a>","short":"A.R. Esposito, M. Gastpar, I. Issa, in:, Proceedings of the 2024 IEEE International Symposium on Information Theory , Institute of Electrical and Electronics Engineers, 2024, pp. 2110–2115."},"user_id":"317138e5-6ab7-11ef-aa6d-ffef3953e345","publication_identifier":{"issn":["2157-8095"],"isbn":["9798350382846"]},"doi":"10.1109/ISIT57864.2024.10619378","date_published":"2024-08-19T00:00:00Z","isi":1,"language":[{"iso":"eng"}],"scopus_import":"1","day":"19","page":"2110-2115","type":"conference","article_processing_charge":"No","title":"Variational characterizations of Sibson's α-mutual information","oa_version":"None","author":[{"first_name":"Amedeo Roberto","full_name":"Esposito, Amedeo Roberto","id":"9583e921-e1ad-11ec-9862-cef099626dc9","last_name":"Esposito"},{"full_name":"Gastpar, Michael","first_name":"Michael","last_name":"Gastpar"},{"first_name":"Ibrahim","full_name":"Issa, Ibrahim","last_name":"Issa"}],"year":"2024","acknowledgement":"The work in this paper was supported in part by the Swiss National Science Foundation under Grant 200364.","date_created":"2024-09-08T22:01:12Z","status":"public","publication_status":"published","publisher":"Institute of Electrical and Electronics Engineers"},{"page":"1586-1591","day":"19","_id":"17895","abstract":[{"lang":"eng","text":"We propose a concatenated code construction for a class of discrete-alphabet oblivious arbitrarily varying channels (AVCs) with cost constraints. The code has time and space complexity polynomial in the blocklength n . 
It uses a Reed-Solomon outer code, logarithmic blocklength random inner codes, and stochastic encoding by permuting the codeword before transmission. When the channel satisfies a condition called strong DS-nonsymmetrizability (a modified version of nonsymmetrizability originally due to Dobrushin and Stambler), we show that the code achieves a rate that for a variety of oblivious AVCs (such as classically studied error/erasure channels) match the known capacities."}],"type":"conference","month":"08","isi":1,"language":[{"iso":"eng"}],"publication":"Proceedings of the 2024 IEEE International Symposium on Information Theory","scopus_import":"1","department":[{"_id":"MaMo"}],"date_updated":"2025-09-08T09:19:25Z","external_id":{"isi":["001304426901091"]},"date_published":"2024-08-19T00:00:00Z","publication_status":"published","doi":"10.1109/ISIT57864.2024.10619362","publisher":"Institute of Electrical and Electronics Engineers","acknowledgement":"The work of M. Langberg and A. D. Sarwate was supported in part by the US NSF under awards CCF-1909451 and CCF1909468. B. K. Dey was supported in part by the Bharti Centre for Communication in IIT Bombay.","publication_identifier":{"isbn":["9798350382846"],"issn":["2157-8095"]},"date_created":"2024-09-08T22:01:12Z","status":"public","author":[{"first_name":"B. K.","full_name":"Dey, B. K.","last_name":"Dey"},{"last_name":"Jaggi","full_name":"Jaggi, S.","first_name":"S."},{"full_name":"Langberg, M.","first_name":"M.","last_name":"Langberg"},{"full_name":"Sarwate, A. D.","first_name":"A. D.","last_name":"Sarwate"},{"full_name":"Zhang, Yihan","first_name":"Yihan","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","orcid":"0000-0002-6465-6258"}],"citation":{"chicago":"Dey, B. K., S. Jaggi, M. Langberg, A. D. Sarwate, and Yihan Zhang. 
“Computationally Efficient Codes for Strongly Dobrushin-Stambler Nonsymmetrizable Oblivious AVCs.” In <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>, 1586–91. Institute of Electrical and Electronics Engineers, 2024. <a href=\"https://doi.org/10.1109/ISIT57864.2024.10619362\">https://doi.org/10.1109/ISIT57864.2024.10619362</a>.","apa":"Dey, B. K., Jaggi, S., Langberg, M., Sarwate, A. D., &#38; Zhang, Y. (2024). Computationally efficient codes for strongly Dobrushin-Stambler nonsymmetrizable oblivious AVCs. In <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i> (pp. 1586–1591). Athens, Greece: Institute of Electrical and Electronics Engineers. <a href=\"https://doi.org/10.1109/ISIT57864.2024.10619362\">https://doi.org/10.1109/ISIT57864.2024.10619362</a>","ista":"Dey BK, Jaggi S, Langberg M, Sarwate AD, Zhang Y. 2024. Computationally efficient codes for strongly Dobrushin-Stambler nonsymmetrizable oblivious AVCs. Proceedings of the 2024 IEEE International Symposium on Information Theory . ISIT: International Symposium on Information Theory, 1586–1591.","mla":"Dey, B. K., et al. “Computationally Efficient Codes for Strongly Dobrushin-Stambler Nonsymmetrizable Oblivious AVCs.” <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>, Institute of Electrical and Electronics Engineers, 2024, pp. 1586–91, doi:<a href=\"https://doi.org/10.1109/ISIT57864.2024.10619362\">10.1109/ISIT57864.2024.10619362</a>.","ieee":"B. K. Dey, S. Jaggi, M. Langberg, A. D. Sarwate, and Y. Zhang, “Computationally efficient codes for strongly Dobrushin-Stambler nonsymmetrizable oblivious AVCs,” in <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>, Athens, Greece, 2024, pp. 1586–1591.","short":"B.K. Dey, S. Jaggi, M. Langberg, A.D. Sarwate, Y. 
Zhang, in:, Proceedings of the 2024 IEEE International Symposium on Information Theory , Institute of Electrical and Electronics Engineers, 2024, pp. 1586–1591.","ama":"Dey BK, Jaggi S, Langberg M, Sarwate AD, Zhang Y. Computationally efficient codes for strongly Dobrushin-Stambler nonsymmetrizable oblivious AVCs. In: <i>Proceedings of the 2024 IEEE International Symposium on Information Theory </i>. Institute of Electrical and Electronics Engineers; 2024:1586-1591. doi:<a href=\"https://doi.org/10.1109/ISIT57864.2024.10619362\">10.1109/ISIT57864.2024.10619362</a>"},"year":"2024","user_id":"317138e5-6ab7-11ef-aa6d-ffef3953e345","conference":{"location":"Athens, Greece","start_date":"2024-07-07","end_date":"2024-07-12","name":"ISIT: International Symposium on Information Theory"},"title":"Computationally efficient codes for strongly Dobrushin-Stambler nonsymmetrizable oblivious AVCs","quality_controlled":"1","article_processing_charge":"No","oa_version":"None"},{"article_processing_charge":"No","title":"Codes for adversaries: Between worst-case and average-case jamming","oa_version":"None","author":[{"last_name":"Dey","first_name":"Bikash Kumar","full_name":"Dey, Bikash Kumar"},{"first_name":"Sidharth","full_name":"Jaggi, Sidharth","last_name":"Jaggi"},{"first_name":"Michael","full_name":"Langberg, Michael","last_name":"Langberg"},{"last_name":"Sarwate","first_name":"Anand D.","full_name":"Sarwate, Anand D."},{"orcid":"0000-0002-6465-6258","full_name":"Zhang, Yihan","first_name":"Yihan","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c"}],"year":"2024","status":"public","date_created":"2024-12-15T23:01:50Z","publication_status":"published","publisher":"Now Publishers","issue":"3-4","date_published":"2024-12-03T00:00:00Z","language":[{"iso":"eng"}],"scopus_import":"1","page":"300-588","day":"03","type":"journal_article","OA_type":"closed access","quality_controlled":"1","citation":{"short":"B.K. Dey, S. Jaggi, M. Langberg, A.D. Sarwate, Y. 
Zhang, Foundations and Trends in Communications and Information Theory 21 (2024) 300–588.","ama":"Dey BK, Jaggi S, Langberg M, Sarwate AD, Zhang Y. Codes for adversaries: Between worst-case and average-case jamming. <i>Foundations and Trends in Communications and Information Theory</i>. 2024;21(3-4):300-588. doi:<a href=\"https://doi.org/10.1561/0100000112\">10.1561/0100000112</a>","chicago":"Dey, Bikash Kumar, Sidharth Jaggi, Michael Langberg, Anand D. Sarwate, and Yihan Zhang. “Codes for Adversaries: Between Worst-Case and Average-Case Jamming.” <i>Foundations and Trends in Communications and Information Theory</i>. Now Publishers, 2024. <a href=\"https://doi.org/10.1561/0100000112\">https://doi.org/10.1561/0100000112</a>.","ista":"Dey BK, Jaggi S, Langberg M, Sarwate AD, Zhang Y. 2024. Codes for adversaries: Between worst-case and average-case jamming. Foundations and Trends in Communications and Information Theory. 21(3–4), 300–588.","apa":"Dey, B. K., Jaggi, S., Langberg, M., Sarwate, A. D., &#38; Zhang, Y. (2024). Codes for adversaries: Between worst-case and average-case jamming. <i>Foundations and Trends in Communications and Information Theory</i>. Now Publishers. <a href=\"https://doi.org/10.1561/0100000112\">https://doi.org/10.1561/0100000112</a>","mla":"Dey, Bikash Kumar, et al. “Codes for Adversaries: Between Worst-Case and Average-Case Jamming.” <i>Foundations and Trends in Communications and Information Theory</i>, vol. 21, no. 3–4, Now Publishers, 2024, pp. 300–588, doi:<a href=\"https://doi.org/10.1561/0100000112\">10.1561/0100000112</a>.","ieee":"B. K. Dey, S. Jaggi, M. Langberg, A. D. Sarwate, and Y. Zhang, “Codes for adversaries: Between worst-case and average-case jamming,” <i>Foundations and Trends in Communications and Information Theory</i>, vol. 21, no. 3–4. Now Publishers, pp. 
300–588, 2024."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","publication_identifier":{"issn":["1567-2190"],"eissn":["1567-2328"]},"doi":"10.1561/0100000112","corr_author":"1","date_updated":"2024-12-16T10:38:44Z","volume":21,"publication":"Foundations and Trends in Communications and Information Theory","department":[{"_id":"MaMo"}],"_id":"18652","abstract":[{"lang":"eng","text":"Over the last 70 years, information theory and coding has enabled communication technologies that have had an astounding impact on our lives. This is possible due to the match between encoding/decoding strategies and corresponding channel models. Traditional studies of channels have taken one of two extremes: Shannon-theoretic models are inherently average-case in which channel noise is governed by a memoryless stochastic process, whereas coding-theoretic (referred to as “Hamming”) models take a worst-case, adversarial, view of the noise. However, for several existing and emerging communication systems the Shannon/average-case view may be too optimistic, whereas the Hamming/worst-case view may be too pessimistic. This monograph takes up the challenge of studying adversarial channel models that lie between the Shannon and Hamming extremes."}],"intvolume":"        21","article_type":"original","month":"12"}]
