[{"month":"02","publisher":"IEEE","scopus_import":"1","quality_controlled":"1","oa_version":"None","abstract":[{"text":"We propose a novel approach to concentration for non-independent random variables. The main idea is to “pretend” that the random variables are independent and pay a multiplicative price measuring how far they are from actually being independent. This price is encapsulated in the Hellinger integral between the joint and the product of the marginals, which is then upper bounded leveraging tensorisation properties. Our bounds represent a natural generalisation of concentration inequalities in the presence of dependence: we recover exactly the classical bounds (McDiarmid’s inequality) when the random variables are independent. Furthermore, in a “large deviations” regime, we obtain the same decay in the probability as for the independent case, even when the random variables display non-trivial dependencies. To show this, we consider a number of applications of interest. First, we provide a bound for Markov chains with finite state space. Then, we consider the Simple Symmetric Random Walk, which is a non-contracting Markov chain, and a non-Markovian setting in which the stochastic process depends on its entire past. To conclude, we propose an application to Markov Chain Monte Carlo methods, where our approach leads to an improved lower bound on the minimum burn-in period required to reach a certain accuracy. In all of these settings, we provide a regime of parameters in which our bound fares better than what the state of the art can provide.","lang":"eng"}],"date_created":"2024-03-24T23:01:00Z","related_material":{"record":[{"relation":"earlier_version","id":"14922","status":"public"}]},"doi":"10.1109/TIT.2024.3367767","date_published":"2024-02-20T00:00:00Z","language":[{"iso":"eng"}],"publication":"IEEE Transactions on Information Theory","day":"20","year":"2024","publication_status":"inpress","publication_identifier":{"issn":["0018-9448"],"eissn":["1557-9654"]},"status":"public","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"type":"journal_article","article_type":"original","_id":"15172","department":[{"_id":"MaMo"}],"title":"Concentration without independence via information measures","external_id":{"arxiv":["2303.07245"]},"article_processing_charge":"No","author":[{"last_name":"Esposito","full_name":"Esposito, Amedeo Roberto","first_name":"Amedeo Roberto","id":"9583e921-e1ad-11ec-9862-cef099626dc9"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"chicago":"Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” IEEE Transactions on Information Theory. IEEE, n.d. https://doi.org/10.1109/TIT.2024.3367767.","ista":"Esposito AR, Mondelli M. Concentration without independence via information measures. IEEE Transactions on Information Theory.","mla":"Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” IEEE Transactions on Information Theory, IEEE, doi:10.1109/TIT.2024.3367767.","apa":"Esposito, A. R., & Mondelli, M. (n.d.). Concentration without independence via information measures. IEEE Transactions on Information Theory. IEEE. https://doi.org/10.1109/TIT.2024.3367767","ama":"Esposito AR, Mondelli M. 
Concentration without independence via information measures. IEEE Transactions on Information Theory. doi:10.1109/TIT.2024.3367767","short":"A.R. Esposito, M. Mondelli, IEEE Transactions on Information Theory (n.d.).","ieee":"A. R. Esposito and M. Mondelli, “Concentration without independence via information measures,” IEEE Transactions on Information Theory. IEEE."},"date_updated":"2024-03-25T07:15:51Z"},{"author":[{"full_name":"Resch, Nicolas","last_name":"Resch","first_name":"Nicolas"},{"first_name":"Chen","full_name":"Yuan, Chen","last_name":"Yuan"},{"first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","last_name":"Zhang","orcid":"0000-0002-6465-6258","full_name":"Zhang, Yihan"}],"external_id":{"arxiv":["2210.07754"]},"article_processing_charge":"Yes","title":"Zero-rate thresholds and new capacity bounds for list-decoding and list-recovery","citation":{"chicago":"Resch, Nicolas, Chen Yuan, and Yihan Zhang. “Zero-Rate Thresholds and New Capacity Bounds for List-Decoding and List-Recovery.” In 50th International Colloquium on Automata, Languages, and Programming, Vol. 261. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023. https://doi.org/10.4230/LIPIcs.ICALP.2023.99.","ista":"Resch N, Yuan C, Zhang Y. 2023. Zero-rate thresholds and new capacity bounds for list-decoding and list-recovery. 50th International Colloquium on Automata, Languages, and Programming. ICALP: International Colloquium on Automata, Languages, and Programming, LIPIcs, vol. 261, 99.","mla":"Resch, Nicolas, et al. “Zero-Rate Thresholds and New Capacity Bounds for List-Decoding and List-Recovery.” 50th International Colloquium on Automata, Languages, and Programming, vol. 261, 99, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023, doi:10.4230/LIPIcs.ICALP.2023.99.","ama":"Resch N, Yuan C, Zhang Y. Zero-rate thresholds and new capacity bounds for list-decoding and list-recovery. In: 50th International Colloquium on Automata, Languages, and Programming. Vol 261. Schloss Dagstuhl - Leibniz-Zentrum für Informatik; 2023. doi:10.4230/LIPIcs.ICALP.2023.99","apa":"Resch, N., Yuan, C., & Zhang, Y. (2023). Zero-rate thresholds and new capacity bounds for list-decoding and list-recovery. In 50th International Colloquium on Automata, Languages, and Programming (Vol. 261). Paderborn, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.ICALP.2023.99","ieee":"N. Resch, C. Yuan, and Y. Zhang, “Zero-rate thresholds and new capacity bounds for list-decoding and list-recovery,” in 50th International Colloquium on Automata, Languages, and Programming, Paderborn, Germany, 2023, vol. 261.","short":"N. Resch, C. Yuan, Y. Zhang, in:, 50th International Colloquium on Automata, Languages, and Programming, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","article_number":"99","date_published":"2023-07-01T00:00:00Z","doi":"10.4230/LIPIcs.ICALP.2023.99","date_created":"2023-08-20T22:01:13Z","has_accepted_license":"1","year":"2023","day":"01","publication":"50th International Colloquium on Automata, Languages, and Programming","publisher":"Schloss Dagstuhl - Leibniz-Zentrum für Informatik","quality_controlled":"1","oa":1,"acknowledgement":"Nicolas Resch: Research supported in part by ERC H2020 grant No.74079 (ALGSTRONGCRYPTO). 
Chen Yuan: Research supported in part by the National Key Research and Development Projects under Grant 2022YFA1004900 and Grant 2021YFE0109900, the National Natural Science Foundation of China under Grant 12101403 and Grant 12031011.\r\nAcknowledgements YZ is grateful to Shashank Vatedka, Diyuan Wu and Fengxing Zhu for inspiring discussions.","file_date_updated":"2023-08-21T07:23:18Z","department":[{"_id":"MaMo"}],"date_updated":"2023-08-21T07:26:01Z","ddc":["000"],"type":"conference","conference":{"name":"ICALP: International Colloquium on Automata, Languages, and Programming","end_date":"2023-07-14","location":"Paderborn, Germany","start_date":"2023-07-10"},"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"status":"public","_id":"14083","volume":261,"license":"https://creativecommons.org/licenses/by/4.0/","publication_identifier":{"issn":["1868-8969"],"isbn":["9783959772785"]},"publication_status":"published","file":[{"file_name":"2023_LIPIcsICALP_Resch.pdf","date_created":"2023-08-21T07:23:18Z","file_size":1141497,"date_updated":"2023-08-21T07:23:18Z","creator":"dernst","success":1,"file_id":"14091","checksum":"a449143fec3fbebb092cb8ef3b53c226","content_type":"application/pdf","relation":"main_file","access_level":"open_access"}],"language":[{"iso":"eng"}],"scopus_import":"1","alternative_title":["LIPIcs"],"month":"07","intvolume":" 261","abstract":[{"lang":"eng","text":"In this work we consider the list-decodability and list-recoverability of arbitrary q-ary codes, for all integer values of q ≥ 2. A code is called (p,L)_q-list-decodable if every radius pn Hamming ball contains less than L codewords; (p,𝓁,L)_q-list-recoverability is a generalization where we place radius pn Hamming balls on every point of a combinatorial rectangle with side length 𝓁 and again stipulate that there be less than L codewords.\r\nOur main contribution is to precisely calculate the maximum value of p for which there exist infinite families of positive rate (p,𝓁,L)_q-list-recoverable codes, the quantity we call the zero-rate threshold. Denoting this value by p_*, we in fact show that codes correcting a p_*+ε fraction of errors must have size O_ε(1), i.e., independent of n. Such a result is typically referred to as a \"Plotkin bound.\" To complement this, a standard random code with expurgation construction shows that there exist positive rate codes correcting a p_*-ε fraction of errors. We also follow a classical proof template (typically attributed to Elias and Bassalygo) to derive from the zero-rate threshold other tradeoffs between rate and decoding radius for list-decoding and list-recovery.\r\nTechnically, proving the Plotkin bound boils down to demonstrating the Schur convexity of a certain function defined on the q-simplex as well as the convexity of a univariate function derived from it. 
We remark that an earlier argument claimed similar results for q-ary list-decoding; however, we point out that this earlier proof is flawed."}],"oa_version":"Published Version"},{"related_material":{"link":[{"relation":"software","url":"https://github.com/fcamilli95/Structured-PCA-"}]},"volume":120,"issue":"30","publication_status":"published","publication_identifier":{"eissn":["1091-6490"]},"language":[{"iso":"eng"}],"file":[{"creator":"dernst","date_updated":"2023-07-31T07:30:48Z","file_size":995933,"date_created":"2023-07-31T07:30:48Z","file_name":"2023_PNAS_Barbier.pdf","access_level":"open_access","relation":"main_file","content_type":"application/pdf","file_id":"13323","checksum":"1fc06228afdb3aa80cf8e7766bcf9dc5","success":1}],"scopus_import":"1","intvolume":" 120","month":"07","abstract":[{"lang":"eng","text":"How do statistical dependencies in measurement noise influence high-dimensional inference? To answer this, we study the paradigmatic spiked matrix model of principal components analysis (PCA), where a rank-one matrix is corrupted by additive noise. We go beyond the usual independence assumption on the noise entries, by drawing the noise from a low-order polynomial orthogonal matrix ensemble. The resulting noise correlations make the setting relevant for applications but analytically challenging. We provide a characterization of the Bayes optimal limits of inference in this model. If the spike is rotation invariant, we show that standard spectral PCA is optimal. However, for more general priors, both PCA and the existing approximate message-passing algorithm (AMP) fall short of achieving the information-theoretic limits, which we compute using the replica method from statistical physics. We thus propose an AMP, inspired by the theory of adaptive Thouless–Anderson–Palmer equations, which is empirically observed to saturate the conjectured theoretical limit. This AMP comes with a rigorous state evolution analysis tracking its performance. Although we focus on specific noise distributions, our methodology can be generalized to a wide class of trace matrix ensembles at the cost of more involved expressions. Finally, despite the seemingly strong assumption of rotation-invariant noise, our theory empirically predicts algorithmic performance on real data, pointing at strong universality properties."}],"oa_version":"Published Version","pmid":1,"file_date_updated":"2023-07-31T07:30:48Z","department":[{"_id":"MaMo"}],"date_updated":"2023-10-17T11:44:55Z","ddc":["000"],"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"article_type":"original","type":"journal_article","status":"public","_id":"13315","date_created":"2023-07-30T22:01:02Z","date_published":"2023-07-25T00:00:00Z","doi":"10.1073/pnas.2302028120","year":"2023","has_accepted_license":"1","publication":"Proceedings of the National Academy of Sciences of the United States of America","day":"25","oa":1,"quality_controlled":"1","publisher":"National Academy of Sciences","acknowledgement":"J.B. was funded by the European Union (ERC, CHORAL, project number 101039794). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. M.M. was supported by the 2019 Lopez-Loreta Prize. 
We would like to thank the reviewers for the insightful comments and, in particular, for suggesting the BAMP-inspired denoisers leading to AMP-AP.","article_processing_charge":"Yes (in subscription journal)","external_id":{"pmid":["37463204"]},"author":[{"first_name":"Jean","last_name":"Barbier","full_name":"Barbier, Jean"},{"first_name":"Francesco","full_name":"Camilli, Francesco","last_name":"Camilli"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"},{"last_name":"Sáenz","full_name":"Sáenz, Manuel","first_name":"Manuel"}],"title":"Fundamental limits in structured principal component analysis and how to reach them","citation":{"mla":"Barbier, Jean, et al. “Fundamental Limits in Structured Principal Component Analysis and How to Reach Them.” Proceedings of the National Academy of Sciences of the United States of America, vol. 120, no. 30, e2302028120, National Academy of Sciences, 2023, doi:10.1073/pnas.2302028120.","short":"J. Barbier, F. Camilli, M. Mondelli, M. Sáenz, Proceedings of the National Academy of Sciences of the United States of America 120 (2023).","ieee":"J. Barbier, F. Camilli, M. Mondelli, and M. Sáenz, “Fundamental limits in structured principal component analysis and how to reach them,” Proceedings of the National Academy of Sciences of the United States of America, vol. 120, no. 30. National Academy of Sciences, 2023.","apa":"Barbier, J., Camilli, F., Mondelli, M., & Sáenz, M. (2023). Fundamental limits in structured principal component analysis and how to reach them. Proceedings of the National Academy of Sciences of the United States of America. National Academy of Sciences. https://doi.org/10.1073/pnas.2302028120","ama":"Barbier J, Camilli F, Mondelli M, Sáenz M. Fundamental limits in structured principal component analysis and how to reach them. Proceedings of the National Academy of Sciences of the United States of America. 2023;120(30). doi:10.1073/pnas.2302028120","chicago":"Barbier, Jean, Francesco Camilli, Marco Mondelli, and Manuel Sáenz. “Fundamental Limits in Structured Principal Component Analysis and How to Reach Them.” Proceedings of the National Academy of Sciences of the United States of America. National Academy of Sciences, 2023. https://doi.org/10.1073/pnas.2302028120.","ista":"Barbier J, Camilli F, Mondelli M, Sáenz M. 2023. Fundamental limits in structured principal component analysis and how to reach them. Proceedings of the National Academy of Sciences of the United States of America. 120(30), e2302028120."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"article_number":"e2302028120"},{"_id":"14459","status":"public","conference":{"start_date":"2023-07-23","end_date":"2023-07-29","location":"Honolulu, Hawaii, HI, United States","name":"ICML: International Conference on Machine Learning"},"type":"conference","date_updated":"2023-10-31T08:52:28Z","department":[{"_id":"MaMo"},{"_id":"DaAl"}],"oa_version":"Preprint","abstract":[{"text":"Autoencoders are a popular model in many branches of machine learning and lossy data compression. However, their fundamental limits, the performance of gradient methods and the features learnt during optimization remain poorly understood, even in the two-layer setting. 
In fact, earlier work has considered either linear autoencoders or specific training regimes (leading to vanishing or diverging compression rates). Our paper addresses this gap by focusing on non-linear two-layer autoencoders trained in the challenging proportional regime in which the input dimension scales linearly with the size of the representation. Our results characterize the minimizers of the population risk, and show that such minimizers are achieved by gradient methods; their structure is also unveiled, thus leading to a concise description of the features obtained via training. For the special case of a sign activation function, our analysis establishes the fundamental limits for the lossy compression of Gaussian sources via (shallow) autoencoders. Finally, while the results are proved for Gaussian data, numerical simulations on standard datasets display the universality of the theoretical predictions.","lang":"eng"}],"intvolume":" 202","month":"07","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2212.13468","open_access":"1"}],"alternative_title":["PMLR"],"scopus_import":"1","language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"eissn":["2640-3498"]},"volume":202,"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ista":"Shevchenko A, Kögler K, Hassani H, Mondelli M. 2023. Fundamental limits of two-layer autoencoders, and achieving them with gradient methods. Proceedings of the 40th International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 202, 31151–31209.","chicago":"Shevchenko, Aleksandr, Kevin Kögler, Hamed Hassani, and Marco Mondelli. “Fundamental Limits of Two-Layer Autoencoders, and Achieving Them with Gradient Methods.” In Proceedings of the 40th International Conference on Machine Learning, 202:31151–209. ML Research Press, 2023.","ama":"Shevchenko A, Kögler K, Hassani H, Mondelli M. Fundamental limits of two-layer autoencoders, and achieving them with gradient methods. In: Proceedings of the 40th International Conference on Machine Learning. Vol 202. ML Research Press; 2023:31151-31209.","apa":"Shevchenko, A., Kögler, K., Hassani, H., & Mondelli, M. (2023). Fundamental limits of two-layer autoencoders, and achieving them with gradient methods. In Proceedings of the 40th International Conference on Machine Learning (Vol. 202, pp. 31151–31209). Honolulu, Hawaii, HI, United States: ML Research Press.","ieee":"A. Shevchenko, K. Kögler, H. Hassani, and M. Mondelli, “Fundamental limits of two-layer autoencoders, and achieving them with gradient methods,” in Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, HI, United States, 2023, vol. 202, pp. 31151–31209.","short":"A. Shevchenko, K. Kögler, H. Hassani, M. Mondelli, in:, Proceedings of the 40th International Conference on Machine Learning, ML Research Press, 2023, pp. 31151–31209.","mla":"Shevchenko, Aleksandr, et al. “Fundamental Limits of Two-Layer Autoencoders, and Achieving Them with Gradient Methods.” Proceedings of the 40th International Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 
31151–209."},"title":"Fundamental limits of two-layer autoencoders, and achieving them with gradient methods","article_processing_charge":"No","external_id":{"arxiv":["2212.13468"]},"author":[{"id":"F2B06EC2-C99E-11E9-89F0-752EE6697425","first_name":"Aleksandr","full_name":"Shevchenko, Aleksandr","last_name":"Shevchenko"},{"full_name":"Kögler, Kevin","last_name":"Kögler","first_name":"Kevin","id":"94ec913c-dc85-11ea-9058-e5051ab2428b"},{"full_name":"Hassani, Hamed","last_name":"Hassani","first_name":"Hamed"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"}],"acknowledgement":"Aleksandr Shevchenko, Kevin Kogler and Marco Mondelli are supported by the 2019 Lopez-Loreta Prize. Hamed Hassani acknowledges the support by the NSF CIF award (1910056) and the NSF Institute for CORE Emerging Methods in Data Science (EnCORE).","oa":1,"quality_controlled":"1","publisher":"ML Research Press","publication":"Proceedings of the 40th International Conference on Machine Learning","day":"30","year":"2023","date_created":"2023-10-29T23:01:17Z","date_published":"2023-07-30T00:00:00Z","page":"31151-31209"},{"_id":"12838","type":"journal_article","article_type":"original","status":"public","date_updated":"2023-12-13T11:16:46Z","department":[{"_id":"MaMo"}],"abstract":[{"lang":"eng","text":"We study the problem of high-dimensional multiple packing in Euclidean space. Multiple packing is a natural generalization of sphere packing and is defined as follows. Let N > 0 and L ∈ Z ≽2 . A multiple packing is a set C of points in R n such that any point in R n lies in the intersection of at most L – 1 balls of radius √ nN around points in C . Given a well-known connection with coding theory, multiple packings can be viewed as the Euclidean analog of list-decodable codes, which are well-studied for finite fields. In this paper, we derive the best known lower bounds on the optimal density of list-decodable infinite constellations for constant L under a stronger notion called average-radius multiple packing. To this end, we apply tools from high-dimensional geometry and large deviation theory."}],"oa_version":"Preprint","scopus_import":"1","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2211.04407","open_access":"1"}],"month":"07","intvolume":" 69","publication_identifier":{"issn":["0018-9448"],"eissn":["1557-9654"]},"publication_status":"published","language":[{"iso":"eng"}],"issue":"7","volume":69,"citation":{"mla":"Zhang, Yihan, and Shashank Vatedka. “Multiple Packing: Lower Bounds via Infinite Constellations.” IEEE Transactions on Information Theory, vol. 69, no. 7, IEEE, 2023, pp. 4513–27, doi:10.1109/TIT.2023.3260950.","short":"Y. Zhang, S. Vatedka, IEEE Transactions on Information Theory 69 (2023) 4513–4527.","ieee":"Y. Zhang and S. Vatedka, “Multiple packing: Lower bounds via infinite constellations,” IEEE Transactions on Information Theory, vol. 69, no. 7. IEEE, pp. 4513–4527, 2023.","ama":"Zhang Y, Vatedka S. Multiple packing: Lower bounds via infinite constellations. IEEE Transactions on Information Theory. 2023;69(7):4513-4527. doi:10.1109/TIT.2023.3260950","apa":"Zhang, Y., & Vatedka, S. (2023). Multiple packing: Lower bounds via infinite constellations. IEEE Transactions on Information Theory. IEEE. https://doi.org/10.1109/TIT.2023.3260950","chicago":"Zhang, Yihan, and Shashank Vatedka. “Multiple Packing: Lower Bounds via Infinite Constellations.” IEEE Transactions on Information Theory. IEEE, 2023. 
https://doi.org/10.1109/TIT.2023.3260950.","ista":"Zhang Y, Vatedka S. 2023. Multiple packing: Lower bounds via infinite constellations. IEEE Transactions on Information Theory. 69(7), 4513–4527."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan","orcid":"0000-0002-6465-6258","full_name":"Zhang, Yihan","last_name":"Zhang"},{"last_name":"Vatedka","full_name":"Vatedka, Shashank","first_name":"Shashank"}],"article_processing_charge":"No","external_id":{"arxiv":["2211.04407"],"isi":["001017307000023"]},"title":"Multiple packing: Lower bounds via infinite constellations","acknowledgement":"YZ thanks Jiajin Li for making the observation given by Equation (23). He also would like to thank Nir Ailon and Ely Porat for several helpful conversations throughout this project, and Alexander Barg for insightful comments on the manuscript.\r\nYZ has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 682203-ERC-[Inf-Speed-Tradeoff]. The work of SV was supported by a seed grant from IIT Hyderabad and the start-up research grant from the Science and Engineering Research Board, India (SRG/2020/000910).","publisher":"IEEE","quality_controlled":"1","oa":1,"isi":1,"year":"2023","day":"01","publication":"IEEE Transactions on Information Theory","page":"4513-4527","doi":"10.1109/TIT.2023.3260950","date_published":"2023-07-01T00:00:00Z","date_created":"2023-04-16T22:01:09Z"},{"title":"Approximate message passing for multi-layer estimation in rotationally invariant models","author":[{"last_name":"Xu","full_name":"Xu, Yizhou","first_name":"Yizhou"},{"full_name":"Hou, Tian Qi","last_name":"Hou","first_name":"Tian Qi"},{"first_name":"Shan Suo","last_name":"Liang","full_name":"Liang, Shan Suo"},{"last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"}],"external_id":{"isi":["001031733100053"],"arxiv":["2212.01572"]},"article_processing_charge":"No","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ieee":"Y. Xu, T. Q. Hou, S. S. Liang, and M. Mondelli, “Approximate message passing for multi-layer estimation in rotationally invariant models,” in 2023 IEEE Information Theory Workshop, Saint-Malo, France, 2023, pp. 294–298.","short":"Y. Xu, T.Q. Hou, S.S. Liang, M. Mondelli, in:, 2023 IEEE Information Theory Workshop, Institute of Electrical and Electronics Engineers, 2023, pp. 294–298.","ama":"Xu Y, Hou TQ, Liang SS, Mondelli M. Approximate message passing for multi-layer estimation in rotationally invariant models. In: 2023 IEEE Information Theory Workshop. Institute of Electrical and Electronics Engineers; 2023:294-298. doi:10.1109/ITW55543.2023.10160238","apa":"Xu, Y., Hou, T. Q., Liang, S. S., & Mondelli, M. (2023). Approximate message passing for multi-layer estimation in rotationally invariant models. In 2023 IEEE Information Theory Workshop (pp. 294–298). Saint-Malo, France: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ITW55543.2023.10160238","mla":"Xu, Yizhou, et al. “Approximate Message Passing for Multi-Layer Estimation in Rotationally Invariant Models.” 2023 IEEE Information Theory Workshop, Institute of Electrical and Electronics Engineers, 2023, pp. 294–98, doi:10.1109/ITW55543.2023.10160238.","ista":"Xu Y, Hou TQ, Liang SS, Mondelli M. 2023. Approximate message passing for multi-layer estimation in rotationally invariant models. 
2023 IEEE Information Theory Workshop. ITW: Information Theory Workshop, 294–298.","chicago":"Xu, Yizhou, Tian Qi Hou, Shan Suo Liang, and Marco Mondelli. “Approximate Message Passing for Multi-Layer Estimation in Rotationally Invariant Models.” In 2023 IEEE Information Theory Workshop, 294–98. Institute of Electrical and Electronics Engineers, 2023. https://doi.org/10.1109/ITW55543.2023.10160238."},"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"doi":"10.1109/ITW55543.2023.10160238","date_published":"2023-05-01T00:00:00Z","date_created":"2023-07-30T22:01:04Z","page":"294-298","day":"01","publication":"2023 IEEE Information Theory Workshop","isi":1,"year":"2023","publisher":"Institute of Electrical and Electronics Engineers","quality_controlled":"1","oa":1,"acknowledgement":"Marco Mondelli was partially supported by the 2019 Lopez-Loreta prize.","department":[{"_id":"MaMo"}],"date_updated":"2023-12-13T11:35:46Z","status":"public","type":"conference","conference":{"name":"ITW: Information Theory Workshop","location":"Saint-Malo, France","end_date":"2023-04-28","start_date":"2023-04-23"},"_id":"13321","language":[{"iso":"eng"}],"publication_identifier":{"eissn":["2475-4218"],"isbn":["9798350301496"]},"publication_status":"published","month":"05","scopus_import":"1","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2212.01572","open_access":"1"}],"oa_version":"Preprint","abstract":[{"lang":"eng","text":"We consider the problem of reconstructing the signal and the hidden variables from observations coming from a multi-layer network with rotationally invariant weight matrices. The multi-layer structure models inference from deep generative priors, and the rotational invariance imposed on the weights generalizes the i.i.d. Gaussian assumption by allowing for a complex correlation structure, which is typical in applications. In this work, we present a new class of approximate message passing (AMP) algorithms and give a state evolution recursion which precisely characterizes their performance in the large system limit. In contrast with the existing multi-layer VAMP (ML-VAMP) approach, our proposed AMP – dubbed multilayer rotationally invariant generalized AMP (ML-RI-GAMP) – provides a natural generalization beyond Gaussian designs, in the sense that it recovers the existing Gaussian AMP as a special case. Furthermore, ML-RI-GAMP exhibits a significantly lower complexity than ML-VAMP, as the computationally intensive singular value decomposition is replaced by an estimation of the moments of the design matrices. Finally, our numerical results show that this complexity gain comes at little to no cost in the performance of the algorithm."}]},{"month":"11","quality_controlled":"1","publisher":"IEEE","scopus_import":"1","oa":1,"main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2211.04408"}],"oa_version":"Preprint","abstract":[{"text":"We derive lower bounds on the maximal rates for multiple packings in high-dimensional Euclidean spaces. For any N > 0 and L ∈ Z_{≥2}, a multiple packing is a set C of points in R^n such that any point in R^n lies in the intersection of at most L-1 balls of radius √(nN) around points in C. This is a natural generalization of the sphere packing problem. We study the multiple packing problem for both bounded point sets whose points have norm at most √(nP) for some constant P > 0, and unbounded point sets whose points are allowed to be anywhere in R^n. 
Given a well-known connection with coding theory, multiple packings can be viewed as the Euclidean analog of list-decodable codes, which are well-studied over finite fields. We derive the best known lower bounds on the optimal multiple packing density. This is accomplished by establishing an inequality which relates the list-decoding error exponent for additive white Gaussian noise channels, a quantity of average-case nature, to the list-decoding radius, a quantity of worst-case nature. We also derive novel bounds on the list-decoding error exponent for infinite constellations and closed-form expressions for the list-decoding error exponents for the power-constrained AWGN channel, which may be of independent interest beyond multiple packing.","lang":"eng"}],"doi":"10.1109/TIT.2023.3334032","date_published":"2023-11-16T00:00:00Z","date_created":"2023-12-10T23:01:00Z","day":"16","language":[{"iso":"eng"}],"publication":"IEEE Transactions on Information Theory","publication_identifier":{"eissn":["1557-9654"],"issn":["0018-9448"]},"publication_status":"epub_ahead","year":"2023","status":"public","type":"journal_article","article_type":"original","_id":"14665","title":"Multiple packing: Lower bounds via error exponents","department":[{"_id":"MaMo"}],"author":[{"orcid":"0000-0002-6465-6258","full_name":"Zhang, Yihan","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan"},{"full_name":"Vatedka, Shashank","last_name":"Vatedka","first_name":"Shashank"}],"article_processing_charge":"No","external_id":{"arxiv":["2211.04408"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ista":"Zhang Y, Vatedka S. 2023. Multiple packing: Lower bounds via error exponents. IEEE Transactions on Information Theory.","chicago":"Zhang, Yihan, and Shashank Vatedka. “Multiple Packing: Lower Bounds via Error Exponents.” IEEE Transactions on Information Theory. IEEE, 2023. https://doi.org/10.1109/TIT.2023.3334032.","ieee":"Y. Zhang and S. Vatedka, “Multiple packing: Lower bounds via error exponents,” IEEE Transactions on Information Theory. IEEE, 2023.","short":"Y. Zhang, S. Vatedka, IEEE Transactions on Information Theory (2023).","apa":"Zhang, Y., & Vatedka, S. (2023). Multiple packing: Lower bounds via error exponents. IEEE Transactions on Information Theory. IEEE. https://doi.org/10.1109/TIT.2023.3334032","ama":"Zhang Y, Vatedka S. Multiple packing: Lower bounds via error exponents. IEEE Transactions on Information Theory. 2023. doi:10.1109/TIT.2023.3334032","mla":"Zhang, Yihan, and Shashank Vatedka. “Multiple Packing: Lower Bounds via Error Exponents.” IEEE Transactions on Information Theory, IEEE, 2023, doi:10.1109/TIT.2023.3334032."},"date_updated":"2023-12-18T07:46:45Z"},{"citation":{"ista":"Zhang Y. 2023. Zero-error communication over adversarial MACs. IEEE Transactions on Information Theory. 69(7), 4093–4127.","chicago":"Zhang, Yihan. “Zero-Error Communication over Adversarial MACs.” IEEE Transactions on Information Theory. Institute of Electrical and Electronics Engineers, 2023. https://doi.org/10.1109/tit.2023.3257239.","ama":"Zhang Y. Zero-error communication over adversarial MACs. IEEE Transactions on Information Theory. 2023;69(7):4093-4127. doi:10.1109/tit.2023.3257239","apa":"Zhang, Y. (2023). Zero-error communication over adversarial MACs. IEEE Transactions on Information Theory. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/tit.2023.3257239","short":"Y. Zhang, IEEE Transactions on Information Theory 69 (2023) 4093–4127.","ieee":"Y. 
Zhang, “Zero-error communication over adversarial MACs,” IEEE Transactions on Information Theory, vol. 69, no. 7. Institute of Electrical and Electronics Engineers, pp. 4093–4127, 2023.","mla":"Zhang, Yihan. “Zero-Error Communication over Adversarial MACs.” IEEE Transactions on Information Theory, vol. 69, no. 7, Institute of Electrical and Electronics Engineers, 2023, pp. 4093–127, doi:10.1109/tit.2023.3257239."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"full_name":"Zhang, Yihan","orcid":"0000-0002-6465-6258","last_name":"Zhang","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan"}],"article_processing_charge":"No","external_id":{"arxiv":["2101.12426"]},"title":"Zero-error communication over adversarial MACs","year":"2023","day":"01","publication":"IEEE Transactions on Information Theory","page":"4093-4127","date_published":"2023-07-01T00:00:00Z","doi":"10.1109/tit.2023.3257239","date_created":"2024-01-08T13:04:54Z","acknowledgement":"The author would like to thank Amitalok J. Budkuley and Sidharth Jaggi for many helpful discussions at the early stage of this work. He would also like to thank Nir Ailon, Qi Cao, and Chandra Nair for discussions on a related problem regarding zero-error binary adder MACs.\r\nThe work of Yihan Zhang was supported by the European Union’s Horizon 2020 Research and Innovation Programme under Grant 682203-ERC-[Inf-Speed-Tradeoff]","publisher":"Institute of Electrical and Electronics Engineers","quality_controlled":"1","oa":1,"date_updated":"2024-01-09T08:45:24Z","department":[{"_id":"MaMo"}],"_id":"14751","article_type":"original","type":"journal_article","status":"public","keyword":["Computer Science Applications","Information Systems"],"publication_identifier":{"eissn":["1557-9654"],"issn":["0018-9448"]},"publication_status":"published","language":[{"iso":"eng"}],"volume":69,"issue":"7","abstract":[{"text":"We consider zero-error communication over a two-transmitter deterministic adversarial multiple access channel (MAC) governed by an adversary who has access to the transmissions of both senders (hence called omniscient) and aims to maliciously corrupt the communication. None of the encoders, jammer and decoder is allowed to randomize using private or public randomness. This enforces a combinatorial nature of the problem. Our model covers a large family of channels studied in the literature, including all deterministic discrete memoryless noisy or noiseless MACs. In this work, given an arbitrary two-transmitter deterministic omniscient adversarial MAC, we characterize when the capacity region: 1) has nonempty interior (in particular, is two-dimensional); 2) consists of two line segments (in particular, has empty interior); 3) consists of one line segment (in particular, is one-dimensional); 4) or only contains (0,0) (in particular, is zero-dimensional). This extends a recent result by Wang et al. (2019) from the point-to-point setting to the multiple access setting. Indeed, our converse arguments build upon their generalized Plotkin bound and involve delicate case analysis. One of the technical challenges is to take care of both “joint confusability” and “marginal confusability”. In particular, the treatment of marginal confusability does not follow from the point-to-point results by Wang et al. 
Our achievability results follow from random coding with expurgation.","lang":"eng"}],"oa_version":"Preprint","scopus_import":"1","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2101.12426","open_access":"1"}],"month":"07","intvolume":" 69"},{"oa":1,"publisher":"Institute of Electrical and Electronics Engineers","quality_controlled":"1","acknowledgement":"Nikita Polyanskii’s research was conducted in part during October 2020 - December 2021 with the Technical University of Munich and the Skolkovo Institute of Science and Technology. His work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) under Grant No. WA3907/1-1 and the Russian Foundation for Basic Research (RFBR)\r\nunder Grant No. 20-01-00559.\r\nYihan Zhang is supported by funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 682203-ERC-[Inf-Speed-Tradeoff].","date_created":"2023-07-23T22:01:14Z","doi":"10.1109/TIT.2023.3292219","date_published":"2023-07-04T00:00:00Z","page":"6340-6357","publication":"IEEE Transactions on Information Theory","day":"04","year":"2023","isi":1,"title":"Codes for the Z-channel","external_id":{"isi":["001069680100011"],"arxiv":["2105.01427"]},"article_processing_charge":"No","author":[{"first_name":"Nikita","last_name":"Polyanskii","full_name":"Polyanskii, Nikita"},{"last_name":"Zhang","orcid":"0000-0002-6465-6258","full_name":"Zhang, Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"mla":"Polyanskii, Nikita, and Yihan Zhang. “Codes for the Z-Channel.” IEEE Transactions on Information Theory, vol. 69, no. 10, Institute of Electrical and Electronics Engineers, 2023, pp. 6340–57, doi:10.1109/TIT.2023.3292219.","ama":"Polyanskii N, Zhang Y. Codes for the Z-channel. IEEE Transactions on Information Theory. 2023;69(10):6340-6357. doi:10.1109/TIT.2023.3292219","apa":"Polyanskii, N., & Zhang, Y. (2023). Codes for the Z-channel. IEEE Transactions on Information Theory. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/TIT.2023.3292219","ieee":"N. Polyanskii and Y. Zhang, “Codes for the Z-channel,” IEEE Transactions on Information Theory, vol. 69, no. 10. Institute of Electrical and Electronics Engineers, pp. 6340–6357, 2023.","short":"N. Polyanskii, Y. Zhang, IEEE Transactions on Information Theory 69 (2023) 6340–6357.","chicago":"Polyanskii, Nikita, and Yihan Zhang. “Codes for the Z-Channel.” IEEE Transactions on Information Theory. Institute of Electrical and Electronics Engineers, 2023. https://doi.org/10.1109/TIT.2023.3292219.","ista":"Polyanskii N, Zhang Y. 2023. Codes for the Z-channel. IEEE Transactions on Information Theory. 69(10), 6340–6357."},"intvolume":" 69","month":"07","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2105.01427","open_access":"1"}],"scopus_import":"1","oa_version":"Preprint","abstract":[{"text":"This paper is a collection of results on combinatorial properties of codes for the Z-channel. A Z-channel with error fraction τ takes as input a length-n binary codeword and injects in an adversarial manner up to nτ asymmetric errors, i.e., errors that only zero out bits but do not flip 0’s to 1’s. It is known that the largest (L-1)-list-decodable code for the Z-channel with error fraction τ has exponential size (in n) if τ is less than a critical value that we call the (L-1)-list-decoding Plotkin point and has constant size if τ is larger than the threshold. 
The (L-1)-list-decoding Plotkin point is known to be L^{-1/(L-1)} - L^{-L/(L-1)}, which equals 1/4 for unique-decoding with L-1 = 1. In this paper, we derive various results for the size of the largest codes above and below the list-decoding Plotkin point. In particular, we show that the largest (L-1)-list-decodable code ε-above the Plotkin point, for any given sufficiently small positive constant ε > 0, has size Θ_L(ε^{-3/2}) for any L-1 ≥ 1. We also devise upper and lower bounds on the exponential size of codes below the list-decoding Plotkin point.","lang":"eng"}],"volume":69,"issue":"10","language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"issn":["0018-9448"],"eissn":["1557-9654"]},"status":"public","article_type":"original","type":"journal_article","_id":"13269","department":[{"_id":"MaMo"}],"date_updated":"2024-01-29T11:10:54Z"},{"author":[{"full_name":"Bombari, Simone","last_name":"Bombari","first_name":"Simone","id":"ca726dda-de17-11ea-bc14-f9da834f63aa"},{"first_name":"Shayan","id":"f5a2b424-e339-11ed-8435-ff3b4fe70cf8","full_name":"Kiyani, Shayan","last_name":"Kiyani"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"}],"external_id":{"arxiv":["2302.01629"]},"article_processing_charge":"No","title":"Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels","citation":{"mla":"Bombari, Simone, et al. “Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels.” Proceedings of the 40th International Conference on Machine Learning, vol. 202, ML Research Press, 2023, pp. 2738–76.","apa":"Bombari, S., Kiyani, S., & Mondelli, M. (2023). Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels. In Proceedings of the 40th International Conference on Machine Learning (Vol. 202, pp. 2738–2776). Honolulu, HI, United States: ML Research Press.","ama":"Bombari S, Kiyani S, Mondelli M. Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels. In: Proceedings of the 40th International Conference on Machine Learning. Vol 202. ML Research Press; 2023:2738-2776.","ieee":"S. Bombari, S. Kiyani, and M. Mondelli, “Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels,” in Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, United States, 2023, vol. 202, pp. 2738–2776.","short":"S. Bombari, S. Kiyani, M. Mondelli, in:, Proceedings of the 40th International Conference on Machine Learning, ML Research Press, 2023, pp. 2738–2776.","chicago":"Bombari, Simone, Shayan Kiyani, and Marco Mondelli. “Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels.” In Proceedings of the 40th International Conference on Machine Learning, 202:2738–76. ML Research Press, 2023.","ista":"Bombari S, Kiyani S, Mondelli M. 2023. Beyond the universal law of robustness: Sharper laws for random features and neural tangent kernels. Proceedings of the 40th International Conference on Machine Learning. ICML: International Conference on Machine Learning, PMLR, vol. 
202, 2738–2776."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"page":"2738-2776","date_published":"2023-10-27T00:00:00Z","date_created":"2023-04-23T16:11:03Z","year":"2023","day":"27","publication":"Proceedings of the 40th International Conference on Machine Learning","quality_controlled":"1","publisher":"ML Research Press","oa":1,"acknowledgement":"Simone Bombari and Marco Mondelli were partially supported by the 2019 Lopez-Loreta prize, and\r\nthe authors would like to thank Hamed Hassani for helpful discussions.\r\n","department":[{"_id":"GradSch"},{"_id":"MaMo"}],"date_updated":"2024-02-06T07:42:00Z","type":"conference","conference":{"name":"ICML: International Conference on Machine Learning","location":"Honolulu, HI, United States","end_date":"2023-07-29","start_date":"2023-07-23"},"status":"public","_id":"12859","related_material":{"link":[{"relation":"software","url":"https://github.com/simone-bombari/beyond-universal-robustness"}]},"volume":202,"publication_status":"published","language":[{"iso":"eng"}],"alternative_title":["PMLR"],"main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/2302.01629"}],"month":"10","intvolume":" 202","abstract":[{"lang":"eng","text":"Machine learning models are vulnerable to adversarial perturbations, and a thought-provoking paper by Bubeck and Sellke has analyzed this phenomenon through the lens of over-parameterization: interpolating smoothly the data requires significantly more parameters than simply memorizing it. However, this \"universal\" law provides only a necessary condition for robustness, and it is unable to discriminate between models. In this paper, we address these gaps by focusing on empirical risk minimization in two prototypical settings, namely, random features and the neural tangent kernel (NTK). We prove that, for random features, the model is not robust for any degree of over-parameterization, even when the necessary condition coming from the universal law of robustness is satisfied. In contrast, for even activations, the NTK model meets the universal lower bound, and it is robust as soon as the necessary condition on over-parameterization is fulfilled. This also addresses a conjecture in prior work by Bubeck, Li and Nagaraj. Our analysis decouples the effect of the kernel of the model from an \"interaction matrix\", which describes the interaction with the test data and captures the effect of the activation. Our theoretical results are corroborated by numerical evidence on both synthetic and standard datasets (MNIST, CIFAR-10)."}],"oa_version":"Preprint"},{"date_published":"2023-12-15T00:00:00Z","date_created":"2024-02-02T11:17:41Z","day":"15","language":[{"iso":"eng"}],"publication":"37th Annual Conference on Neural Information Processing Systems","publication_status":"inpress","year":"2023","month":"12","quality_controlled":"1","alternative_title":["NeurIPS"],"main_file_link":[{"url":" https://doi.org/10.48550/arXiv.2305.13165","open_access":"1"}],"oa":1,"oa_version":"Preprint","acknowledgement":"M. M. is partially supported by the 2019 Lopez-Loreta Prize. The authors would like to thank Eugenia Iofinova, Bernd Prach and Simone Bombari for valuable feedback on the manuscript.","abstract":[{"text":"Neural collapse (NC) refers to the surprising structure of the last layer of deep neural networks in the terminal phase of gradient descent training. 
Recently, an increasing amount of experimental evidence has pointed to the propagation of NC to earlier layers of neural networks. However, while the NC in the last layer is well studied theoretically, much less is known about its multi-layered counterpart - deep neural collapse (DNC). In particular, existing work focuses either on linear layers or only on the last two layers at the price of an extra assumption. Our paper fills this gap by generalizing the established analytical framework for NC - the unconstrained features model - to multiple non-linear layers. Our key technical contribution is to show that, in a deep unconstrained features model, the unique global optimum for binary classification exhibits all the properties typical of DNC. This explains the existing experimental evidence of DNC. We also empirically show that (i) by optimizing deep unconstrained features models via gradient descent, the resulting solution agrees well with our theory, and (ii) trained networks recover the unconstrained features suitable for the occurrence of DNC, thus supporting the validity of this modeling principle.","lang":"eng"}],"department":[{"_id":"MaMo"},{"_id":"ChLa"}],"title":"Deep neural collapse is provably optimal for the deep unconstrained features model","author":[{"full_name":"Súkeník, Peter","last_name":"Súkeník","id":"d64d6a8d-eb8e-11eb-b029-96fd216dec3c","first_name":"Peter"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli"},{"id":"40C20FD2-F248-11E8-B48F-1D18A9856A87","first_name":"Christoph","orcid":"0000-0001-8622-7887","full_name":"Lampert, Christoph","last_name":"Lampert"}],"article_processing_charge":"No","external_id":{"arxiv":["2305.13165"]},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","date_updated":"2024-02-06T07:53:26Z","citation":{"ista":"Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal for the deep unconstrained features model. 37th Annual Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, NeurIPS, .","chicago":"Súkeník, Peter, Marco Mondelli, and Christoph Lampert. “Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model.” In 37th Annual Conference on Neural Information Processing Systems, n.d.","ama":"Súkeník P, Mondelli M, Lampert C. Deep neural collapse is provably optimal for the deep unconstrained features model. In: 37th Annual Conference on Neural Information Processing Systems.","apa":"Súkeník, P., Mondelli, M., & Lampert, C. (n.d.). Deep neural collapse is provably optimal for the deep unconstrained features model. In 37th Annual Conference on Neural Information Processing Systems. New Orleans, LA, United States.","short":"P. Súkeník, M. Mondelli, C. Lampert, in:, 37th Annual Conference on Neural Information Processing Systems, n.d.","ieee":"P. Súkeník, M. Mondelli, and C. Lampert, “Deep neural collapse is provably optimal for the deep unconstrained features model,” in 37th Annual Conference on Neural Information Processing Systems, New Orleans, LA, United States.","mla":"Súkeník, Peter, et al. 
“Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained Features Model.” 37th Annual Conference on Neural Information Processing Systems."},"status":"public","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"type":"conference","conference":{"location":"New Orleans, LA, United States","end_date":"2023-12-16","start_date":"2023-12-10","name":"NeurIPS: Neural Information Processing Systems"},"_id":"14921"},{"acknowledgement":"D. Wu and M. Mondelli are partially supported by the 2019 Lopez-Loreta Prize. V. Kungurtsev was supported by the OP VVV project CZ.02.1.01/0.0/0.0/16_019/0000765 \"Research Center for Informatics\".","oa_version":"Published Version","abstract":[{"text":"The stochastic heavy ball method (SHB), also known as stochastic gradient descent (SGD) with Polyak's momentum, is widely used in training neural networks. However, despite the remarkable success of such an algorithm in practice, its theoretical characterization remains limited. In this paper, we focus on neural networks with two and three layers and provide a rigorous understanding of the properties of the solutions found by SHB: (i) stability after dropping out part of the neurons, (ii) connectivity along a low-loss path, and (iii) convergence to the global optimum.\r\nTo achieve this goal, we take a mean-field view and relate the SHB dynamics to a certain partial differential equation in the limit of large network widths. This mean-field perspective has inspired a recent line of work focusing on SGD while, in contrast, our paper considers an algorithm with momentum. More specifically, after proving existence and uniqueness of the limit differential equations, we show convergence to the global optimum and give a quantitative bound between the mean-field limit and the SHB dynamics of a finite-width network. Armed with this last bound, we are able to establish the dropout-stability and connectivity of SHB solutions.","lang":"eng"}],"month":"02","main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2210.06819"}],"oa":1,"alternative_title":["TMLR"],"quality_controlled":"1","publisher":"ML Research Press","publication":"Transactions on Machine Learning Research","language":[{"iso":"eng"}],"day":"28","year":"2023","publication_status":"published","has_accepted_license":"1","date_created":"2024-02-02T11:21:56Z","date_published":"2023-02-28T00:00:00Z","_id":"14924","status":"public","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"type":"conference","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ista":"Wu D, Kungurtsev V, Mondelli M. 2023. Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. Transactions on Machine Learning Research. , TMLR, .","chicago":"Wu, Diyuan, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field Analysis for Heavy Ball Methods: Dropout-Stability, Connectivity, and Global Convergence.” In Transactions on Machine Learning Research. ML Research Press, 2023.","apa":"Wu, D., Kungurtsev, V., & Mondelli, M. (2023). Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. In Transactions on Machine Learning Research. 
ML Research Press.","ama":"Wu D, Kungurtsev V, Mondelli M. Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. In: Transactions on Machine Learning Research. ML Research Press; 2023.","short":"D. Wu, V. Kungurtsev, M. Mondelli, in:, Transactions on Machine Learning Research, ML Research Press, 2023.","ieee":"D. Wu, V. Kungurtsev, and M. Mondelli, “Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence,” in Transactions on Machine Learning Research, 2023.","mla":"Wu, Diyuan, et al. “Mean-Field Analysis for Heavy Ball Methods: Dropout-Stability, Connectivity, and Global Convergence.” Transactions on Machine Learning Research, ML Research Press, 2023."},"date_updated":"2024-02-06T08:53:15Z","department":[{"_id":"MaMo"}],"title":"Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence","external_id":{"arxiv":["2210.06819"]},"article_processing_charge":"No","author":[{"last_name":"Wu","full_name":"Wu, Diyuan","id":"1a5914c2-896a-11ed-bdf8-fb80621a0635","first_name":"Diyuan"},{"last_name":"Kungurtsev","full_name":"Kungurtsev, Vyacheslav","first_name":"Vyacheslav"},{"full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"}]},{"abstract":[{"lang":"eng","text":"We study the performance of a Bayesian statistician who estimates a rank-one signal corrupted by non-symmetric rotationally invariant noise with a generic distribution of singular values. As the signal-to-noise ratio and the noise structure are unknown, a Gaussian setup is incorrectly assumed. We derive the exact analytic expression for the error of the mismatched Bayes estimator and also provide the analysis of an approximate message passing (AMP) algorithm. The first result exploits the asymptotic behavior of spherical integrals for rectangular matrices and of low-rank matrix perturbations; the second one relies on the design and analysis of an auxiliary AMP. The numerical experiments show that there is a performance gap between the AMP and Bayes estimators, which is due to the incorrect estimation of the signal norm."}],"oa_version":"Preprint","quality_controlled":"1","publisher":"IEEE","main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2302.03306"}],"oa":1,"month":"06","publication_status":"inpress","year":"2023","day":"30","language":[{"iso":"eng"}],"publication":"Proceedings of 2023 IEEE International Symposium on Information Theory","date_published":"2023-06-30T00:00:00Z","doi":"10.1109/isit54713.2023.10206671","date_created":"2024-02-02T11:20:39Z","_id":"14923","type":"conference","conference":{"name":"ISIT: IEEE International Symposium on Information Theory","start_date":"2023-06-25","location":"Taipei, Taiwan","end_date":"2023-06-30"},"status":"public","citation":{"ieee":"T. Fu, Y. Liu, J. Barbier, M. Mondelli, S. Liang, and T. Hou, “Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise,” in Proceedings of 2023 IEEE International Symposium on Information Theory, Taipei, Taiwan.","short":"T. Fu, Y. Liu, J. Barbier, M. Mondelli, S. Liang, T. Hou, in:, Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, n.d.","ama":"Fu T, Liu Y, Barbier J, Mondelli M, Liang S, Hou T. Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise. In: Proceedings of 2023 IEEE International Symposium on Information Theory. 
IEEE. doi:10.1109/isit54713.2023.10206671","apa":"Fu, T., Liu, Y., Barbier, J., Mondelli, M., Liang, S., & Hou, T. (n.d.). Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise. In Proceedings of 2023 IEEE International Symposium on Information Theory. Taipei, Taiwan: IEEE. https://doi.org/10.1109/isit54713.2023.10206671","mla":"Fu, Teng, et al. “Mismatched Estimation of Non-Symmetric Rank-One Matrices Corrupted by Structured Noise.” Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, doi:10.1109/isit54713.2023.10206671.","ista":"Fu T, Liu Y, Barbier J, Mondelli M, Liang S, Hou T. Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise. Proceedings of 2023 IEEE International Symposium on Information Theory. ISIT: IEEE International Symposium on Information Theory.","chicago":"Fu, Teng, YuHao Liu, Jean Barbier, Marco Mondelli, ShanSuo Liang, and TianQi Hou. “Mismatched Estimation of Non-Symmetric Rank-One Matrices Corrupted by Structured Noise.” In Proceedings of 2023 IEEE International Symposium on Information Theory. IEEE, n.d. https://doi.org/10.1109/isit54713.2023.10206671."},"date_updated":"2024-02-14T14:34:03Z","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"first_name":"Teng","last_name":"Fu","full_name":"Fu, Teng"},{"full_name":"Liu, YuHao","last_name":"Liu","first_name":"YuHao"},{"first_name":"Jean","last_name":"Barbier","full_name":"Barbier, Jean"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"},{"full_name":"Liang, ShanSuo","last_name":"Liang","first_name":"ShanSuo"},{"first_name":"TianQi","last_name":"Hou","full_name":"Hou, TianQi"}],"external_id":{"arxiv":["2302.03306"]},"article_processing_charge":"No","department":[{"_id":"MaMo"}],"title":"Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise"},{"publisher":"IEEE","quality_controlled":"1","oa":1,"acknowledgement":"The authors are partially supported by the 2019 Lopez-Loreta Prize. They would also like to thank Professor Jan Maas for providing valuable suggestions and comments on an early version of the work.","page":"400-405","date_published":"2023-06-30T00:00:00Z","doi":"10.1109/isit54713.2023.10206899","date_created":"2024-02-02T11:18:40Z","year":"2023","day":"30","publication":"Proceedings of 2023 IEEE International Symposium on Information Theory","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"author":[{"id":"9583e921-e1ad-11ec-9862-cef099626dc9","first_name":"Amedeo Roberto","last_name":"Esposito","full_name":"Esposito, Amedeo Roberto"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"}],"article_processing_charge":"No","external_id":{"arxiv":["2303.07245"]},"title":"Concentration without independence via information measures","citation":{"ama":"Esposito AR, Mondelli M. Concentration without independence via information measures. In: Proceedings of 2023 IEEE International Symposium on Information Theory. IEEE; 2023:400-405. doi:10.1109/isit54713.2023.10206899","apa":"Esposito, A. R., & Mondelli, M. (2023). Concentration without independence via information measures. In Proceedings of 2023 IEEE International Symposium on Information Theory (pp. 400–405). Taipei, Taiwan: IEEE. 
https://doi.org/10.1109/isit54713.2023.10206899","short":"A.R. Esposito, M. Mondelli, in:, Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, 2023, pp. 400–405.","ieee":"A. R. Esposito and M. Mondelli, “Concentration without independence via information measures,” in Proceedings of 2023 IEEE International Symposium on Information Theory, Taipei, Taiwan, 2023, pp. 400–405.","mla":"Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” Proceedings of 2023 IEEE International Symposium on Information Theory, IEEE, 2023, pp. 400–05, doi:10.1109/isit54713.2023.10206899.","ista":"Esposito AR, Mondelli M. 2023. Concentration without independence via information measures. Proceedings of 2023 IEEE International Symposium on Information Theory. ISIT: IEEE International Symposium on Information Theory, 400–405.","chicago":"Esposito, Amedeo Roberto, and Marco Mondelli. “Concentration without Independence via Information Measures.” In Proceedings of 2023 IEEE International Symposium on Information Theory, 400–405. IEEE, 2023. https://doi.org/10.1109/isit54713.2023.10206899."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2303.07245"}],"month":"06","abstract":[{"lang":"eng","text":"We propose a novel approach to concentration for non-independent random variables. The main idea is to ``pretend'' that the random variables are independent and pay a multiplicative price measuring how far they are from actually being independent. This price is encapsulated in the Hellinger integral between the joint and the product of the marginals, which is then upper bounded leveraging tensorisation properties. Our bounds represent a natural generalisation of concentration inequalities in the presence of dependence: we recover exactly the classical bounds (McDiarmid's inequality) when the random variables are independent. Furthermore, in a ``large deviations'' regime, we obtain the same decay in the probability as for the independent case, even when the random variables display non-trivial dependencies. To show this, we consider a number of applications of interest. First, we provide a bound for Markov chains with finite state space. Then, we consider the Simple Symmetric Random Walk, which is a non-contracting Markov chain, and a non-Markovian setting in which the stochastic process depends on its entire past. To conclude, we propose an application to Markov Chain Monte Carlo methods, where our approach leads to an improved lower bound on the minimum burn-in period required to reach a certain accuracy. 
In all of these settings, we provide a regime of parameters in which our bound fares better than what the state of the art can provide."}],"oa_version":"Preprint","related_material":{"record":[{"relation":"later_version","status":"public","id":"15172"}]},"publication_identifier":{"eissn":["2157-8117"],"eisbn":["9781665475549"]},"publication_status":"published","language":[{"iso":"eng"}],"type":"conference","conference":{"name":"ISIT: IEEE International Symposium on Information Theory","start_date":"2023-06-25","end_date":"2023-06-30","location":"Taipei, Taiwan"},"status":"public","_id":"14922","department":[{"_id":"MaMo"}],"date_updated":"2024-03-25T07:15:51Z"},{"_id":"11420","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"article_type":"original","type":"journal_article","status":"public","date_updated":"2022-05-30T08:34:14Z","ddc":["000"],"file_date_updated":"2022-05-30T08:22:55Z","department":[{"_id":"MaMo"},{"_id":"DaAl"}],"abstract":[{"text":"Understanding the properties of neural networks trained via stochastic gradient descent (SGD) is at the heart of the theory of deep learning. In this work, we take a mean-field view, and consider a two-layer ReLU network trained via noisy-SGD for a univariate regularized regression problem. Our main result is that SGD with vanishingly small noise injected in the gradients is biased towards a simple solution: at convergence, the ReLU network implements a piecewise linear map of the inputs, and the number of “knot” points -- i.e., points where the tangent of the ReLU network estimator changes -- between two consecutive training inputs is at most three. In particular, as the number of neurons of the network grows, the SGD dynamics is captured by the solution of a gradient flow and, at convergence, the distribution of the weights approaches the unique minimizer of a related free energy, which has a Gibbs form. Our key technical contribution consists in the analysis of the estimator resulting from this minimizer: we show that its second derivative vanishes everywhere, except at some specific locations which represent the “knot” points. We also provide empirical evidence that knots at locations distinct from the data points might occur, as predicted by our theory.","lang":"eng"}],"oa_version":"Published Version","scopus_import":"1","intvolume":" 23","month":"04","publication_status":"published","publication_identifier":{"eissn":["1533-7928"],"issn":["1532-4435"]},"language":[{"iso":"eng"}],"file":[{"date_created":"2022-05-30T08:22:55Z","file_name":"21-1365.pdf","creator":"cchlebak","date_updated":"2022-05-30T08:22:55Z","file_size":1521701,"file_id":"11422","checksum":"d4ff5d1affb34848b5c5e4002483fc62","success":1,"access_level":"open_access","relation":"main_file","content_type":"application/pdf"}],"related_material":{"link":[{"relation":"other","url":"https://www.jmlr.org/papers/v23/21-1365.html"}]},"issue":"130","volume":23,"project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"citation":{"mla":"Shevchenko, Aleksandr, et al. “Mean-Field Analysis of Piecewise Linear Solutions for Wide ReLU Networks.” Journal of Machine Learning Research, vol. 23, no. 130, Journal of Machine Learning Research, 2022, pp. 1–55.","apa":"Shevchenko, A., Kungurtsev, V., & Mondelli, M. (2022). 
Mean-field analysis of piecewise linear solutions for wide ReLU networks. Journal of Machine Learning Research. Journal of Machine Learning Research.","ama":"Shevchenko A, Kungurtsev V, Mondelli M. Mean-field analysis of piecewise linear solutions for wide ReLU networks. Journal of Machine Learning Research. 2022;23(130):1-55.","short":"A. Shevchenko, V. Kungurtsev, M. Mondelli, Journal of Machine Learning Research 23 (2022) 1–55.","ieee":"A. Shevchenko, V. Kungurtsev, and M. Mondelli, “Mean-field analysis of piecewise linear solutions for wide ReLU networks,” Journal of Machine Learning Research, vol. 23, no. 130. Journal of Machine Learning Research, pp. 1–55, 2022.","chicago":"Shevchenko, Aleksandr, Vyacheslav Kungurtsev, and Marco Mondelli. “Mean-Field Analysis of Piecewise Linear Solutions for Wide ReLU Networks.” Journal of Machine Learning Research. Journal of Machine Learning Research, 2022.","ista":"Shevchenko A, Kungurtsev V, Mondelli M. 2022. Mean-field analysis of piecewise linear solutions for wide ReLU networks. Journal of Machine Learning Research. 23(130), 1–55."},"user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","external_id":{"arxiv":["2111.02278"]},"article_processing_charge":"No","author":[{"last_name":"Shevchenko","full_name":"Shevchenko, Aleksandr","first_name":"Aleksandr","id":"F2B06EC2-C99E-11E9-89F0-752EE6697425"},{"first_name":"Vyacheslav","last_name":"Kungurtsev","full_name":"Kungurtsev, Vyacheslav"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"}],"title":"Mean-field analysis of piecewise linear solutions for wide ReLU networks","acknowledgement":"We would like to thank Mert Pilanci for several exploratory discussions in the early stage\r\nof the project, Jan Maas for clarifications about Jordan et al. (1998), and Max Zimmer for\r\nsuggestive numerical experiments. A. Shevchenko and M. Mondelli are partially supported\r\nby the 2019 Lopez-Loreta Prize. V. Kungurtsev acknowledges support to the OP VVV\r\nproject CZ.02.1.01/0.0/0.0/16 019/0000765 Research Center for Informatics.\r\n","oa":1,"quality_controlled":"1","publisher":"Journal of Machine Learning Research","year":"2022","has_accepted_license":"1","publication":"Journal of Machine Learning Research","day":"01","page":"1-55","date_created":"2022-05-29T22:01:54Z","date_published":"2022-04-01T00:00:00Z"},{"_id":"12011","status":"public","conference":{"location":"Espoo, Finland","end_date":"2022-07-01","start_date":"2022-06-26","name":"ISIT: International Symposium on Information Theory"},"type":"conference","date_updated":"2022-09-05T09:09:15Z","department":[{"_id":"MaMo"}],"oa_version":"Preprint","abstract":[{"lang":"eng","text":"We characterize the capacity for the discrete-time arbitrarily varying channel with discrete inputs, outputs, and states when (a) the encoder and decoder do not share common randomness, (b) the input and state are subject to cost constraints, (c) the transition matrix of the channel is deterministic given the state, and (d) at each time step the adversary can only observe the current and past channel inputs when choosing the state at that time. The achievable strategy involves stochastic encoding together with list decoding and a disambiguation step. The converse uses a two-phase \"babble-and-push\" strategy where the adversary chooses the state randomly in the first phase, list decodes the output, and then chooses state inputs to symmetrize the channel in the second phase.
These results generalize prior work on specific channel models (additive, erasure) to general discrete alphabets and models."}],"intvolume":" 2022","month":"08","main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2205.06708"}],"scopus_import":"1","language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"isbn":["9781665421591"],"issn":["2157-8095"]},"volume":2022,"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"mla":"Zhang, Yihan, et al. “The Capacity of Causal Adversarial Channels.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 2523–28, doi:10.1109/ISIT50566.2022.9834709.","apa":"Zhang, Y., Jaggi, S., Langberg, M., & Sarwate, A. D. (2022). The capacity of causal adversarial channels. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 2523–2528). Espoo, Finland: IEEE. https://doi.org/10.1109/ISIT50566.2022.9834709","ama":"Zhang Y, Jaggi S, Langberg M, Sarwate AD. The capacity of causal adversarial channels. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:2523-2528. doi:10.1109/ISIT50566.2022.9834709","short":"Y. Zhang, S. Jaggi, M. Langberg, A.D. Sarwate, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 2523–2528.","ieee":"Y. Zhang, S. Jaggi, M. Langberg, and A. D. Sarwate, “The capacity of causal adversarial channels,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 2523–2528.","chicago":"Zhang, Yihan, Sidharth Jaggi, Michael Langberg, and Anand D. Sarwate. “The Capacity of Causal Adversarial Channels.” In 2022 IEEE International Symposium on Information Theory, 2022:2523–28. IEEE, 2022. https://doi.org/10.1109/ISIT50566.2022.9834709.","ista":"Zhang Y, Jaggi S, Langberg M, Sarwate AD. 2022. The capacity of causal adversarial channels. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 2523–2528."},"title":"The capacity of causal adversarial channels","article_processing_charge":"No","external_id":{"arxiv":["2205.06708"]},"author":[{"id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan","last_name":"Zhang","full_name":"Zhang, Yihan"},{"last_name":"Jaggi","full_name":"Jaggi, Sidharth","first_name":"Sidharth"},{"full_name":"Langberg, Michael","last_name":"Langberg","first_name":"Michael"},{"last_name":"Sarwate","full_name":"Sarwate, Anand D.","first_name":"Anand D."}],"acknowledgement":"The work of ADS and ML was supported in part by the US National Science Foundation under awards CCF-1909468 and CCF-1909451.","oa":1,"publisher":"IEEE","quality_controlled":"1","publication":"2022 IEEE International Symposium on Information Theory","day":"03","year":"2022","date_created":"2022-09-04T22:02:03Z","doi":"10.1109/ISIT50566.2022.9834709","date_published":"2022-08-03T00:00:00Z","page":"2523-2528"},{"_id":"12017","conference":{"name":"ISIT: International Symposium on Information Theory","start_date":"2022-06-26","end_date":"2022-07-01","location":"Espoo, Finland"},"type":"conference","status":"public","citation":{"ista":"Yadav AK, Alimohammadi M, Zhang Y, Budkuley AJ, Jaggi S. 2022. New results on AVCs with omniscient and myopic adversaries. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 2535–2540.","chicago":"Yadav, Anuj Kumar, Mohammadreza Alimohammadi, Yihan Zhang, Amitalok J. Budkuley, and Sidharth Jaggi.
“New Results on AVCs with Omniscient and Myopic Adversaries.” In 2022 IEEE International Symposium on Information Theory, 2022:2535–40. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/ISIT50566.2022.9834632.","ieee":"A. K. Yadav, M. Alimohammadi, Y. Zhang, A. J. Budkuley, and S. Jaggi, “New results on AVCs with omniscient and myopic adversaries,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 2535–2540.","short":"A.K. Yadav, M. Alimohammadi, Y. Zhang, A.J. Budkuley, S. Jaggi, in:, 2022 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2022, pp. 2535–2540.","ama":"Yadav AK, Alimohammadi M, Zhang Y, Budkuley AJ, Jaggi S. New results on AVCs with omniscient and myopic adversaries. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. Institute of Electrical and Electronics Engineers; 2022:2535-2540. doi:10.1109/ISIT50566.2022.9834632","apa":"Yadav, A. K., Alimohammadi, M., Zhang, Y., Budkuley, A. J., & Jaggi, S. (2022). New results on AVCs with omniscient and myopic adversaries. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 2535–2540). Espoo, Finland: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ISIT50566.2022.9834632","mla":"Yadav, Anuj Kumar, et al. “New Results on AVCs with Omniscient and Myopic Adversaries.” 2022 IEEE International Symposium on Information Theory, vol. 2022, Institute of Electrical and Electronics Engineers, 2022, pp. 2535–40, doi:10.1109/ISIT50566.2022.9834632."},"date_updated":"2023-02-13T09:00:14Z","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","article_processing_charge":"No","author":[{"first_name":"Anuj Kumar","last_name":"Yadav","full_name":"Yadav, Anuj Kumar"},{"first_name":"Mohammadreza","last_name":"Alimohammadi","full_name":"Alimohammadi, Mohammadreza"},{"id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan","full_name":"Zhang, Yihan","last_name":"Zhang"},{"full_name":"Budkuley, Amitalok J.","last_name":"Budkuley","first_name":"Amitalok J."},{"full_name":"Jaggi, Sidharth","last_name":"Jaggi","first_name":"Sidharth"}],"department":[{"_id":"MaMo"}],"title":"New results on AVCs with omniscient and myopic adversaries","abstract":[{"text":"In the classic adversarial communication problem, two parties communicate over a noisy channel in the presence of a malicious jamming adversary. The arbitrarily varying channels (AVCs) offer an elegant framework to study a wide range of interesting adversary models. The optimal throughput or capacity over such AVCs is intimately tied to the underlying adversary model; in some cases, capacity is unknown and the problem is known to be notoriously hard. The omniscient adversary, one which knows the sender’s entire channel transmission a priori, is one such classic model of interest; the capacity under such an adversary remains an exciting open problem. The myopic adversary is a generalization of that model where the adversary’s observation may be corrupted over a noisy discrete memoryless channel. Through the adversary’s myopicity, one can unify a slew of different adversary models, ranging from the omniscient adversary to one that is completely blind to the transmission (the latter is the well known oblivious model where the capacity is fully characterized). In this work, we present new results on the capacity under both the omniscient and myopic adversary models.
We completely characterize the positive capacity threshold over general AVCs with omniscient adversaries. The characterization is in terms of two key combinatorial objects: the set of completely positive distributions and the CP-confusability set. For omniscient AVCs with positive capacity, we present non-trivial lower and upper bounds on the capacity; unlike some of the previous bounds, our bounds hold under fairly general input and jamming constraints. Our lower bound improves upon the generalized Gilbert-Varshamov bound for general AVCs while the upper bound generalizes the well known Elias-Bassalygo bound (known for binary and q-ary alphabets). For the myopic AVCs, we build on prior results known for the so-called sufficiently myopic model, and present new results on the positive rate communication threshold over the so-called insufficiently myopic regime (a completely insufficient myopic adversary specializes to an omniscient adversary). We present interesting examples for the widely studied models of adversarial bit-flip and bit-erasure channels. In fact, for the bit-flip AVC with additive adversarial noise as well as random noise, we completely characterize the omniscient model capacity when the random noise is sufficiently large vis-a-vis the adversary’s budget.","lang":"eng"}],"oa_version":"None","scopus_import":"1","quality_controlled":"1","publisher":"Institute of Electrical and Electronics Engineers","intvolume":" 2022","month":"08","year":"2022","publication_status":"published","publication_identifier":{"isbn":["9781665421591"],"issn":["2157-8095"]},"publication":"2022 IEEE International Symposium on Information Theory","language":[{"iso":"eng"}],"day":"03","page":"2535-2540","date_created":"2022-09-04T22:02:06Z","date_published":"2022-08-03T00:00:00Z","doi":"10.1109/ISIT50566.2022.9834632","volume":2022},{"type":"conference","conference":{"start_date":"2022-06-26","location":"Espoo, Finland","end_date":"2022-07-01","name":"ISIT: International Symposium on Information Theory"},"status":"public","_id":"12013","author":[{"full_name":"Joshi, Pranav","last_name":"Joshi","first_name":"Pranav"},{"first_name":"Amritakshya","last_name":"Purkayastha","full_name":"Purkayastha, Amritakshya"},{"full_name":"Zhang, Yihan","last_name":"Zhang","first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c"},{"first_name":"Amitalok J.","last_name":"Budkuley","full_name":"Budkuley, Amitalok J."},{"first_name":"Sidharth","full_name":"Jaggi, Sidharth","last_name":"Jaggi"}],"article_processing_charge":"No","title":"On the capacity of additive AVCs with feedback","department":[{"_id":"MaMo"}],"date_updated":"2022-09-05T10:23:35Z","citation":{"ama":"Joshi P, Purkayastha A, Zhang Y, Budkuley AJ, Jaggi S. On the capacity of additive AVCs with feedback. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:504-509. doi:10.1109/ISIT50566.2022.9834850","apa":"Joshi, P., Purkayastha, A., Zhang, Y., Budkuley, A. J., & Jaggi, S. (2022). On the capacity of additive AVCs with feedback. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 504–509). Espoo, Finland: IEEE. https://doi.org/10.1109/ISIT50566.2022.9834850","short":"P. Joshi, A. Purkayastha, Y. Zhang, A.J. Budkuley, S. Jaggi, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 504–509.","ieee":"P. Joshi, A. Purkayastha, Y. Zhang, A. J. Budkuley, and S.
Jaggi, “On the capacity of additive AVCs with feedback,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 504–509.","mla":"Joshi, Pranav, et al. “On the Capacity of Additive AVCs with Feedback.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 504–09, doi:10.1109/ISIT50566.2022.9834850.","ista":"Joshi P, Purkayastha A, Zhang Y, Budkuley AJ, Jaggi S. 2022. On the capacity of additive AVCs with feedback. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 504–509.","chicago":"Joshi, Pranav, Amritakshya Purkayastha, Yihan Zhang, Amitalok J. Budkuley, and Sidharth Jaggi. “On the Capacity of Additive AVCs with Feedback.” In 2022 IEEE International Symposium on Information Theory, 2022:504–9. IEEE, 2022. https://doi.org/10.1109/ISIT50566.2022.9834850."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","quality_controlled":"1","publisher":"IEEE","scopus_import":"1","month":"08","intvolume":" 2022","abstract":[{"lang":"eng","text":"We consider the problem of communication over adversarial channels with feedback. Two parties comprising sender Alice and receiver Bob seek to communicate reliably. An adversary James observes Alice's channel transmission entirely and chooses, maliciously, its additive channel input or jamming state, thereby corrupting Bob's observation. Bob can communicate over a one-way reverse link with Alice; we assume that transmissions over this feedback link cannot be corrupted by James. Our goal in this work is to study the optimum throughput or capacity over such channels with feedback. We first present results for the quadratically-constrained additive channel where communication is known to be impossible when the noise-to-signal (power) ratio (NSR) is at least 1. We present a novel achievability scheme to establish that positive rate communication is possible even when the NSR is as high as 8/9. We also present new converse upper bounds on the capacity of this channel under potentially stochastic encoders and decoders. We also study feedback communication over the more widely studied q-ary alphabet channel under additive noise. For the q-ary channel, where q > 2, it is well known that capacity is positive under full feedback if and only if the adversary can corrupt strictly less than half the transmitted symbols. We generalize this result and show that the same threshold holds for positive rate communication when the noiseless feedback may only be partial; our scheme employs a stochastic decoder. We extend this characterization, albeit partially, to fully deterministic schemes under partial noiseless feedback. We also present new converse upper bounds for q-ary channels under full feedback, where the encoder and/or decoder may privately randomize.
Our converse results bring to the fore an interesting alternate expression for the well known converse bound for the q-ary channel under full feedback which, when specialized to the binary channel, also equals its known capacity."}],"oa_version":"None","page":"504-509","date_published":"2022-08-03T00:00:00Z","doi":"10.1109/ISIT50566.2022.9834850","volume":2022,"date_created":"2022-09-04T22:02:04Z","publication_identifier":{"isbn":["9781665421591"],"issn":["2157-8095"]},"publication_status":"published","year":"2022","day":"03","language":[{"iso":"eng"}],"publication":"2022 IEEE International Symposium on Information Theory"},{"department":[{"_id":"MaMo"}],"date_updated":"2022-09-05T10:35:40Z","status":"public","conference":{"name":"ISIT: International Symposium on Information Theory","end_date":"2022-07-01","location":"Espoo, Finland","start_date":"2022-06-26"},"type":"conference","_id":"12016","volume":2022,"language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"isbn":["9781665421591"],"issn":["2157-8095"]},"intvolume":" 2022","month":"08","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2201.10082","open_access":"1"}],"scopus_import":"1","oa_version":"Preprint","abstract":[{"text":"We consider the problem of coded distributed computing using polar codes. The average execution time of a coded computing system is related to the error probability for transmission over the binary erasure channel in recent work by Soleymani, Jamali and Mahdavifar, where the performance of binary linear codes is investigated. In this paper, we focus on polar codes and unveil a connection between the average execution time and the scaling exponent μ of the family of codes. In the finite-length characterization of polar codes, the scaling exponent is a key object capturing the speed of convergence to capacity. In particular, we show that (i) the gap between the normalized average execution time of polar codes and that of optimal MDS codes is O(n^{-1/μ}), and (ii) this upper bound can be improved to roughly O(n^{-1/2}) by considering polar codes with large kernels. We conjecture that these bounds could be improved to O(n^{-2/μ}) and O(n^{-1}), respectively, and provide a heuristic argument as well as numerical evidence supporting this view.","lang":"eng"}],"title":"Polar coded computing: The role of the scaling exponent","article_processing_charge":"No","external_id":{"arxiv":["2201.10082"]},"author":[{"full_name":"Fathollahi, Dorsa","last_name":"Fathollahi","first_name":"Dorsa"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ista":"Fathollahi D, Mondelli M. 2022. Polar coded computing: The role of the scaling exponent. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 2154–2159.","chicago":"Fathollahi, Dorsa, and Marco Mondelli. “Polar Coded Computing: The Role of the Scaling Exponent.” In 2022 IEEE International Symposium on Information Theory, 2022:2154–59. IEEE, 2022. https://doi.org/10.1109/ISIT50566.2022.9834712.","ama":"Fathollahi D, Mondelli M. Polar coded computing: The role of the scaling exponent. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:2154-2159. doi:10.1109/ISIT50566.2022.9834712","apa":"Fathollahi, D., & Mondelli, M. (2022). Polar coded computing: The role of the scaling exponent.
In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 2154–2159). Espoo, Finland: IEEE. https://doi.org/10.1109/ISIT50566.2022.9834712","ieee":"D. Fathollahi and M. Mondelli, “Polar coded computing: The role of the scaling exponent,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 2154–2159.","short":"D. Fathollahi, M. Mondelli, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 2154–2159.","mla":"Fathollahi, Dorsa, and Marco Mondelli. “Polar Coded Computing: The Role of the Scaling Exponent.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 2154–59, doi:10.1109/ISIT50566.2022.9834712."},"project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"date_created":"2022-09-04T22:02:05Z","date_published":"2022-08-03T00:00:00Z","doi":"10.1109/ISIT50566.2022.9834712","page":"2154-2159","publication":"2022 IEEE International Symposium on Information Theory","day":"03","year":"2022","oa":1,"quality_controlled":"1","publisher":"IEEE","acknowledgement":"D. Fathollahi and M. Mondelli were partially supported by the 2019 Lopez-Loreta Prize. The authors thank Hamed Hassani and Hessam Mahdavifar for helpful discussions."},{"year":"2022","day":"03","publication":"2022 IEEE International Symposium on Information Theory","page":"1623-1628","doi":"10.1109/ISIT50566.2022.9834711","date_published":"2022-08-03T00:00:00Z","date_created":"2022-09-04T22:02:04Z","quality_controlled":"1","publisher":"IEEE","oa":1,"citation":{"ieee":"S. Torkamani, J. B. Ebrahimi, P. Sadeghi, R. G. L. D’Oliveira, and M. Médard, “Heterogeneous differential privacy via graphs,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 1623–1628.","short":"S. Torkamani, J.B. Ebrahimi, P. Sadeghi, R.G.L. D’Oliveira, M. Médard, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 1623–1628.","apa":"Torkamani, S., Ebrahimi, J. B., Sadeghi, P., D’Oliveira, R. G. L., & Médard, M. (2022). Heterogeneous differential privacy via graphs. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 1623–1628). Espoo, Finland: IEEE. https://doi.org/10.1109/ISIT50566.2022.9834711","ama":"Torkamani S, Ebrahimi JB, Sadeghi P, D’Oliveira RGL, Médard M. Heterogeneous differential privacy via graphs. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:1623-1628. doi:10.1109/ISIT50566.2022.9834711","mla":"Torkamani, Sahel, et al. “Heterogeneous Differential Privacy via Graphs.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 1623–28, doi:10.1109/ISIT50566.2022.9834711.","ista":"Torkamani S, Ebrahimi JB, Sadeghi P, D’Oliveira RGL, Médard M. 2022. Heterogeneous differential privacy via graphs. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 1623–1628.","chicago":"Torkamani, Sahel, Javad B. Ebrahimi, Parastoo Sadeghi, Rafael G.L. D’Oliveira, and Muriel Médard. “Heterogeneous Differential Privacy via Graphs.” In 2022 IEEE International Symposium on Information Theory, 2022:1623–28. IEEE, 2022.
https://doi.org/10.1109/ISIT50566.2022.9834711."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"full_name":"Torkamani, Sahel","last_name":"Torkamani","first_name":"Sahel","id":"0503e7f8-2d05-11ed-aa17-db0640c720fc"},{"last_name":"Ebrahimi","full_name":"Ebrahimi, Javad B.","first_name":"Javad B."},{"full_name":"Sadeghi, Parastoo","last_name":"Sadeghi","first_name":"Parastoo"},{"first_name":"Rafael G.L.","last_name":"D'Oliveira","full_name":"D'Oliveira, Rafael G.L."},{"last_name":"Médard","full_name":"Médard, Muriel","first_name":"Muriel"}],"external_id":{"arxiv":["2203.15429"]},"article_processing_charge":"No","title":"Heterogeneous differential privacy via graphs","publication_identifier":{"isbn":["9781665421591"],"issn":["2157-8095"]},"publication_status":"published","language":[{"iso":"eng"}],"volume":2022,"abstract":[{"text":"This paper is eligible for the Jack Keil Wolf ISIT Student Paper Award. We generalize a previous framework for designing utility-optimal differentially private (DP) mechanisms via graphs, where datasets are vertices in the graph and edges represent dataset neighborhood. The boundary set contains datasets where an individual’s response changes the binary-valued query compared to its neighbors. Previous work was limited to the homogeneous case where the privacy parameter ε across all datasets was the same and the mechanism at boundary datasets was identical. In our work, the mechanism can take different distributions at the boundary and the privacy parameter ε is a function of neighboring datasets, which recovers an earlier definition of personalized DP as a special case. The problem is how to extend the mechanism, which is only defined at the boundary set, to other datasets in the graph in a computationally efficient and utility optimal manner. Using the concept of the strongest induced DP condition, we solve this problem efficiently in polynomial time (in the size of the graph).","lang":"eng"}],"oa_version":"Preprint","scopus_import":"1","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2203.15429","open_access":"1"}],"month":"08","intvolume":" 2022","date_updated":"2022-09-05T10:28:35Z","department":[{"_id":"MaMo"}],"_id":"12012","type":"conference","conference":{"end_date":"2022-07-01","location":"Espoo, Finland","start_date":"2022-06-26","name":"ISIT: International Symposium on Information Theory"},"status":"public"},{"_id":"12018","status":"public","conference":{"end_date":"2022-07-01","location":"Espoo, Finland","start_date":"2022-06-26","name":"ISIT: International Symposium on Information Theory"},"type":"conference","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","date_updated":"2023-02-13T09:02:06Z","citation":{"mla":"Zhang, Yihan, and Shashank Vatedka. “Lower Bounds on List Decoding Capacity Using Error Exponents.” 2022 IEEE International Symposium on Information Theory, vol. 2022, Institute of Electrical and Electronics Engineers, 2022, pp. 1324–29, doi:10.1109/ISIT50566.2022.9834815.","ama":"Zhang Y, Vatedka S. Lower bounds on list decoding capacity using error exponents. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. Institute of Electrical and Electronics Engineers; 2022:1324-1329. doi:10.1109/ISIT50566.2022.9834815","apa":"Zhang, Y., & Vatedka, S. (2022). Lower bounds on list decoding capacity using error exponents. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 1324–1329). Espoo, Finland: Institute of Electrical and Electronics Engineers.
https://doi.org/10.1109/ISIT50566.2022.9834815","ieee":"Y. Zhang and S. Vatedka, “Lower bounds on list decoding capacity using error exponents,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 1324–1329.","short":"Y. Zhang, S. Vatedka, in:, 2022 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2022, pp. 1324–1329.","chicago":"Zhang, Yihan, and Shashank Vatedka. “Lower Bounds on List Decoding Capacity Using Error Exponents.” In 2022 IEEE International Symposium on Information Theory, 2022:1324–29. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/ISIT50566.2022.9834815.","ista":"Zhang Y, Vatedka S. 2022. Lower bounds on list decoding capacity using error exponents. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 1324–1329."},"department":[{"_id":"MaMo"}],"title":"Lower bounds on list decoding capacity using error exponents","article_processing_charge":"No","author":[{"first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","full_name":"Zhang, Yihan","last_name":"Zhang"},{"first_name":"Shashank","last_name":"Vatedka","full_name":"Vatedka, Shashank"}],"oa_version":"None","abstract":[{"text":"We study the problem of characterizing the maximal rates of list decoding in Euclidean spaces for finite list sizes. For any positive integer L ≥ 2 and real N > 0, we say that a subset C ⊂ ℝ^n is an (N, L−1)-multiple packing or an (N, L−1)-list-decodable code if every Euclidean ball of radius √(nN) in ℝ^n contains no more than L − 1 points of C. We study this problem with and without ℓ_2 norm constraints on C, and derive the best-known lower bounds on the maximal rate for (N, L−1)-multiple packing. Our bounds are obtained via error exponents for list decoding over Additive White Gaussian Noise (AWGN) channels. We establish a curious inequality which relates the error exponent, a quantity of average-case nature, to the list-decoding radius, a quantity of worst-case nature. We derive various bounds on the error exponent for list decoding in both bounded and unbounded settings which could be of independent interest beyond multiple packing.","lang":"eng"}],"intvolume":" 2022","month":"08","publisher":"Institute of Electrical and Electronics Engineers","scopus_import":"1","quality_controlled":"1","language":[{"iso":"eng"}],"publication":"2022 IEEE International Symposium on Information Theory","day":"03","publication_status":"published","year":"2022","publication_identifier":{"isbn":["9781665421591"],"issn":["2157-8095"]},"date_created":"2022-09-04T22:02:06Z","doi":"10.1109/ISIT50566.2022.9834815","volume":2022,"date_published":"2022-08-03T00:00:00Z","page":"1324-1329"},{"_id":"12015","status":"public","conference":{"start_date":"2022-06-26","end_date":"2022-07-01","location":"Espoo, Finland","name":"ISIT: International Symposium on Information Theory"},"type":"conference","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"mla":"Zhang, Yihan, and Shashank Vatedka. “Lower Bounds for Multiple Packing.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 3085–90, doi:10.1109/ISIT50566.2022.9834443.","ieee":"Y. Zhang and S. Vatedka, “Lower bounds for multiple packing,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 3085–3090.","short":"Y. Zhang, S.
Vatedka, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 3085–3090.","apa":"Zhang, Y., & Vatedka, S. (2022). Lower bounds for multiple packing. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 3085–3090). Espoo, Finland: IEEE. https://doi.org/10.1109/ISIT50566.2022.9834443","ama":"Zhang Y, Vatedka S. Lower bounds for multiple packing. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:3085-3090. doi:10.1109/ISIT50566.2022.9834443","chicago":"Zhang, Yihan, and Shashank Vatedka. “Lower Bounds for Multiple Packing.” In 2022 IEEE International Symposium on Information Theory, 2022:3085–90. IEEE, 2022. https://doi.org/10.1109/ISIT50566.2022.9834443.","ista":"Zhang Y, Vatedka S. 2022. Lower bounds for multiple packing. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 3085–3090."},"date_updated":"2022-09-05T10:39:04Z","department":[{"_id":"MaMo"}],"title":"Lower bounds for multiple packing","article_processing_charge":"No","author":[{"last_name":"Zhang","full_name":"Zhang, Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan"},{"full_name":"Vatedka, Shashank","last_name":"Vatedka","first_name":"Shashank"}],"oa_version":"None","abstract":[{"text":"We study the problem of high-dimensional multiple packing in Euclidean space. Multiple packing is a natural generalization of sphere packing and is defined as follows. Let P, N > 0 and L ∈ ℤ_{≥2}. A multiple packing is a set C of points in B^n(0, √(nP)) such that any point in ℝ^n lies in the intersection of at most L − 1 balls of radius √(nN) around points in C. In this paper, we derive two lower bounds on the largest possible density of a multiple packing. These bounds are obtained through a stronger notion called average-radius multiple packing. Specifically, we exactly pin down the asymptotics of (expurgated) Gaussian codes and (expurgated) spherical codes under average-radius multiple packing. To this end, we apply tools from high-dimensional geometry and large deviation theory. The bound for spherical codes matches the previous best known bound which was obtained for the standard (weaker) notion of multiple packing through a curious connection with error exponents [Bli99], [ZV21]. The bound for Gaussian codes suggests that they are strictly inferior to spherical codes.","lang":"eng"}],"intvolume":" 2022","month":"08","scopus_import":"1","publisher":"IEEE","quality_controlled":"1","publication":"2022 IEEE International Symposium on Information Theory","language":[{"iso":"eng"}],"day":"03","year":"2022","publication_status":"published","publication_identifier":{"issn":["2157-8095"],"isbn":["9781665421591"]},"date_created":"2022-09-04T22:02:05Z","volume":2022,"date_published":"2022-08-03T00:00:00Z","doi":"10.1109/ISIT50566.2022.9834443","page":"3085-3090"},{"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"short":"Y. Zhang, S. Vatedka, in:, 2022 IEEE International Symposium on Information Theory, IEEE, 2022, pp. 2559–2564.","ieee":"Y. Zhang and S. Vatedka, “List-decodability of Poisson Point Processes,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 2559–2564.","apa":"Zhang, Y., & Vatedka, S. (2022). List-decodability of Poisson Point Processes. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 2559–2564). Espoo, Finland: IEEE.
https://doi.org/10.1109/ISIT50566.2022.9834512","ama":"Zhang Y, Vatedka S. List-decodability of Poisson Point Processes. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. IEEE; 2022:2559-2564. doi:10.1109/ISIT50566.2022.9834512","mla":"Zhang, Yihan, and Shashank Vatedka. “List-Decodability of Poisson Point Processes.” 2022 IEEE International Symposium on Information Theory, vol. 2022, IEEE, 2022, pp. 2559–64, doi:10.1109/ISIT50566.2022.9834512.","ista":"Zhang Y, Vatedka S. 2022. List-decodability of Poisson Point Processes. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 2559–2564.","chicago":"Zhang, Yihan, and Shashank Vatedka. “List-Decodability of Poisson Point Processes.” In 2022 IEEE International Symposium on Information Theory, 2022:2559–64. IEEE, 2022. https://doi.org/10.1109/ISIT50566.2022.9834512."},"date_updated":"2022-09-05T09:23:04Z","title":"List-decodability of Poisson Point Processes","department":[{"_id":"MaMo"}],"article_processing_charge":"No","author":[{"id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","first_name":"Yihan","full_name":"Zhang, Yihan","last_name":"Zhang"},{"full_name":"Vatedka, Shashank","last_name":"Vatedka","first_name":"Shashank"}],"_id":"12014","status":"public","conference":{"name":"ISIT: International Symposium on Information Theory","location":"Espoo, Finland","end_date":"2022-07-01","start_date":"2022-06-26"},"type":"conference","language":[{"iso":"eng"}],"publication":"2022 IEEE International Symposium on Information Theory","day":"03","publication_status":"published","year":"2022","publication_identifier":{"issn":["2157-8095"],"isbn":["9781665421591"]},"date_created":"2022-09-04T22:02:04Z","date_published":"2022-08-03T00:00:00Z","volume":2022,"doi":"10.1109/ISIT50566.2022.9834512","page":"2559-2564","oa_version":"None","abstract":[{"lang":"eng","text":"We study the problem of high-dimensional multiple packing in Euclidean space. Multiple packing is a natural generalization of sphere packing and is defined as follows. Let N > 0 and L ∈ ℤ_{≥2}. A multiple packing is a set C of points in ℝ^n such that any point in ℝ^n lies in the intersection of at most L − 1 balls of radius √(nN) around points in C. Given a well-known connection with coding theory, multiple packings can be viewed as the Euclidean analog of list-decodable codes, which are well-studied for finite fields. In this paper, we exactly pin down the asymptotic density of (expurgated) Poisson Point Processes under a stronger notion called average-radius multiple packing. To this end, we apply tools from high-dimensional geometry and large deviation theory. This gives rise to the best known lower bound on the largest multiple packing density. Our result corrects a mistake in a previous paper by Blinovsky [Bli05]."}],"intvolume":" 2022","month":"08","quality_controlled":"1","scopus_import":"1","publisher":"IEEE"},{"page":"2553-2558","date_created":"2022-09-04T22:02:07Z","date_published":"2022-08-03T00:00:00Z","volume":2022,"doi":"10.1109/ISIT50566.2022.9834829","publication_status":"published","year":"2022","publication_identifier":{"issn":["2157-8095"],"isbn":["9781665421591"]},"language":[{"iso":"eng"}],"publication":"2022 IEEE International Symposium on Information Theory","day":"03","quality_controlled":"1","scopus_import":"1","publisher":"Institute of Electrical and Electronics Engineers","intvolume":" 2022","month":"08","abstract":[{"text":"This paper studies combinatorial properties of codes for the Z-channel.
A Z-channel with error fraction τ takes as input a length-n binary codeword and injects in an adversarial manner up to nτ asymmetric errors, i.e., errors that only zero out bits but do not flip 0’s to 1’s. It is known that the largest (L − 1)-list-decodable code for the Z-channel with error fraction τ has exponential (in n) size if τ is less than a critical value that we call the Plotkin point and has constant size if τ is larger than the threshold. The (L−1)-list-decoding Plotkin point is known to be L^{-1/(L-1)} − L^{-L/(L-1)}. In this paper, we show that the largest (L−1)-list-decodable code ε-above the Plotkin point has size Θ_L(ε^{-3/2}) for any L − 1 ≥ 1.","lang":"eng"}],"oa_version":"None","article_processing_charge":"No","author":[{"full_name":"Polyanskii, Nikita","last_name":"Polyanskii","first_name":"Nikita"},{"first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","full_name":"Zhang, Yihan","last_name":"Zhang"}],"department":[{"_id":"MaMo"}],"title":"List-decodable zero-rate codes for the Z-channel","date_updated":"2023-02-13T09:02:18Z","citation":{"ista":"Polyanskii N, Zhang Y. 2022. List-decodable zero-rate codes for the Z-channel. 2022 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory vol. 2022, 2553–2558.","chicago":"Polyanskii, Nikita, and Yihan Zhang. “List-Decodable Zero-Rate Codes for the Z-Channel.” In 2022 IEEE International Symposium on Information Theory, 2022:2553–58. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/ISIT50566.2022.9834829.","apa":"Polyanskii, N., & Zhang, Y. (2022). List-decodable zero-rate codes for the Z-channel. In 2022 IEEE International Symposium on Information Theory (Vol. 2022, pp. 2553–2558). Espoo, Finland: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ISIT50566.2022.9834829","ama":"Polyanskii N, Zhang Y. List-decodable zero-rate codes for the Z-channel. In: 2022 IEEE International Symposium on Information Theory. Vol 2022. Institute of Electrical and Electronics Engineers; 2022:2553-2558. doi:10.1109/ISIT50566.2022.9834829","short":"N. Polyanskii, Y. Zhang, in:, 2022 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2022, pp. 2553–2558.","ieee":"N. Polyanskii and Y. Zhang, “List-decodable zero-rate codes for the Z-channel,” in 2022 IEEE International Symposium on Information Theory, Espoo, Finland, 2022, vol. 2022, pp. 2553–2558.","mla":"Polyanskii, Nikita, and Yihan Zhang. “List-Decodable Zero-Rate Codes for the Z-Channel.” 2022 IEEE International Symposium on Information Theory, vol. 2022, Institute of Electrical and Electronics Engineers, 2022, pp. 2553–58, doi:10.1109/ISIT50566.2022.9834829."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","conference":{"start_date":"2022-06-26","end_date":"2022-07-01","location":"Espoo, Finland","name":"ISIT: International Symposium on Information Theory"},"type":"conference","status":"public","_id":"12019"},{"date_published":"2022-01-01T00:00:00Z","date_created":"2023-02-10T13:49:04Z","has_accepted_license":"1","year":"2022","publication":"Proceedings of the 39th International Conference on Machine Learning","quality_controlled":"1","publisher":"ML Research Press","oa":1,"acknowledgement":"The authors would like to thank the anonymous reviewers for their helpful comments.
KK and MM were partially supported by the 2019 Lopez-Loreta Prize.","author":[{"full_name":"Venkataramanan, Ramji","last_name":"Venkataramanan","first_name":"Ramji"},{"full_name":"Kögler, Kevin","last_name":"Kögler","first_name":"Kevin","id":"94ec913c-dc85-11ea-9058-e5051ab2428b"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"}],"article_processing_charge":"No","title":"Estimation in rotationally invariant generalized linear models via approximate message passing","citation":{"ama":"Venkataramanan R, Kögler K, Mondelli M. Estimation in rotationally invariant generalized linear models via approximate message passing. In: Proceedings of the 39th International Conference on Machine Learning. Vol 162. ML Research Press; 2022.","apa":"Venkataramanan, R., Kögler, K., & Mondelli, M. (2022). Estimation in rotationally invariant generalized linear models via approximate message passing. In Proceedings of the 39th International Conference on Machine Learning (Vol. 162). Baltimore, MD, United States: ML Research Press.","short":"R. Venkataramanan, K. Kögler, M. Mondelli, in:, Proceedings of the 39th International Conference on Machine Learning, ML Research Press, 2022.","ieee":"R. Venkataramanan, K. Kögler, and M. Mondelli, “Estimation in rotationally invariant generalized linear models via approximate message passing,” in Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, United States, 2022, vol. 162.","mla":"Venkataramanan, Ramji, et al. “Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing.” Proceedings of the 39th International Conference on Machine Learning, vol. 162, 22, ML Research Press, 2022.","ista":"Venkataramanan R, Kögler K, Mondelli M. 2022. Estimation in rotationally invariant generalized linear models via approximate message passing. Proceedings of the 39th International Conference on Machine Learning. ICML: International Conference on Machine Learning vol. 162, 22.","chicago":"Venkataramanan, Ramji, Kevin Kögler, and Marco Mondelli. “Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing.” In Proceedings of the 39th International Conference on Machine Learning, Vol. 162. ML Research Press, 2022."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"article_number":"22","volume":162,"publication_status":"published","file":[{"file_size":2341343,"date_updated":"2023-02-13T10:53:11Z","creator":"dernst","file_name":"2022_PMLR_Venkataramanan.pdf","date_created":"2023-02-13T10:53:11Z","content_type":"application/pdf","relation":"main_file","access_level":"open_access","success":1,"file_id":"12547","checksum":"67436eb0a660789514cdf9db79e84683"}],"language":[{"iso":"eng"}],"intvolume":" 162","abstract":[{"lang":"eng","text":"We consider the problem of signal estimation in generalized linear models defined via rotationally invariant design matrices. Since these matrices can have an arbitrary spectral distribution, this model is well suited for capturing complex correlation structures which often arise in applications. We propose a novel family of approximate message passing (AMP) algorithms for signal estimation, and rigorously characterize their performance in the high-dimensional limit via a state evolution recursion. 
Our rotationally invariant AMP has complexity of the same order as the existing AMP derived under the restrictive assumption of a Gaussian design; our algorithm also recovers this existing AMP as a special case. Numerical results showcase a performance close to Vector AMP (which is conjectured to be Bayes-optimal in some settings), but obtained with a much lower complexity, as the proposed algorithm does not require a computationally expensive singular value decomposition."}],"oa_version":"Published Version","department":[{"_id":"MaMo"}],"file_date_updated":"2023-02-13T10:53:11Z","date_updated":"2023-02-13T10:54:58Z","ddc":["000"],"type":"conference","conference":{"location":"Baltimore, MD, United States","end_date":"2022-07-23","start_date":"2022-07-17","name":"ICML: International Conference on Machine Learning"},"status":"public","_id":"12540"},{"year":"2022","publication_status":"accepted","day":"20","language":[{"iso":"eng"}],"publication":"arXiv","doi":"10.48550/arXiv.2205.10009","date_published":"2022-05-20T00:00:00Z","date_created":"2023-02-10T13:45:41Z","abstract":[{"lang":"eng","text":"We consider the problem of estimating a rank-1 signal corrupted by structured rotationally invariant noise, and address the following question: how well do inference algorithms perform when the noise statistics is unknown and hence Gaussian noise is assumed? While the matched Bayes-optimal setting with unstructured noise is well understood, the analysis of this mismatched problem is still in its infancy. In this paper, we take a step towards understanding the effect of a strong source of mismatch: the noise statistics. Our main technical contribution is the rigorous analysis of a Bayes estimator and of an approximate message passing (AMP) algorithm, both of which incorrectly assume a Gaussian setup. The first result exploits the theory of spherical integrals and of low-rank matrix perturbations; the idea behind the second one is to design and analyze an artificial AMP which, by taking advantage of the flexibility in the denoisers, is able to \"correct\" the mismatch. Armed with these sharp asymptotic characterizations, we unveil a rich and often unexpected phenomenology. For example, although AMP is in principle designed to efficiently compute the Bayes estimator, the former is outperformed by the latter in terms of mean-square error. We show that this performance gap is due to an incorrect estimation of the signal norm. In fact, when the SNR is large enough, the overlaps of the AMP and the Bayes estimator coincide, and they even match those of optimal estimators taking into account the structure of the noise."}],"oa_version":"Preprint","oa":1,"main_file_link":[{"open_access":"1","url":"https://doi.org/10.48550/arXiv.2205.10009"}],"month":"05","date_updated":"2023-02-16T09:41:25Z","citation":{"ieee":"J. Barbier, T. Hou, M. Mondelli, and M. Saenz, “The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation?,” arXiv.","short":"J. Barbier, T. Hou, M. Mondelli, M. Saenz, ArXiv (n.d.).","apa":"Barbier, J., Hou, T., Mondelli, M., & Saenz, M. (n.d.). The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation? arXiv. https://doi.org/10.48550/arXiv.2205.10009","ama":"Barbier J, Hou T, Mondelli M, Saenz M. The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation? arXiv. doi:10.48550/arXiv.2205.10009","mla":"Barbier, Jean, et al.
“The Price of Ignorance: How Much Does It Cost to Forget Noise Structure in Low-Rank Matrix Estimation?” ArXiv, 2205.10009, doi:10.48550/arXiv.2205.10009.","ista":"Barbier J, Hou T, Mondelli M, Saenz M. The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation? arXiv, 2205.10009.","chicago":"Barbier, Jean, TianQi Hou, Marco Mondelli, and Manuel Saenz. “The Price of Ignorance: How Much Does It Cost to Forget Noise Structure in Low-Rank Matrix Estimation?” ArXiv, n.d. https://doi.org/10.48550/arXiv.2205.10009."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"first_name":"Jean","last_name":"Barbier","full_name":"Barbier, Jean"},{"first_name":"TianQi","last_name":"Hou","full_name":"Hou, TianQi"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"},{"last_name":"Saenz","full_name":"Saenz, Manuel","first_name":"Manuel"}],"article_processing_charge":"No","external_id":{"arxiv":["2205.10009"]},"department":[{"_id":"MaMo"}],"title":"The price of ignorance: How much does it cost to forget noise structure in low-rank matrix estimation?","_id":"12536","article_number":"2205.10009","type":"preprint","status":"public"},{"department":[{"_id":"GradSch"},{"_id":"MaMo"}],"title":"Towards differential relational privacy and its use in question answering","article_processing_charge":"No","external_id":{"arxiv":["2203.16701"]},"author":[{"full_name":"Bombari, Simone","last_name":"Bombari","first_name":"Simone","id":"ca726dda-de17-11ea-bc14-f9da834f63aa"},{"first_name":"Alessandro","full_name":"Achille, Alessandro","last_name":"Achille"},{"full_name":"Wang, Zijian","last_name":"Wang","first_name":"Zijian"},{"last_name":"Wang","full_name":"Wang, Yu-Xiang","first_name":"Yu-Xiang"},{"first_name":"Yusheng","last_name":"Xie","full_name":"Xie, Yusheng"},{"last_name":"Singh","full_name":"Singh, Kunwar Yashraj","first_name":"Kunwar Yashraj"},{"first_name":"Srikar","full_name":"Appalaraju, Srikar","last_name":"Appalaraju"},{"full_name":"Mahadevan, Vijay","last_name":"Mahadevan","first_name":"Vijay"},{"last_name":"Soatto","full_name":"Soatto, Stefano","first_name":"Stefano"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","date_updated":"2023-04-25T07:34:49Z","citation":{"chicago":"Bombari, Simone, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, and Stefano Soatto. “Towards Differential Relational Privacy and Its Use in Question Answering.” ArXiv, n.d. https://doi.org/10.48550/arXiv.2203.16701.","ista":"Bombari S, Achille A, Wang Z, Wang Y-X, Xie Y, Singh KY, Appalaraju S, Mahadevan V, Soatto S. Towards differential relational privacy and its use in question answering. arXiv, 2203.16701.","mla":"Bombari, Simone, et al. “Towards Differential Relational Privacy and Its Use in Question Answering.” ArXiv, 2203.16701, doi:10.48550/arXiv.2203.16701.","short":"S. Bombari, A. Achille, Z. Wang, Y.-X. Wang, Y. Xie, K.Y. Singh, S. Appalaraju, V. Mahadevan, S. Soatto, ArXiv (n.d.).","ieee":"S. Bombari et al., “Towards differential relational privacy and its use in question answering,” arXiv. .","apa":"Bombari, S., Achille, A., Wang, Z., Wang, Y.-X., Xie, Y., Singh, K. Y., … Soatto, S. (n.d.). Towards differential relational privacy and its use in question answering. arXiv. https://doi.org/10.48550/arXiv.2203.16701","ama":"Bombari S, Achille A, Wang Z, et al. 
Towards differential relational privacy and its use in question answering. arXiv. doi:10.48550/arXiv.2203.16701"},"status":"public","type":"preprint","article_number":"2203.16701","_id":"12860","date_created":"2023-04-23T16:11:48Z","doi":"10.48550/arXiv.2203.16701","date_published":"2022-03-30T00:00:00Z","publication":"arXiv","language":[{"iso":"eng"}],"day":"30","year":"2022","publication_status":"submitted","month":"03","oa":1,"main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2203.16701","open_access":"1"}],"oa_version":"Preprint","abstract":[{"lang":"eng","text":"Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained model for question answering. We introduce Relational Memorization (RM) to understand, quantify and control this phenomenon. While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning. The difference is most pronounced when the data distribution is long-tailed, with many queries having only a few training examples: Impeding general memorization prevents effective learning, while impeding only relational memorization still allows learning general properties of the underlying concepts. We formalize the notion of Relational Privacy (RP) and, inspired by Differential Privacy (DP), we provide a possible definition of Differential Relational Privacy (DrP). These notions can be used to describe and compute bounds on the amount of RM in a trained model. We illustrate Relational Privacy concepts in experiments with large-scale models for Question Answering."}]},{"intvolume":" 68","month":"12","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.1901.03790","open_access":"1"}],"scopus_import":"1","oa_version":"Preprint","abstract":[{"lang":"eng","text":"We study the list decodability of different ensembles of codes over the real alphabet under the assumption of an omniscient adversary. It is a well-known result that when the source and the adversary have power constraints P and N respectively, the list decoding capacity is equal to (1/2)log(P/N). Random spherical codes achieve constant list sizes, and the goal of the present paper is to obtain a better understanding of the smallest achievable list size as a function of the gap to capacity. We show a reduction from arbitrary codes to spherical codes, and derive a lower bound on the list size of typical random spherical codes. We also give an upper bound on the list size achievable using nested Construction-A lattices and infinite Construction-A lattices. We then define and study a class of infinite constellations that generalize Construction-A lattices and prove upper and lower bounds for the same. Other goodness properties such as packing goodness and AWGN goodness of infinite constellations are proved along the way.
Finally, we consider random lattices sampled from the Haar distribution and show that if a certain conjecture that originates in analytic number theory is true, then the list size grows as a polynomial function of the gap-to-capacity."}],"issue":"12","volume":68,"language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"eissn":["1557-9654"],"issn":["0018-9448"]},"status":"public","article_type":"original","type":"journal_article","_id":"11639","department":[{"_id":"MaMo"}],"date_updated":"2023-08-03T12:12:19Z","oa":1,"quality_controlled":"1","publisher":"IEEE","acknowledgement":"This work was done when Shashank Vatedka was at the Chinese University of Hong Kong, where he was supported in part by CUHK Direct Grants 4055039 and 4055077. He would like to acknowledge funding from a seed grant offered by IIT Hyderabad and the Start-up Research Grant (SRG/2020/000910) from the Science and Engineering Research Board (SERB), India. Yihan Zhang has received funding from the European Union’s Horizon 2020 research and innovation programme\r\nunder grant agreement No 682203-ERC-[Inf-Speed-Tradeoff].","date_created":"2022-07-24T22:01:42Z","date_published":"2022-12-01T00:00:00Z","doi":"10.1109/TIT.2022.3189542","page":"7753-7786","publication":"IEEE Transactions on Information Theory","day":"01","year":"2022","isi":1,"title":"List decoding random Euclidean codes and Infinite constellations","article_processing_charge":"No","external_id":{"arxiv":["1901.03790"],"isi":["000891796100007"]},"author":[{"first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c","last_name":"Zhang","full_name":"Zhang, Yihan"},{"first_name":"Shashank","last_name":"Vatedka","full_name":"Vatedka, Shashank"}],"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","citation":{"mla":"Zhang, Yihan, and Shashank Vatedka. “List Decoding Random Euclidean Codes and Infinite Constellations.” IEEE Transactions on Information Theory, vol. 68, no. 12, IEEE, 2022, pp. 7753–86, doi:10.1109/TIT.2022.3189542.","apa":"Zhang, Y., & Vatedka, S. (2022). List decoding random Euclidean codes and Infinite constellations. IEEE Transactions on Information Theory. IEEE. https://doi.org/10.1109/TIT.2022.3189542","ama":"Zhang Y, Vatedka S. List decoding random Euclidean codes and Infinite constellations. IEEE Transactions on Information Theory. 2022;68(12):7753-7786. doi:10.1109/TIT.2022.3189542","ieee":"Y. Zhang and S. Vatedka, “List decoding random Euclidean codes and Infinite constellations,” IEEE Transactions on Information Theory, vol. 68, no. 12. IEEE, pp. 7753–7786, 2022.","short":"Y. Zhang, S. Vatedka, IEEE Transactions on Information Theory 68 (2022) 7753–7786.","chicago":"Zhang, Yihan, and Shashank Vatedka. “List Decoding Random Euclidean Codes and Infinite Constellations.” IEEE Transactions on Information Theory. IEEE, 2022. https://doi.org/10.1109/TIT.2022.3189542.","ista":"Zhang Y, Vatedka S. 2022. List decoding random Euclidean codes and Infinite constellations. IEEE Transactions on Information Theory. 68(12), 7753–7786."}},{"publication_identifier":{"eissn":["1558-0857"],"issn":["0090-6778"]},"publication_status":"published","language":[{"iso":"eng"}],"issue":"11","volume":70,"abstract":[{"text":"A novel recursive list decoding (RLD) algorithm for Reed-Muller (RM) codes based on successive permutations (SP) of the codeword is presented. A low-complexity SP scheme applied to a subset of the symmetry group of RM codes is first proposed to carefully select a good codeword permutation on the fly.
Then, the proposed SP technique is integrated into an improved RLD algorithm that initializes different decoding paths with random codeword permutations, which are sampled from the full symmetry group of RM codes. Finally, efficient latency and complexity reduction schemes are introduced that virtually preserve the error-correction performance of the proposed decoder. Simulation results demonstrate that at the target frame error rate of 10^−3 for the RM code of length 256 with 163 information bits, the proposed decoder reduces the computational complexity by 6% and the decoding latency by 22% compared to the state-of-the-art semi-parallel simplified successive-cancellation decoder with fast Hadamard transform (SSC-FHT) that uses 96 permutations from the full symmetry group of RM codes, while largely maintaining the error-correction performance and memory consumption of the semi-parallel permuted SSC-FHT decoder.","lang":"eng"}],"oa_version":"Preprint","scopus_import":"1","main_file_link":[{"open_access":"1","url":" https://doi.org/10.48550/arXiv.2109.02122"}],"month":"11","intvolume":" 70","date_updated":"2023-08-04T09:34:43Z","department":[{"_id":"MaMo"}],"_id":"12233","type":"journal_article","article_type":"original","status":"public","isi":1,"year":"2022","day":"01","publication":"IEEE Transactions on Communications","page":"7134-7145","date_published":"2022-11-01T00:00:00Z","doi":"10.1109/tcomm.2022.3211101","date_created":"2023-01-16T09:50:38Z","publisher":"Institute of Electrical and Electronics Engineers","quality_controlled":"1","oa":1,"citation":{"ieee":"N. Doan, S. A. Hashemi, M. Mondelli, and W. J. Gross, “Decoding Reed-Muller codes with successive codeword permutations,” IEEE Transactions on Communications, vol. 70, no. 11. Institute of Electrical and Electronics Engineers, pp. 7134–7145, 2022.","short":"N. Doan, S.A. Hashemi, M. Mondelli, W.J. Gross, IEEE Transactions on Communications 70 (2022) 7134–7145.","apa":"Doan, N., Hashemi, S. A., Mondelli, M., & Gross, W. J. (2022). Decoding Reed-Muller codes with successive codeword permutations. IEEE Transactions on Communications. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/tcomm.2022.3211101","ama":"Doan N, Hashemi SA, Mondelli M, Gross WJ. Decoding Reed-Muller codes with successive codeword permutations. IEEE Transactions on Communications. 2022;70(11):7134-7145. doi:10.1109/tcomm.2022.3211101","mla":"Doan, Nghia, et al. “Decoding Reed-Muller Codes with Successive Codeword Permutations.” IEEE Transactions on Communications, vol. 70, no. 11, Institute of Electrical and Electronics Engineers, 2022, pp. 7134–45, doi:10.1109/tcomm.2022.3211101.","ista":"Doan N, Hashemi SA, Mondelli M, Gross WJ. 2022. Decoding Reed-Muller codes with successive codeword permutations. IEEE Transactions on Communications. 70(11), 7134–7145.","chicago":"Doan, Nghia, Seyyed Ali Hashemi, Marco Mondelli, and Warren J. Gross. “Decoding Reed-Muller Codes with Successive Codeword Permutations.” IEEE Transactions on Communications. Institute of Electrical and Electronics Engineers, 2022.
https://doi.org/10.1109/tcomm.2022.3211101."},"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","author":[{"first_name":"Nghia","last_name":"Doan","full_name":"Doan, Nghia"},{"last_name":"Hashemi","full_name":"Hashemi, Seyyed Ali","first_name":"Seyyed Ali"},{"orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"},{"first_name":"Warren J.","last_name":"Gross","full_name":"Gross, Warren J."}],"article_processing_charge":"No","external_id":{"arxiv":["2109.02122"],"isi":["000937284600006"]},"title":"Decoding Reed-Muller codes with successive codeword permutations"},{"author":[{"full_name":"Zhang, Yihan","last_name":"Zhang","first_name":"Yihan","id":"2ce5da42-b2ea-11eb-bba5-9f264e9d002c"},{"last_name":"Vatedka","full_name":"Vatedka, Shashank","first_name":"Shashank"},{"full_name":"Jaggi, Sidharth","last_name":"Jaggi","first_name":"Sidharth"},{"first_name":"Anand D.","last_name":"Sarwate","full_name":"Sarwate, Anand D."}],"article_processing_charge":"No","external_id":{"isi":["000838527100004"],"arxiv":["1801.05951"]},"title":"Quadratically constrained myopic adversarial channels","citation":{"chicago":"Zhang, Yihan, Shashank Vatedka, Sidharth Jaggi, and Anand D. Sarwate. “Quadratically Constrained Myopic Adversarial Channels.” IEEE Transactions on Information Theory. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/tit.2022.3167554.","ista":"Zhang Y, Vatedka S, Jaggi S, Sarwate AD. 2022. Quadratically constrained myopic adversarial channels. IEEE Transactions on Information Theory. 68(8), 4901–4948.","mla":"Zhang, Yihan, et al. “Quadratically Constrained Myopic Adversarial Channels.” IEEE Transactions on Information Theory, vol. 68, no. 8, Institute of Electrical and Electronics Engineers, 2022, pp. 4901–48, doi:10.1109/tit.2022.3167554.","ieee":"Y. Zhang, S. Vatedka, S. Jaggi, and A. D. Sarwate, “Quadratically constrained myopic adversarial channels,” IEEE Transactions on Information Theory, vol. 68, no. 8. Institute of Electrical and Electronics Engineers, pp. 4901–4948, 2022.","short":"Y. Zhang, S. Vatedka, S. Jaggi, A.D. Sarwate, IEEE Transactions on Information Theory 68 (2022) 4901–4948.","ama":"Zhang Y, Vatedka S, Jaggi S, Sarwate AD. Quadratically constrained myopic adversarial channels. IEEE Transactions on Information Theory. 2022;68(8):4901-4948. doi:10.1109/tit.2022.3167554","apa":"Zhang, Y., Vatedka, S., Jaggi, S., & Sarwate, A. D. (2022). Quadratically constrained myopic adversarial channels. IEEE Transactions on Information Theory. Institute of Electrical and Electronics Engineers. 
https://doi.org/10.1109/tit.2022.3167554"},"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","page":"4901-4948","date_published":"2022-08-01T00:00:00Z","doi":"10.1109/tit.2022.3167554","date_created":"2023-01-16T10:01:19Z","isi":1,"year":"2022","day":"01","publication":"IEEE Transactions on Information Theory","quality_controlled":"1","publisher":"Institute of Electrical and Electronics Engineers","oa":1,"department":[{"_id":"MaMo"}],"date_updated":"2023-08-04T10:08:49Z","article_type":"original","type":"journal_article","status":"public","_id":"12273","issue":"8","volume":68,"publication_identifier":{"issn":["0018-9448"],"eissn":["1557-9654"]},"publication_status":"published","language":[{"iso":"eng"}],"scopus_import":"1","main_file_link":[{"url":"https://doi.org/10.48550/arXiv.1801.05951","open_access":"1"}],"month":"08","intvolume":" 68","abstract":[{"lang":"eng","text":"We study communication in the presence of a jamming adversary where quadratic power constraints are imposed on the transmitter and the jammer. The jamming signal is allowed to be a function of the codebook, and a noncausal but noisy observation of the transmitted codeword. For a certain range of the noise-to-signal ratios (NSRs) of the transmitter and the jammer, we are able to characterize the capacity of this channel under deterministic encoding or stochastic encoding, i.e., with no common randomness between the encoder/decoder pair. For the remaining NSR regimes, we determine the capacity under the assumption of a small amount of common randomness (at most 2log(n) bits in one sub-regime, and at most Ω(n) bits in the other sub-regime) available to the encoder-decoder pair. Our proof techniques involve a novel myopic list-decoding result for achievability, and a Plotkin-type push attack for the converse in a subregion of the NSRs, both of which may be of independent interest. We also give bounds on the strong secrecy capacity of this channel assuming that the jammer is simultaneously eavesdropping."}],"oa_version":"Preprint"},{"main_file_link":[{"url":"https://arxiv.org/abs/2012.13378","open_access":"1"}],"scopus_import":"1","intvolume":" 21","month":"06","abstract":[{"lang":"eng","text":"This paper characterizes the latency of the simplified successive-cancellation (SSC) decoding scheme for polar codes under hardware resource constraints. In particular, when the number of processing elements P that can perform SSC decoding operations in parallel is limited, as is the case in practice, the latency of SSC decoding is O(N^(1-1/μ) + (N/P) log2 log2 (N/P)), where N is the block length of the code and μ is the scaling exponent of the channel. Three direct consequences of this bound are presented. First, in a fully-parallel implementation where P = N/2, the latency of SSC decoding is O(N^(1-1/μ)), which is sublinear in the block length. This recovers a result from our earlier work. Second, in a fully-serial implementation where P = 1, the latency of SSC decoding scales as O(N log2 log2 N). The multiplicative constant is also calculated: we show that the latency of SSC decoding when P = 1 is given by (2 + o(1))N log2 log2 N. Third, in a semi-parallel implementation, the smallest P that gives the same latency as that of the fully-parallel implementation is P = N^(1/μ).
The tightness of our bound on SSC decoding latency and the applicability of the foregoing results are validated through extensive simulations."}],"oa_version":"Preprint","issue":"6","volume":21,"related_material":{"record":[{"relation":"earlier_version","status":"public","id":"10053"}]},"publication_status":"published","publication_identifier":{"eissn":["1558-2248"],"issn":["1536-1276"]},"language":[{"iso":"eng"}],"article_type":"original","type":"journal_article","status":"public","_id":"10364","department":[{"_id":"MaMo"}],"date_updated":"2023-08-14T06:55:57Z","oa":1,"quality_controlled":"1","publisher":"Institute of Electrical and Electronics Engineers","acknowledgement":"S. A. Hashemi is supported by a Postdoctoral Fellowship from the Natural Sciences and\r\nEngineering Research Council of Canada (NSERC) and by Huawei. M. Mondelli is partially\r\nsupported by the 2019 Lopez-Loreta Prize. A. Fazeli and A. Vardy were supported in part by\r\nthe National Science Foundation under Grant CCF-1764104.","page":"3909-3920","date_created":"2021-11-28T23:01:29Z","doi":"10.1109/TWC.2021.3125626","date_published":"2022-06-01T00:00:00Z","year":"2022","isi":1,"publication":"IEEE Transactions on Wireless Communications","day":"01","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"external_id":{"arxiv":["2012.13378"],"isi":["000809406400028"]},"article_processing_charge":"No","author":[{"first_name":"Seyyed Ali","last_name":"Hashemi","full_name":"Hashemi, Seyyed Ali"},{"last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425"},{"first_name":"Arman","last_name":"Fazeli","full_name":"Fazeli, Arman"},{"first_name":"Alexander","full_name":"Vardy, Alexander","last_name":"Vardy"},{"full_name":"Cioffi, John","last_name":"Cioffi","first_name":"John"},{"first_name":"Andrea","full_name":"Goldsmith, Andrea","last_name":"Goldsmith"}],"title":"Parallelism versus latency in simplified successive-cancellation decoding of polar codes","citation":{"ista":"Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. 2022. Parallelism versus latency in simplified successive-cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. 21(6), 3909–3920.","chicago":"Hashemi, Seyyed Ali, Marco Mondelli, Arman Fazeli, Alexander Vardy, John Cioffi, and Andrea Goldsmith. “Parallelism versus Latency in Simplified Successive-Cancellation Decoding of Polar Codes.” IEEE Transactions on Wireless Communications. Institute of Electrical and Electronics Engineers, 2022. https://doi.org/10.1109/TWC.2021.3125626.","ieee":"S. A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, and A. Goldsmith, “Parallelism versus latency in simplified successive-cancellation decoding of polar codes,” IEEE Transactions on Wireless Communications, vol. 21, no. 6. Institute of Electrical and Electronics Engineers, pp. 3909–3920, 2022.","short":"S.A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, A. Goldsmith, IEEE Transactions on Wireless Communications 21 (2022) 3909–3920.","ama":"Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. Parallelism versus latency in simplified successive-cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. 2022;21(6):3909-3920. doi:10.1109/TWC.2021.3125626","apa":"Hashemi, S. A., Mondelli, M., Fazeli, A., Vardy, A., Cioffi, J., & Goldsmith, A. (2022).
Parallelism versus latency in simplified successive-cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/TWC.2021.3125626","mla":"Hashemi, Seyyed Ali, et al. “Parallelism versus Latency in Simplified Successive-Cancellation Decoding of Polar Codes.” IEEE Transactions on Wireless Communications, vol. 21, no. 6, Institute of Electrical and Electronics Engineers, 2022, pp. 3909–20, doi:10.1109/TWC.2021.3125626."},"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8"},{"language":[{"iso":"eng"}],"publication_identifier":{"isbn":["9781665483414"]},"publication_status":"published","oa_version":"Preprint","abstract":[{"text":"In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M<N nodes. […]","lang":"eng"}],"citation":{"mla":"Amani, Mohammad Hossein, et al. “Sharp Asymptotics on the Compression of Two-Layer Neural Networks.” IEEE Information Theory Workshop, IEEE, 2022, pp. 588–93, doi:10.1109/ITW54588.2022.9965870.","ama":"Amani MH, Bombari S, Mondelli M, Pukdee R, Rini S. Sharp asymptotics on the compression of two-layer neural networks. IEEE Information Theory Workshop. 2022:588-593. doi:10.1109/ITW54588.2022.9965870","apa":"Amani, M. H., Bombari, S., Mondelli, M., Pukdee, R., & Rini, S. (2022). Sharp asymptotics on the compression of two-layer neural networks. IEEE Information Theory Workshop. Mumbai, India: IEEE. https://doi.org/10.1109/ITW54588.2022.9965870","ieee":"M. H. Amani, S. Bombari, M. Mondelli, R. Pukdee, and S. Rini, “Sharp asymptotics on the compression of two-layer neural networks,” IEEE Information Theory Workshop. IEEE, pp. 588–593, 2022.","short":"M.H. Amani, S. Bombari, M. Mondelli, R. Pukdee, S. Rini, IEEE Information Theory Workshop (2022) 588–593.","chicago":"Amani, Mohammad Hossein, Simone Bombari, Marco Mondelli, Rattana Pukdee, and Stefano Rini. “Sharp Asymptotics on the Compression of Two-Layer Neural Networks.” IEEE Information Theory Workshop. IEEE, 2022. https://doi.org/10.1109/ITW54588.2022.9965870.","ista":"Amani MH, Bombari S, Mondelli M, Pukdee R, Rini S. 2022. Sharp asymptotics on the compression of two-layer neural networks. IEEE Information Theory Workshop., 588–593."},"title":"Sharp asymptotics on the compression of two-layer neural networks","author":[{"full_name":"Amani, Mohammad Hossein","last_name":"Amani","first_name":"Mohammad Hossein"},{"full_name":"Bombari, Simone","last_name":"Bombari","id":"ca726dda-de17-11ea-bc14-f9da834f63aa","first_name":"Simone"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"},{"first_name":"Rattana","full_name":"Pukdee, Rattana","last_name":"Pukdee"},{"last_name":"Rini","full_name":"Rini, Stefano","first_name":"Stefano"}],"article_processing_charge":"No","external_id":{"arxiv":["2205.08199"]}},{"acknowledgement":"The authors were partially supported by the 2019 Lopez-Loreta prize, and they would like to thank\r\nQuynh Nguyen, Mahdi Soltanolkotabi and Adel Javanmard for helpful discussions.\r\n","quality_controlled":"1","publisher":"Curran Associates","oa":1,"year":"2022","day":"24","publication":"36th Conference on Neural Information Processing Systems","page":"7628-7640","date_published":"2022-07-24T00:00:00Z","date_created":"2023-02-10T13:46:37Z","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"citation":{"ama":"Bombari S, Amani MH, Mondelli M. Memorization and optimization in deep neural networks with minimum over-parameterization.
In: 36th Conference on Neural Information Processing Systems. Vol 35. Curran Associates; 2022:7628-7640.","apa":"Bombari, S., Amani, M. H., & Mondelli, M. (2022). Memorization and optimization in deep neural networks with minimum over-parameterization. In 36th Conference on Neural Information Processing Systems (Vol. 35, pp. 7628–7640). Curran Associates.","ieee":"S. Bombari, M. H. Amani, and M. Mondelli, “Memorization and optimization in deep neural networks with minimum over-parameterization,” in 36th Conference on Neural Information Processing Systems, 2022, vol. 35, pp. 7628–7640.","short":"S. Bombari, M.H. Amani, M. Mondelli, in:, 36th Conference on Neural Information Processing Systems, Curran Associates, 2022, pp. 7628–7640.","mla":"Bombari, Simone, et al. “Memorization and Optimization in Deep Neural Networks with Minimum Over-Parameterization.” 36th Conference on Neural Information Processing Systems, vol. 35, Curran Associates, 2022, pp. 7628–40.","ista":"Bombari S, Amani MH, Mondelli M. 2022. Memorization and optimization in deep neural networks with minimum over-parameterization. 36th Conference on Neural Information Processing Systems. vol. 35, 7628–7640.","chicago":"Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli. “Memorization and Optimization in Deep Neural Networks with Minimum Over-Parameterization.” In 36th Conference on Neural Information Processing Systems, 35:7628–40. Curran Associates, 2022."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"full_name":"Bombari, Simone","last_name":"Bombari","first_name":"Simone","id":"ca726dda-de17-11ea-bc14-f9da834f63aa"},{"first_name":"Mohammad Hossein","full_name":"Amani, Mohammad Hossein","last_name":"Amani"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"}],"external_id":{"arxiv":["2205.10217"]},"article_processing_charge":"No","title":"Memorization and optimization in deep neural networks with minimum over-parameterization","abstract":[{"lang":"eng","text":"The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide memorization, optimization and generalization guarantees in deep neural networks. A line of work has studied the NTK spectrum for two-layer and deep networks with at least a layer with Ω(N) neurons, N being the number of training samples. Furthermore, there is increasing evidence suggesting that deep networks with sub-linear layer widths are powerful memorizers and optimizers, as long as the number of parameters exceeds the number of samples. Thus, a natural open question is whether the NTK is well conditioned in such a challenging sub-linear setup. In this paper, we answer this question in the affirmative. Our key technical contribution is a lower bound on the smallest NTK eigenvalue for deep networks with the minimum possible over-parameterization: the number of parameters is roughly Ω(N) and, hence, the number of neurons is as little as Ω(√N).
To showcase the applicability of our NTK bounds, we provide two results concerning memorization capacity and optimization guarantees for gradient descent training."}],"oa_version":"Preprint","main_file_link":[{"url":" https://doi.org/10.48550/arXiv.2205.10217","open_access":"1"}],"month":"07","intvolume":" 35","publication_identifier":{"isbn":["9781713871088"]},"publication_status":"published","language":[{"iso":"eng"}],"volume":35,"_id":"12537","type":"conference","status":"public","date_updated":"2023-12-18T11:39:09Z","department":[{"_id":"MaMo"}]},{"file":[{"checksum":"01411ffa76d3e380a0446baeb89b1ef7","file_id":"12481","success":1,"access_level":"open_access","relation":"main_file","content_type":"application/pdf","date_created":"2023-02-02T08:35:52Z","file_name":"2022_JourStatisticalMechanics_Mondelli.pdf","creator":"dernst","date_updated":"2023-02-02T08:35:52Z","file_size":1729997}],"language":[{"iso":"eng"}],"publication_identifier":{"issn":["1742-5468"]},"publication_status":"published","issue":"11","volume":2022,"related_material":{"record":[{"status":"public","id":"10598","relation":"earlier_version"}]},"oa_version":"Published Version","abstract":[{"lang":"eng","text":"We consider the problem of estimating a signal from measurements obtained via a generalized linear model. We focus on estimators based on approximate message passing (AMP), a family of iterative algorithms with many appealing features: the performance of AMP in the high-dimensional limit can be succinctly characterized under suitable model assumptions; AMP can also be tailored to the empirical distribution of the signal entries, and for a wide class of estimation problems, AMP is conjectured to be optimal among all polynomial-time algorithms. However, a major issue of AMP is that in many models (such as phase retrieval), it requires an initialization correlated with the ground-truth signal and independent from the measurement matrix. Assuming that such an initialization is available is typically not realistic. In this paper, we solve this problem by proposing an AMP algorithm initialized with a spectral estimator. With such an initialization, the standard AMP analysis fails since the spectral estimator depends in a complicated way on the design matrix. Our main contribution is a rigorous characterization of the performance of AMP with spectral initialization in the high-dimensional limit. The key technical idea is to define and analyze a two-phase artificial AMP algorithm that first produces the spectral estimator, and then closely approximates the iterates of the true AMP. 
We also provide numerical results that demonstrate the validity of the proposed approach."}],"month":"11","intvolume":" 2022","scopus_import":"1","ddc":["510","530"],"date_updated":"2024-03-07T10:36:52Z","file_date_updated":"2023-02-02T08:35:52Z","department":[{"_id":"MaMo"}],"_id":"12480","status":"public","keyword":["Statistics","Probability and Uncertainty","Statistics and Probability","Statistical and Nonlinear Physics"],"article_type":"original","type":"journal_article","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"day":"24","publication":"Journal of Statistical Mechanics: Theory and Experiment","has_accepted_license":"1","isi":1,"year":"2022","doi":"10.1088/1742-5468/ac9828","date_published":"2022-11-24T00:00:00Z","date_created":"2023-02-02T08:31:57Z","acknowledgement":"The authors would like to thank Andrea Montanari for helpful discussions.\r\nM Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R Venkataramanan was partially supported by the Alan Turing Institute under the EPSRC Grant\r\nEP/N510129/1.","publisher":"IOP Publishing","quality_controlled":"1","oa":1,"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","citation":{"chicago":"Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with Spectral Initialization for Generalized Linear Models.” Journal of Statistical Mechanics: Theory and Experiment. IOP Publishing, 2022. https://doi.org/10.1088/1742-5468/ac9828.","ista":"Mondelli M, Venkataramanan R. 2022. Approximate message passing with spectral initialization for generalized linear models. Journal of Statistical Mechanics: Theory and Experiment. 2022(11), 114003.","mla":"Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with Spectral Initialization for Generalized Linear Models.” Journal of Statistical Mechanics: Theory and Experiment, vol. 2022, no. 11, 114003, IOP Publishing, 2022, doi:10.1088/1742-5468/ac9828.","apa":"Mondelli, M., & Venkataramanan, R. (2022). Approximate message passing with spectral initialization for generalized linear models. Journal of Statistical Mechanics: Theory and Experiment. IOP Publishing. https://doi.org/10.1088/1742-5468/ac9828","ama":"Mondelli M, Venkataramanan R. Approximate message passing with spectral initialization for generalized linear models. Journal of Statistical Mechanics: Theory and Experiment. 2022;2022(11). doi:10.1088/1742-5468/ac9828","short":"M. Mondelli, R. Venkataramanan, Journal of Statistical Mechanics: Theory and Experiment 2022 (2022).","ieee":"M. Mondelli and R. Venkataramanan, “Approximate message passing with spectral initialization for generalized linear models,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2022, no. 11. 
IOP Publishing, 2022."},"title":"Approximate message passing with spectral initialization for generalized linear models","author":[{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli"},{"full_name":"Venkataramanan, Ramji","last_name":"Venkataramanan","first_name":"Ramji"}],"article_processing_charge":"Yes (via OA deal)","external_id":{"isi":["000889589900001"]},"article_number":"114003","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}]},{"volume":139,"publication_status":"published","language":[{"iso":"eng"}],"alternative_title":["Proceedings of Machine Learning Research"],"main_file_link":[{"url":"http://proceedings.mlr.press/v139/nguyen21g.html","open_access":"1"}],"intvolume":" 139","abstract":[{"text":"A recent line of work has analyzed the theoretical properties of deep neural networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue of the NTK has been related to the memorization capacity, the global convergence of gradient descent algorithms and the generalization of deep nets. However, existing results either provide bounds in the two-layer setting or assume that the spectrum of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU nets, both in the limiting case of infinite widths and for finite widths. In the finite-width setting, the network architectures we consider are fairly general: we require the existence of a wide layer with roughly order of $N$ neurons, $N$ being the number of data samples; and the scaling of the remaining layer widths is arbitrary (up to logarithmic factors). To obtain our results, we analyze various quantities of independent interest: we give lower bounds on the smallest singular value of hidden feature matrices, and upper bounds on the Lipschitz constant of input-output feature maps.","lang":"eng"}],"oa_version":"Published Version","department":[{"_id":"MaMo"}],"date_updated":"2022-01-04T09:59:21Z","type":"conference","conference":{"name":"ICML: International Conference on Machine Learning","end_date":"2021-07-24","location":"Virtual","start_date":"2021-07-18"},"status":"public","_id":"10595","page":"8119-8129","date_published":"2021-01-01T00:00:00Z","date_created":"2022-01-03T10:57:49Z","year":"2021","publication":"Proceedings of the 38th International Conference on Machine Learning","quality_controlled":"1","publisher":"ML Research Press","oa":1,"acknowledgement":"The authors would like to thank the anonymous reviewers for their helpful comments. MM was partially supported\r\nby the 2019 Lopez-Loreta Prize. 
QN and GM acknowledge support from the European Research Council (ERC) under\r\nthe European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).","author":[{"first_name":"Quynh","last_name":"Nguyen","full_name":"Nguyen, Quynh"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"},{"last_name":"Montufar","full_name":"Montufar, Guido F","first_name":"Guido F"}],"external_id":{"arxiv":["2012.11654"]},"article_processing_charge":"No","editor":[{"first_name":"Marina","full_name":"Meila, Marina","last_name":"Meila"},{"first_name":"Tong","full_name":"Zhang, Tong","last_name":"Zhang"}],"title":"Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks","citation":{"chicago":"Nguyen, Quynh, Marco Mondelli, and Guido F Montufar. “Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” In Proceedings of the 38th International Conference on Machine Learning, edited by Marina Meila and Tong Zhang, 139:8119–29. ML Research Press, 2021.","ista":"Nguyen Q, Mondelli M, Montufar GF. 2021. Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. Proceedings of the 38th International Conference on Machine Learning. ICML: International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 139, 8119–8129.","mla":"Nguyen, Quynh, et al. “Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” Proceedings of the 38th International Conference on Machine Learning, edited by Marina Meila and Tong Zhang, vol. 139, ML Research Press, 2021, pp. 8119–29.","ama":"Nguyen Q, Mondelli M, Montufar GF. Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In: Meila M, Zhang T, eds. Proceedings of the 38th International Conference on Machine Learning. Vol 139. ML Research Press; 2021:8119-8129.","apa":"Nguyen, Q., Mondelli, M., & Montufar, G. F. (2021). Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks. In M. Meila & T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning (Vol. 139, pp. 8119–8129). Virtual: ML Research Press.","ieee":"Q. Nguyen, M. Mondelli, and G. F. Montufar, “Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks,” in Proceedings of the 38th International Conference on Machine Learning, Virtual, 2021, vol. 139, pp. 8119–8129.","short":"Q. Nguyen, M. Mondelli, G.F. Montufar, in:, M. Meila, T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning, ML Research Press, 2021, pp. 8119–8129."},"user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}]},{"publication_status":"published","publication_identifier":{"issn":["1058-6393"],"isbn":["9781665458283"]},"language":[{"iso":"eng"}],"volume":"2021-October","abstract":[{"lang":"eng","text":"A two-part successive syndrome-check decoding of polar codes is proposed with the first part successively refining the received codeword and the second part checking its syndrome. A new formulation of the successive-cancellation (SC) decoding algorithm is presented that allows for successively refining the received codeword by comparing the log-likelihood ratio value of a frozen bit with its predefined value. 
The syndrome of the refined received codeword is then checked for possible errors. In case there are no errors, the decoding process is terminated. Otherwise, the decoder continues to refine the received codeword. The proposed method is extended to the case of SC list (SCL) decoding by terminating the decoding process when the syndrome of the best candidate in the list indicates no errors. Simulation results show that the proposed method reduces the time-complexity of SC and SCL decoders and their fast variants, especially at high signal-to-noise ratios."}],"oa_version":"Preprint","main_file_link":[{"url":" https://doi.org/10.48550/arXiv.2112.00057","open_access":"1"}],"scopus_import":"1","month":"11","date_updated":"2023-02-13T10:44:16Z","department":[{"_id":"MaMo"}],"_id":"10599","conference":{"start_date":"2021-10-31","end_date":"2021-11-03","location":"Virtual, Pacific Grove, CA, United States","name":"ACSSC: Asilomar Conference on Signals, Systems, and Computers"},"type":"conference","status":"public","year":"2021","publication":"Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers","day":"01","page":"943-947","date_created":"2022-01-03T11:39:51Z","date_published":"2021-11-01T00:00:00Z","doi":"10.1109/IEEECONF53345.2021.9723394","acknowledgement":"This work is supported in part by ONR grant N00014-18-1-2191. S. A. Hashemi was supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC) and by Huawei. M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize.","oa":1,"quality_controlled":"1","publisher":"Institute of Electrical and Electronics Engineers","citation":{"mla":"Hashemi, Seyyed Ali, et al. “Successive Syndrome-Check Decoding of Polar Codes.” Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, vol. 2021–October, Institute of Electrical and Electronics Engineers, 2021, pp. 943–47, doi:10.1109/IEEECONF53345.2021.9723394.","apa":"Hashemi, S. A., Mondelli, M., Cioffi, J., & Goldsmith, A. (2021). Successive syndrome-check decoding of polar codes. In Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers (Vol. 2021–October, pp. 943–947). Virtual, Pacific Grove, CA, United States: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/IEEECONF53345.2021.9723394","ama":"Hashemi SA, Mondelli M, Cioffi J, Goldsmith A. Successive syndrome-check decoding of polar codes. In: Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers. Vol 2021-October. Institute of Electrical and Electronics Engineers; 2021:943-947. doi:10.1109/IEEECONF53345.2021.9723394","ieee":"S. A. Hashemi, M. Mondelli, J. Cioffi, and A. Goldsmith, “Successive syndrome-check decoding of polar codes,” in Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, Virtual, Pacific Grove, CA, United States, 2021, vol. 2021–October, pp. 943–947.","short":"S.A. Hashemi, M. Mondelli, J. Cioffi, A. Goldsmith, in:, Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, Institute of Electrical and Electronics Engineers, 2021, pp. 943–947.","chicago":"Hashemi, Seyyed Ali, Marco Mondelli, John Cioffi, and Andrea Goldsmith. “Successive Syndrome-Check Decoding of Polar Codes.” In Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, 2021–October:943–47. Institute of Electrical and Electronics Engineers, 2021. 
https://doi.org/10.1109/IEEECONF53345.2021.9723394.","ista":"Hashemi SA, Mondelli M, Cioffi J, Goldsmith A. 2021. Successive syndrome-check decoding of polar codes. Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers. ACSSC: Asilomar Conference on Signals, Systems, and Computers vol. 2021–October, 943–947."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","external_id":{"arxiv":["2112.00057"]},"article_processing_charge":"No","author":[{"full_name":"Hashemi, Seyyed Ali","last_name":"Hashemi","first_name":"Seyyed Ali"},{"last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"},{"last_name":"Cioffi","full_name":"Cioffi, John","first_name":"John"},{"last_name":"Goldsmith","full_name":"Goldsmith, Andrea","first_name":"Andrea"}],"title":"Successive syndrome-check decoding of polar codes","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}]},{"acknowledgement":"The authors would like to thank the anonymous reviewers for their helpful comments. MM was partially supported by the 2019 Lopez-Loreta Prize. QN and GM acknowledge support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).","publisher":"ML Research Press","quality_controlled":"1","oa":1,"has_accepted_license":"1","year":"2021","day":"01","publication":"Proceedings of the 38th International Conference on Machine Learning","page":"8119-8129","date_published":"2021-07-01T00:00:00Z","date_created":"2023-06-18T22:00:48Z","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"citation":{"chicago":"Nguyen, Quynh, Marco Mondelli, and Guido Montufar. “Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” In Proceedings of the 38th International Conference on Machine Learning, 139:8119–29. ML Research Press, 2021.","ista":"Nguyen Q, Mondelli M, Montufar G. 2021. Tight bounds on the smallest Eigenvalue of the neural tangent kernel for deep ReLU networks. Proceedings of the 38th International Conference on Machine Learning. International Conference on Machine Learning vol. 139, 8119–8129.","mla":"Nguyen, Quynh, et al. “Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks.” Proceedings of the 38th International Conference on Machine Learning, vol. 139, ML Research Press, 2021, pp. 8119–29.","apa":"Nguyen, Q., Mondelli, M., & Montufar, G. (2021). Tight bounds on the smallest Eigenvalue of the neural tangent kernel for deep ReLU networks. In Proceedings of the 38th International Conference on Machine Learning (Vol. 139, pp. 8119–8129). Virtual: ML Research Press.","ama":"Nguyen Q, Mondelli M, Montufar G. Tight bounds on the smallest Eigenvalue of the neural tangent kernel for deep ReLU networks. In: Proceedings of the 38th International Conference on Machine Learning. Vol 139. ML Research Press; 2021:8119-8129.","short":"Q. Nguyen, M. Mondelli, G. Montufar, in:, Proceedings of the 38th International Conference on Machine Learning, ML Research Press, 2021, pp. 8119–8129.","ieee":"Q. Nguyen, M. Mondelli, and G. Montufar, “Tight bounds on the smallest Eigenvalue of the neural tangent kernel for deep ReLU networks,” in Proceedings of the 38th International Conference on Machine Learning, Virtual, 2021, vol. 139, pp. 
8119–8129."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","author":[{"last_name":"Nguyen","full_name":"Nguyen, Quynh","first_name":"Quynh"},{"full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"},{"last_name":"Montufar","full_name":"Montufar, Guido","first_name":"Guido"}],"external_id":{"arxiv":["2012.11654"]},"article_processing_charge":"No","title":"Tight bounds on the smallest Eigenvalue of the neural tangent kernel for deep ReLU networks","abstract":[{"text":"A recent line of work has analyzed the theoretical properties of deep neural networks via the Neural Tangent Kernel (NTK). In particular, the smallest eigenvalue of the NTK has been related to the memorization capacity, the global convergence of gradient descent algorithms and the generalization of deep nets. However, existing results either provide bounds in the two-layer setting or assume that the spectrum of the NTK matrices is bounded away from 0 for multi-layer networks. In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU nets, both in the limiting case of infinite widths and for finite widths. In the finite-width setting, the network architectures we consider are fairly general: we require the existence of a wide layer with roughly order of N neurons, N being the number of data samples; and the scaling of the remaining layer widths is arbitrary (up to logarithmic factors). To obtain our results, we analyze various quantities of independent interest: we give lower bounds on the smallest singular value of hidden feature matrices, and upper bounds on the Lipschitz constant of input-output feature maps.","lang":"eng"}],"oa_version":"Published Version","scopus_import":"1","month":"07","intvolume":" 139","publication_identifier":{"eissn":["2640-3498"],"isbn":["9781713845065"]},"publication_status":"published","file":[{"file_name":"2021_PMLR_Nguyen.pdf","date_created":"2023-06-19T10:49:12Z","creator":"dernst","file_size":591332,"date_updated":"2023-06-19T10:49:12Z","success":1,"checksum":"19489cf5e16a0596b1f92e317d97c9b0","file_id":"13155","relation":"main_file","access_level":"open_access","content_type":"application/pdf"}],"language":[{"iso":"eng"}],"volume":139,"_id":"13146","type":"conference","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"conference":{"location":"Virtual","end_date":"2021-07-24","start_date":"2021-07-18","name":"International Conference on Machine Learning"},"status":"public","date_updated":"2023-06-19T10:52:51Z","ddc":["000"],"department":[{"_id":"MaMo"}],"file_date_updated":"2023-06-19T10:49:12Z"},{"title":"Sublinear latency for simplified successive cancellation decoding of polar codes","author":[{"orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli","first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425"},{"last_name":"Hashemi","full_name":"Hashemi, Seyyed Ali","first_name":"Seyyed Ali"},{"full_name":"Cioffi, John M.","last_name":"Cioffi","first_name":"John M."},{"full_name":"Goldsmith, Andrea","last_name":"Goldsmith","first_name":"Andrea"}],"article_processing_charge":"No","external_id":{"arxiv":["1909.04892"],"isi":["000607808800002"]},"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","citation":{"ama":"Mondelli M, Hashemi SA, Cioffi JM, Goldsmith A. 
Sublinear latency for simplified successive cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. 2021;20(1):18-27. doi:10.1109/TWC.2020.3022922","apa":"Mondelli, M., Hashemi, S. A., Cioffi, J. M., & Goldsmith, A. (2021). Sublinear latency for simplified successive cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. IEEE. https://doi.org/10.1109/TWC.2020.3022922","ieee":"M. Mondelli, S. A. Hashemi, J. M. Cioffi, and A. Goldsmith, “Sublinear latency for simplified successive cancellation decoding of polar codes,” IEEE Transactions on Wireless Communications, vol. 20, no. 1. IEEE, pp. 18–27, 2021.","short":"M. Mondelli, S.A. Hashemi, J.M. Cioffi, A. Goldsmith, IEEE Transactions on Wireless Communications 20 (2021) 18–27.","mla":"Mondelli, Marco, et al. “Sublinear Latency for Simplified Successive Cancellation Decoding of Polar Codes.” IEEE Transactions on Wireless Communications, vol. 20, no. 1, IEEE, 2021, pp. 18–27, doi:10.1109/TWC.2020.3022922.","ista":"Mondelli M, Hashemi SA, Cioffi JM, Goldsmith A. 2021. Sublinear latency for simplified successive cancellation decoding of polar codes. IEEE Transactions on Wireless Communications. 20(1), 18–27.","chicago":"Mondelli, Marco, Seyyed Ali Hashemi, John M. Cioffi, and Andrea Goldsmith. “Sublinear Latency for Simplified Successive Cancellation Decoding of Polar Codes.” IEEE Transactions on Wireless Communications. IEEE, 2021. https://doi.org/10.1109/TWC.2020.3022922."},"doi":"10.1109/TWC.2020.3022922","date_published":"2021-01-01T00:00:00Z","date_created":"2021-01-31T23:01:21Z","page":"18-27","day":"01","publication":"IEEE Transactions on Wireless Communications","isi":1,"year":"2021","publisher":"IEEE","quality_controlled":"1","oa":1,"acknowledgement":"M. Mondelli was partially supported by grants NSF DMS-1613091, CCF-1714305, IIS-1741162, and ONR N00014-18-1-2729. S. A. Hashemi is supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC) and by Huawei. The authors would like to thank the anonymous reviewers for their comments that helped improve the quality of the manuscript.","department":[{"_id":"MaMo"}],"date_updated":"2023-08-07T13:36:25Z","status":"public","article_type":"original","type":"journal_article","_id":"9047","issue":"1","volume":20,"related_material":{"record":[{"relation":"earlier_version","id":"8536","status":"public"}]},"language":[{"iso":"eng"}],"publication_identifier":{"issn":["15361276"],"eissn":["15582248"]},"publication_status":"published","month":"01","intvolume":" 20","scopus_import":"1","main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/1909.04892"}],"oa_version":"Preprint","abstract":[{"lang":"eng","text":"This work analyzes the latency of the simplified successive cancellation (SSC) decoding scheme for polar codes proposed by Alamdar-Yazdi and Kschischang. It is shown that, unlike conventional successive cancellation decoding, where latency is linear in the block length, the latency of SSC decoding is sublinear. More specifically, the latency of SSC decoding is O(N^(1−1/μ)), where N is the block length and μ is the scaling exponent of the channel, which captures the speed of convergence of the rate to capacity.
Numerical results demonstrate the tightness of the bound and show that most of the latency reduction arises from the parallel decoding of subcodes of rate 0 or 1."}]},{"oa":1,"quality_controlled":"1","publisher":"Institute of Electrical and Electronics Engineers","acknowledgement":"S. A. Hashemi is supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council\r\nof Canada (NSERC) and by Huawei. M. Mondelli is partially supported by the 2019 Lopez-Loreta Prize. A. Fazeli and A. Vardy were supported in part by the National Science Foundation under Grant CCF-1764104.","page":"2369-2374","date_created":"2021-09-27T14:33:14Z","doi":"10.1109/ISIT45174.2021.9518153","date_published":"2021-09-01T00:00:00Z","year":"2021","isi":1,"publication":"2021 IEEE International Symposium on Information Theory","day":"01","project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"external_id":{"isi":["000701502202078"],"arxiv":["2012.13378"]},"article_processing_charge":"No","author":[{"full_name":"Hashemi, Seyyed Ali","last_name":"Hashemi","first_name":"Seyyed Ali"},{"last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425"},{"last_name":"Fazeli","full_name":"Fazeli, Arman","first_name":"Arman"},{"last_name":"Vardy","full_name":"Vardy, Alexander","first_name":"Alexander"},{"first_name":"John","full_name":"Cioffi, John","last_name":"Cioffi"},{"last_name":"Goldsmith","full_name":"Goldsmith, Andrea","first_name":"Andrea"}],"title":"Parallelism versus latency in simplified successive-cancellation decoding of polar codes","citation":{"ama":"Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. Parallelism versus latency in simplified successive-cancellation decoding of polar codes. In: 2021 IEEE International Symposium on Information Theory. Institute of Electrical and Electronics Engineers; 2021:2369-2374. doi:10.1109/ISIT45174.2021.9518153","apa":"Hashemi, S. A., Mondelli, M., Fazeli, A., Vardy, A., Cioffi, J., & Goldsmith, A. (2021). Parallelism versus latency in simplified successive-cancellation decoding of polar codes. In 2021 IEEE International Symposium on Information Theory (pp. 2369–2374). Melbourne, Australia: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ISIT45174.2021.9518153","short":"S.A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, A. Goldsmith, in:, 2021 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2021, pp. 2369–2374.","ieee":"S. A. Hashemi, M. Mondelli, A. Fazeli, A. Vardy, J. Cioffi, and A. Goldsmith, “Parallelism versus latency in simplified successive-cancellation decoding of polar codes,” in 2021 IEEE International Symposium on Information Theory, Melbourne, Australia, 2021, pp. 2369–2374.","mla":"Hashemi, Seyyed Ali, et al. “Parallelism versus Latency in Simplified Successive-Cancellation Decoding of Polar Codes.” 2021 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2021, pp. 2369–74, doi:10.1109/ISIT45174.2021.9518153.","ista":"Hashemi SA, Mondelli M, Fazeli A, Vardy A, Cioffi J, Goldsmith A. 2021. Parallelism versus latency in simplified successive-cancellation decoding of polar codes. 2021 IEEE International Symposium on Information Theory. 
ISIT: International Symposium on Information Theory, 2369–2374.","chicago":"Hashemi, Seyyed Ali, Marco Mondelli, Arman Fazeli, Alexander Vardy, John Cioffi, and Andrea Goldsmith. “Parallelism versus Latency in Simplified Successive-Cancellation Decoding of Polar Codes.” In 2021 IEEE International Symposium on Information Theory, 2369–74. Institute of Electrical and Electronics Engineers, 2021. https://doi.org/10.1109/ISIT45174.2021.9518153."},"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/2012.13378"}],"scopus_import":"1","month":"09","abstract":[{"text":"This paper characterizes the latency of the simplified successive-cancellation (SSC) decoding scheme for polar codes under hardware resource constraints. In particular, when the number of processing elements P that can perform SSC decoding operations in parallel is limited, as is the case in practice, the latency of SSC decoding is O(N^(1−1/μ) + (N/P) log2 log2(N/P)), where N is the block length of the code and μ is the scaling exponent of polar codes for the channel. Three direct consequences of this bound are presented. First, in a fully-parallel implementation where P = N/2, the latency of SSC decoding is O(N^(1−1/μ)), which is sublinear in the block length. This recovers a result from an earlier work. Second, in a fully-serial implementation where P = 1, the latency of SSC decoding scales as O(N log2 log2 N). The multiplicative constant is also calculated: we show that the latency of SSC decoding when P = 1 is given by (2+o(1)) N log2 log2 N. Third, in a semi-parallel implementation, the smallest P that gives the same latency as that of the fully-parallel implementation is P = N^(1/μ). The tightness of our bound on SSC decoding latency and the applicability of the foregoing results is validated through extensive simulations.","lang":"eng"}],"oa_version":"Preprint","related_material":{"record":[{"id":"10364","status":"public","relation":"later_version"}]},"publication_status":"published","publication_identifier":{"eisbn":["978-1-5386-8209-8"],"isbn":["978-1-5386-8210-4"],"issn":["2157-8095"]},"language":[{"iso":"eng"}],"conference":{"end_date":"2021-07-20","location":"Melbourne, Australia","start_date":"2021-07-12","name":"ISIT: International Symposium on Information Theory"},"type":"conference","status":"public","_id":"10053","department":[{"_id":"MaMo"}],"date_updated":"2023-08-14T06:55:58Z"},{"date_created":"2022-01-03T11:31:26Z","date_published":"2021-09-01T00:00:00Z","doi":"10.1109/isit45174.2021.9517887","page":"1082-1087","publication":"2021 IEEE International Symposium on Information Theory","day":"01","year":"2021","isi":1,"oa":1,"quality_controlled":"1","publisher":"Institute of Electrical and Electronics Engineers","title":"Sparse multi-decoder recursive projection aggregation for Reed-Muller codes","article_processing_charge":"No","external_id":{"isi":["000701502201029"],"arxiv":["2011.12882"]},"author":[{"first_name":"Dorsa","last_name":"Fathollahi","full_name":"Fathollahi, Dorsa"},{"full_name":"Farsad, Nariman","last_name":"Farsad","first_name":"Nariman"},{"first_name":"Seyyed Ali","last_name":"Hashemi","full_name":"Hashemi, Seyyed Ali"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli"}],"user_id":"4359f0d1-fa6c-11eb-b949-802e58b17ae8","citation":{"mla":"Fathollahi, Dorsa, et al. 
“Sparse Multi-Decoder Recursive Projection Aggregation for Reed-Muller Codes.” 2021 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2021, pp. 1082–87, doi:10.1109/isit45174.2021.9517887.","ieee":"D. Fathollahi, N. Farsad, S. A. Hashemi, and M. Mondelli, “Sparse multi-decoder recursive projection aggregation for Reed-Muller codes,” in 2021 IEEE International Symposium on Information Theory, Virtual, Melbourne, Australia, 2021, pp. 1082–1087.","short":"D. Fathollahi, N. Farsad, S.A. Hashemi, M. Mondelli, in:, 2021 IEEE International Symposium on Information Theory, Institute of Electrical and Electronics Engineers, 2021, pp. 1082–1087.","apa":"Fathollahi, D., Farsad, N., Hashemi, S. A., & Mondelli, M. (2021). Sparse multi-decoder recursive projection aggregation for Reed-Muller codes. In 2021 IEEE International Symposium on Information Theory (pp. 1082–1087). Virtual, Melbourne, Australia: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/isit45174.2021.9517887","ama":"Fathollahi D, Farsad N, Hashemi SA, Mondelli M. Sparse multi-decoder recursive projection aggregation for Reed-Muller codes. In: 2021 IEEE International Symposium on Information Theory. Institute of Electrical and Electronics Engineers; 2021:1082-1087. doi:10.1109/isit45174.2021.9517887","chicago":"Fathollahi, Dorsa, Nariman Farsad, Seyyed Ali Hashemi, and Marco Mondelli. “Sparse Multi-Decoder Recursive Projection Aggregation for Reed-Muller Codes.” In 2021 IEEE International Symposium on Information Theory, 1082–87. Institute of Electrical and Electronics Engineers, 2021. https://doi.org/10.1109/isit45174.2021.9517887.","ista":"Fathollahi D, Farsad N, Hashemi SA, Mondelli M. 2021. Sparse multi-decoder recursive projection aggregation for Reed-Muller codes. 2021 IEEE International Symposium on Information Theory. ISIT: International Symposium on Information Theory, 1082–1087."},"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"eisbn":["978-1-5386-8209-8"],"isbn":["978-1-5386-8210-4"]},"month":"09","main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/2011.12882"}],"scopus_import":"1","oa_version":"Preprint","acknowledgement":"We thank Emmanuel Abbe and Min Ye for providing us with the implementation of RPA decoding. D. Fathollahi and M. Mondelli are partially supported by the 2019 Lopez-Loreta Prize. N. Farsad is supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Canada Foundation for Innovation (CFI), John R. Evans Leaders Fund. S. A. Hashemi is supported by a Postdoctoral Fellowship from NSERC.","department":[{"_id":"MaMo"}],"date_updated":"2023-08-17T06:32:06Z","status":"public","conference":{"name":"ISIT: International Symposium on Information Theory","start_date":"2021-07-12","end_date":"2021-07-20","location":"Virtual, Melbourne, Australia"},"type":"conference","_id":"10597"},{"oa_version":"Published Version","abstract":[{"text":"We study the problem of recovering an unknown signal x given measurements obtained from a generalized linear model with a Gaussian sensing matrix. Two popular solutions are based on a linear estimator x^L and a spectral estimator x^s. The former is a data-dependent linear combination of the columns of the measurement matrix, and its analysis is quite simple. 
The latter is the principal eigenvector of a data-dependent matrix, and a recent line of work has studied its performance. In this paper, we show how to optimally combine x^L and x^s. At the heart of our analysis is the exact characterization of the empirical joint distribution of (x, x^L, x^s) in the high-dimensional limit. This allows us to compute the Bayes-optimal combination of x^L and x^s, given the limiting distribution of the signal x. When the distribution of the signal is Gaussian, then the Bayes-optimal combination has the form θx^L + x^s and we derive the optimal combination coefficient. In order to establish the limiting distribution of (x, x^L, x^s), we design and analyze an approximate message passing algorithm whose iterates give x^L and approach x^s. Numerical simulations demonstrate the improvement of the proposed combination with respect to the two methods considered separately.","lang":"eng"}],"month":"08","scopus_import":"1","language":[{"iso":"eng"}],"file":[{"file_name":"2021_Springer_Mondelli.pdf","date_created":"2021-12-13T15:47:54Z","file_size":2305731,"date_updated":"2021-12-13T15:47:54Z","creator":"alisjak","success":1,"checksum":"9ea12dd8045a0678000a3a59295221cb","file_id":"10542","content_type":"application/pdf","relation":"main_file","access_level":"open_access"}],"publication_status":"published","publication_identifier":{"issn":["1615-3375"],"eissn":["1615-3383"]},"_id":"10211","keyword":["Applied Mathematics","Computational Theory and Mathematics","Computational Mathematics","Analysis"],"status":"public","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"type":"journal_article","article_type":"original","ddc":["510"],"date_updated":"2023-09-05T14:13:57Z","file_date_updated":"2021-12-13T15:47:54Z","department":[{"_id":"MaMo"}],"acknowledgement":"M. Mondelli would like to thank Andrea Montanari for helpful discussions. All the authors would like to thank the anonymous reviewers for their helpful comments.","oa":1,"publisher":"Springer","quality_controlled":"1","publication":"Foundations of Computational Mathematics","day":"17","year":"2021","isi":1,"has_accepted_license":"1","date_created":"2021-11-03T10:59:08Z","doi":"10.1007/s10208-021-09531-x","date_published":"2021-08-17T00:00:00Z","project":[{"_id":"B67AFEDC-15C9-11EA-A837-991A96BB2854","name":"IST Austria Open Access Fund"}],"user_id":"c635000d-4b10-11ee-a964-aac5a93f6ac1","citation":{"ieee":"M. Mondelli, C. Thrampoulidis, and R. Venkataramanan, “Optimal combination of linear and spectral estimators for generalized linear models,” Foundations of Computational Mathematics. Springer, 2021.","short":"M. Mondelli, C. Thrampoulidis, R. Venkataramanan, Foundations of Computational Mathematics (2021).","ama":"Mondelli M, Thrampoulidis C, Venkataramanan R. Optimal combination of linear and spectral estimators for generalized linear models. Foundations of Computational Mathematics. 2021. doi:10.1007/s10208-021-09531-x","apa":"Mondelli, M., Thrampoulidis, C., & Venkataramanan, R. (2021). Optimal combination of linear and spectral estimators for generalized linear models. Foundations of Computational Mathematics. Springer. https://doi.org/10.1007/s10208-021-09531-x","mla":"Mondelli, Marco, et al. 
“Optimal Combination of Linear and Spectral Estimators for Generalized Linear Models.” Foundations of Computational Mathematics, Springer, 2021, doi:10.1007/s10208-021-09531-x.","ista":"Mondelli M, Thrampoulidis C, Venkataramanan R. 2021. Optimal combination of linear and spectral estimators for generalized linear models. Foundations of Computational Mathematics.","chicago":"Mondelli, Marco, Christos Thrampoulidis, and Ramji Venkataramanan. “Optimal Combination of Linear and Spectral Estimators for Generalized Linear Models.” Foundations of Computational Mathematics. Springer, 2021. https://doi.org/10.1007/s10208-021-09531-x."},"title":"Optimal combination of linear and spectral estimators for generalized linear models","external_id":{"arxiv":["2008.03326"],"isi":["000685721000001"]},"article_processing_charge":"Yes (via OA deal)","author":[{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"},{"first_name":"Christos","full_name":"Thrampoulidis, Christos","last_name":"Thrampoulidis"},{"first_name":"Ramji","full_name":"Venkataramanan, Ramji","last_name":"Venkataramanan"}]},{"_id":"10593","status":"public","type":"conference","conference":{"start_date":"2021-12-06","end_date":"2021-12-14","location":"Virtual","name":"NeurIPS: Neural Information Processing Systems"},"date_updated":"2023-10-17T11:48:23Z","department":[{"_id":"MaMo"}],"oa_version":"Preprint","abstract":[{"lang":"eng","text":"We study the problem of estimating a rank-1 signal in the presence of rotationally invariant noise, a class of perturbations more general than Gaussian noise. Principal Component Analysis (PCA) provides a natural estimator, and sharp results on its performance have been obtained in the high-dimensional regime. Recently, an Approximate Message Passing (AMP) algorithm has been proposed as an alternative estimator with the potential to improve the accuracy of PCA. However, the existing analysis of AMP requires an initialization that is both correlated with the signal and independent of the noise, which is often unrealistic in practice. In this work, we combine the two methods, and propose to initialize AMP with PCA. Our main result is a rigorous asymptotic characterization of the performance of this estimator. Both the AMP algorithm and its analysis differ from those previously derived in the Gaussian setting: at every iteration, our AMP algorithm requires a specific term to account for PCA initialization, while in the Gaussian case, PCA initialization affects only the first iteration of AMP. The proof is based on a two-phase artificial AMP that first approximates the PCA estimator and then mimics the true AMP. Our numerical simulations show an excellent agreement between AMP results and theoretical predictions, and suggest an interesting open direction on achieving Bayes-optimal performance."}],"month":"12","intvolume":" 35","scopus_import":"1","main_file_link":[{"url":"https://arxiv.org/abs/2106.02356","open_access":"1"}],"language":[{"iso":"eng"}],"publication_identifier":{"isbn":["9781713845393"],"issn":["1049-5258"]},"publication_status":"published","volume":35,"project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"chicago":"Mondelli, Marco, and Ramji Venkataramanan. 
“PCA Initialization for Approximate Message Passing in Rotationally Invariant Models.” In 35th Conference on Neural Information Processing Systems, 35:29616–29. Neural Information Processing Systems Foundation, 2021.","ista":"Mondelli M, Venkataramanan R. 2021. PCA initialization for approximate message passing in rotationally invariant models. 35th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems vol. 35, 29616–29629.","mla":"Mondelli, Marco, and Ramji Venkataramanan. “PCA Initialization for Approximate Message Passing in Rotationally Invariant Models.” 35th Conference on Neural Information Processing Systems, vol. 35, Neural Information Processing Systems Foundation, 2021, pp. 29616–29.","ieee":"M. Mondelli and R. Venkataramanan, “PCA initialization for approximate message passing in rotationally invariant models,” in 35th Conference on Neural Information Processing Systems, Virtual, 2021, vol. 35, pp. 29616–29629.","short":"M. Mondelli, R. Venkataramanan, in:, 35th Conference on Neural Information Processing Systems, Neural Information Processing Systems Foundation, 2021, pp. 29616–29629.","apa":"Mondelli, M., & Venkataramanan, R. (2021). PCA initialization for approximate message passing in rotationally invariant models. In 35th Conference on Neural Information Processing Systems (Vol. 35, pp. 29616–29629). Virtual: Neural Information Processing Systems Foundation.","ama":"Mondelli M, Venkataramanan R. PCA initialization for approximate message passing in rotationally invariant models. In: 35th Conference on Neural Information Processing Systems. Vol 35. Neural Information Processing Systems Foundation; 2021:29616-29629."},"title":"PCA initialization for approximate message passing in rotationally invariant models","author":[{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli"},{"first_name":"Ramji","full_name":"Venkataramanan, Ramji","last_name":"Venkataramanan"}],"external_id":{"arxiv":["2106.02356"]},"article_processing_charge":"No","acknowledgement":"M. Mondelli would like to thank László Erdős for helpful discussions. M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R. Venkataramanan was partially supported by the Alan Turing Institute under the EPSRC grant EP/N510129/1.\r\n","publisher":"Neural Information Processing Systems Foundation","quality_controlled":"1","oa":1,"day":"01","publication":"35th Conference on Neural Information Processing Systems","year":"2021","date_published":"2021-12-01T00:00:00Z","date_created":"2022-01-03T10:50:02Z","page":"29616-29629"},{"day":"01","publication":"35th Conference on Neural Information Processing Systems","year":"2021","date_published":"2021-12-01T00:00:00Z","date_created":"2022-01-03T10:56:20Z","acknowledgement":"MM was partially supported by the 2019 Lopez-Loreta Prize. QN and PB acknowledge support from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no 757983).","publisher":"Neural Information Processing Systems Foundation","quality_controlled":"1","oa":1,"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"mla":"Nguyen, Quynh, et al. “When Are Solutions Connected in Deep Networks?” 35th Conference on Neural Information Processing Systems, vol. 35, Neural Information Processing Systems Foundation, 2021.","short":"Q. Nguyen, P. Bréchet, M. 
Mondelli, in:, 35th Conference on Neural Information Processing Systems, Neural Information Processing Systems Foundation, 2021.","ieee":"Q. Nguyen, P. Bréchet, and M. Mondelli, “When are solutions connected in deep networks?,” in 35th Conference on Neural Information Processing Systems, Virtual, 2021, vol. 35.","ama":"Nguyen Q, Bréchet P, Mondelli M. When are solutions connected in deep networks? In: 35th Conference on Neural Information Processing Systems. Vol 35. Neural Information Processing Systems Foundation; 2021.","apa":"Nguyen, Q., Bréchet, P., & Mondelli, M. (2021). When are solutions connected in deep networks? In 35th Conference on Neural Information Processing Systems (Vol. 35). Virtual: Neural Information Processing Systems Foundation.","chicago":"Nguyen, Quynh, Pierre Bréchet, and Marco Mondelli. “When Are Solutions Connected in Deep Networks?” In 35th Conference on Neural Information Processing Systems, Vol. 35. Neural Information Processing Systems Foundation, 2021.","ista":"Nguyen Q, Bréchet P, Mondelli M. 2021. When are solutions connected in deep networks? 35th Conference on Neural Information Processing Systems. 35th Conference on Neural Information Processing Systems vol. 35."},"title":"When are solutions connected in deep networks?","author":[{"full_name":"Nguyen, Quynh","last_name":"Nguyen","first_name":"Quynh"},{"first_name":"Pierre","full_name":"Bréchet, Pierre","last_name":"Bréchet"},{"orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"}],"article_processing_charge":"No","external_id":{"arxiv":["2102.09671"]},"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}],"language":[{"iso":"eng"}],"publication_identifier":{"isbn":["9781713845393"],"issn":["1049-5258"]},"publication_status":"published","volume":35,"oa_version":"Preprint","abstract":[{"lang":"eng","text":"The question of how and why the phenomenon of mode connectivity occurs in training deep neural networks has gained remarkable attention in the research community. From a theoretical perspective, two possible explanations have been proposed: (i) the loss function has connected sublevel sets, and (ii) the solutions found by stochastic gradient descent are dropout stable. While these explanations provide insights into the phenomenon, their assumptions are not always satisfied in practice. In particular, the first approach requires the network to have one layer with order of N neurons (N being the number of training samples), while the second one requires the loss to be almost invariant after removing half of the neurons at each layer (up to some rescaling of the remaining ones). In this work, we improve both conditions by exploiting the quality of the features at every intermediate layer together with a milder over-parameterization condition. More specifically, we show that: (i) under generic assumptions on the features of intermediate layers, it suffices that the last two hidden layers have order of √N neurons, and (ii) if subsets of features at each layer are linearly separable, then no over-parameterization is needed to show the connectivity. 
Our experiments confirm that the proposed condition ensures the connectivity of solutions found by stochastic gradient descent, even in settings where the previous requirements do not hold."}],"month":"12","intvolume":" 35","main_file_link":[{"url":"https://arxiv.org/abs/2102.09671","open_access":"1"}],"date_updated":"2023-10-17T11:48:40Z","department":[{"_id":"MaMo"}],"_id":"10594","status":"public","type":"conference","conference":{"name":"35th Conference on Neural Information Processing Systems","end_date":"2021-12-14","location":"Virtual","start_date":"2021-12-06"}},{"volume":130,"related_material":{"record":[{"relation":"later_version","id":"12480","status":"public"}]},"language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"issn":["2640-3498"]},"intvolume":" 130","month":"04","main_file_link":[{"url":"https://proceedings.mlr.press/v130/mondelli21a.html","open_access":"1"}],"scopus_import":"1","alternative_title":["Proceedings of Machine Learning Research"],"oa_version":"Preprint","abstract":[{"lang":"eng","text":" We consider the problem of estimating a signal from measurements obtained via a generalized linear model. We focus on estimators based on approximate message passing (AMP), a family of iterative algorithms with many appealing features: the performance of AMP in the high-dimensional limit can be succinctly characterized under suitable model assumptions; AMP can also be tailored to the empirical distribution of the signal entries, and for a wide class of estimation problems, AMP is conjectured to be optimal among all polynomial-time algorithms. However, a major issue of AMP is that in many models (such as phase retrieval), it requires an initialization correlated with the ground-truth signal and independent from the measurement matrix. Assuming that such an initialization is available is typically not realistic. In this paper, we solve this problem by proposing an AMP algorithm initialized with a spectral estimator. With such an initialization, the standard AMP analysis fails since the spectral estimator depends in a complicated way on the design matrix. Our main contribution is a rigorous characterization of the performance of AMP with spectral initialization in the high-dimensional limit. The key technical idea is to define and analyze a two-phase artificial AMP algorithm that first produces the spectral estimator, and then closely approximates the iterates of the true AMP. We also provide numerical results that demonstrate the validity of the proposed approach. "}],"department":[{"_id":"MaMo"}],"date_updated":"2024-03-07T10:36:53Z","status":"public","conference":{"name":"AISTATS: Artificial Intelligence and Statistics","start_date":"2021-04-13","end_date":"2021-04-15","location":"Virtual, San Diego, CA, United States"},"type":"conference","_id":"10598","date_created":"2022-01-03T11:34:22Z","date_published":"2021-04-01T00:00:00Z","page":"397-405","publication":"Proceedings of The 24th International Conference on Artificial Intelligence and Statistics","day":"01","year":"2021","oa":1,"quality_controlled":"1","publisher":"ML Research Press","acknowledgement":"The authors would like to thank Andrea Montanari for helpful discussions. M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. R. 
Venkataramanan was partially supported by the Alan Turing Institute under the EPSRC grant EP/N510129/1.","title":"Approximate message passing with spectral initialization for generalized linear models","editor":[{"first_name":"Arindam","full_name":"Banerjee, Arindam","last_name":"Banerjee"},{"last_name":"Fukumizu","full_name":"Fukumizu, Kenji","first_name":"Kenji"}],"article_processing_charge":"Yes (via OA deal)","external_id":{"arxiv":["2010.03460"]},"author":[{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"},{"first_name":"Ramji","full_name":"Venkataramanan, Ramji","last_name":"Venkataramanan"}],"user_id":"3E5EF7F0-F248-11E8-B48F-1D18A9856A87","citation":{"mla":"Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with Spectral Initialization for Generalized Linear Models.” Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, edited by Arindam Banerjee and Kenji Fukumizu, vol. 130, ML Research Press, 2021, pp. 397–405.","ama":"Mondelli M, Venkataramanan R. Approximate message passing with spectral initialization for generalized linear models. In: Banerjee A, Fukumizu K, eds. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics. Vol 130. ML Research Press; 2021:397-405.","apa":"Mondelli, M., & Venkataramanan, R. (2021). Approximate message passing with spectral initialization for generalized linear models. In A. Banerjee & K. Fukumizu (Eds.), Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (Vol. 130, pp. 397–405). Virtual, San Diego, CA, United States: ML Research Press.","short":"M. Mondelli, R. Venkataramanan, in:, A. Banerjee, K. Fukumizu (Eds.), Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, ML Research Press, 2021, pp. 397–405.","ieee":"M. Mondelli and R. Venkataramanan, “Approximate message passing with spectral initialization for generalized linear models,” in Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, Virtual, San Diego, CA, United States, 2021, vol. 130, pp. 397–405.","chicago":"Mondelli, Marco, and Ramji Venkataramanan. “Approximate Message Passing with Spectral Initialization for Generalized Linear Models.” In Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, edited by Arindam Banerjee and Kenji Fukumizu, 130:397–405. ML Research Press, 2021.","ista":"Mondelli M, Venkataramanan R. 2021. Approximate message passing with spectral initialization for generalized linear models. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics. AISTATS: Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, vol. 
130, 397–405."},"project":[{"name":"Prix Lopez-Loretta 2019 - Marco Mondelli","_id":"059876FA-7A3F-11EA-A408-12923DDC885E"}]},{"department":[{"_id":"MaMo"}],"date_updated":"2024-03-07T12:18:50Z","type":"journal_article","article_type":"original","status":"public","_id":"9002","volume":67,"issue":"9","related_material":{"record":[{"status":"public","id":"6665","relation":"earlier_version"}]},"publication_identifier":{"issn":["0018-9448"],"eissn":["1557-9654"]},"publication_status":"published","language":[{"iso":"eng"}],"scopus_import":"1","month":"09","intvolume":" 67","abstract":[{"text":" We prove that, for the binary erasure channel (BEC), the polar-coding paradigm gives rise to codes that not only approach the Shannon limit but do so under the best possible scaling of their block length as a function of the gap to capacity. This result exhibits the first known family of binary codes that attain both optimal scaling and quasi-linear complexity of encoding and decoding. Our proof is based on the construction and analysis of binary polar codes with large kernels. When communicating reliably at rates within ε>0 of capacity, the code length n often scales as O(1/ε^μ), where the constant μ is called the scaling exponent. It is known that the optimal scaling exponent is μ=2, and it is achieved by random linear codes. The scaling exponent of conventional polar codes (based on the 2×2 kernel) on the BEC is μ=3.63. This falls far short of the optimal scaling guaranteed by random codes. Our main contribution is a rigorous proof of the following result: for the BEC, there exist ℓ×ℓ binary kernels, such that polar codes constructed from these kernels achieve scaling exponent μ(ℓ) that tends to the optimal value of 2 as ℓ grows. We furthermore characterize precisely how large ℓ needs to be as a function of the gap between μ(ℓ) and 2. The resulting binary codes maintain the recursive structure of conventional polar codes, and thereby achieve construction complexity O(n) and encoding/decoding complexity O(n log n).","lang":"eng"}],"oa_version":"Preprint","author":[{"last_name":"Fazeli","full_name":"Fazeli, Arman","first_name":"Arman"},{"last_name":"Hassani","full_name":"Hassani, Hamed","first_name":"Hamed"},{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"},{"last_name":"Vardy","full_name":"Vardy, Alexander","first_name":"Alexander"}],"article_processing_charge":"No","external_id":{"arxiv":["1711.01339"]},"title":"Binary linear codes with optimal scaling: Polar codes with large kernels","citation":{"mla":"Fazeli, Arman, et al. “Binary Linear Codes with Optimal Scaling: Polar Codes with Large Kernels.” IEEE Transactions on Information Theory, vol. 67, no. 9, IEEE, 2021, pp. 5693–710, doi:10.1109/TIT.2020.3038806.","short":"A. Fazeli, H. Hassani, M. Mondelli, A. Vardy, IEEE Transactions on Information Theory 67 (2021) 5693–5710.","ieee":"A. Fazeli, H. Hassani, M. Mondelli, and A. Vardy, “Binary linear codes with optimal scaling: Polar codes with large kernels,” IEEE Transactions on Information Theory, vol. 67, no. 9. IEEE, pp. 5693–5710, 2021.","apa":"Fazeli, A., Hassani, H., Mondelli, M., & Vardy, A. (2021). Binary linear codes with optimal scaling: Polar codes with large kernels. IEEE Transactions on Information Theory. IEEE. https://doi.org/10.1109/TIT.2020.3038806","ama":"Fazeli A, Hassani H, Mondelli M, Vardy A. Binary linear codes with optimal scaling: Polar codes with large kernels. 
IEEE Transactions on Information Theory. 2021;67(9):5693-5710. doi:10.1109/TIT.2020.3038806","chicago":"Fazeli, Arman, Hamed Hassani, Marco Mondelli, and Alexander Vardy. “Binary Linear Codes with Optimal Scaling: Polar Codes with Large Kernels.” IEEE Transactions on Information Theory. IEEE, 2021. https://doi.org/10.1109/TIT.2020.3038806.","ista":"Fazeli A, Hassani H, Mondelli M, Vardy A. 2021. Binary linear codes with optimal scaling: Polar codes with large kernels. IEEE Transactions on Information Theory. 67(9), 5693–5710."},"user_id":"3E5EF7F0-F248-11E8-B48F-1D18A9856A87","page":"5693-5710","date_published":"2021-09-01T00:00:00Z","doi":"10.1109/TIT.2020.3038806","date_created":"2021-01-10T23:01:18Z","year":"2021","day":"01","publication":"IEEE Transactions on Information Theory","quality_controlled":"1","publisher":"IEEE"},{"publication_status":"published","language":[{"iso":"eng"}],"volume":33,"abstract":[{"lang":"eng","text":"Recent works have shown that gradient descent can find a global minimum for over-parameterized neural networks where the widths of all the hidden layers scale polynomially with N (N being the number of training samples). In this paper, we prove that, for deep networks, a single layer of width N following the input layer suffices to ensure a similar guarantee. In particular, all the remaining layers are allowed to have constant widths, and form a pyramidal topology. We show an application of our result to the widely used LeCun’s initialization and obtain an over-parameterization requirement for the single wide layer of order N^2.\r\n"}],"oa_version":"Preprint","main_file_link":[{"url":"https://arxiv.org/abs/2002.07867","open_access":"1"}],"month":"07","intvolume":" 33","date_updated":"2022-01-04T09:24:41Z","department":[{"_id":"MaMo"}],"_id":"9221","type":"conference","conference":{"start_date":"2020-12-06","end_date":"2020-12-12","location":"Vancouver, Canada","name":"NeurIPS: Neural Information Processing Systems"},"status":"public","year":"2020","day":"07","publication":"34th Conference on Neural Information Processing Systems","page":"11961–11972","date_published":"2020-07-07T00:00:00Z","date_created":"2021-03-03T12:06:02Z","acknowledgement":"The authors would like to thank Jan Maas, Mahdi Soltanolkotabi, and Daniel Soudry for the helpful discussions, Marius Kloft, Matthias Hein and Quoc Dinh Tran for proofreading portions of a prior version of this paper, and James Martens for a clarification concerning LeCun’s initialization. M. Mondelli was partially supported by the 2019 Lopez-Loreta Prize. Q. Nguyen was partially supported by the German Research Foundation (DFG) award KL 2698/2-1.","quality_controlled":"1","publisher":"Curran Associates","oa":1,"citation":{"mla":"Nguyen, Quynh, and Marco Mondelli. “Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology.” 34th Conference on Neural Information Processing Systems, vol. 33, Curran Associates, 2020, pp. 11961–11972.","ama":"Nguyen Q, Mondelli M. Global convergence of deep networks with one wide layer followed by pyramidal topology. In: 34th Conference on Neural Information Processing Systems. Vol 33. Curran Associates; 2020:11961–11972.","apa":"Nguyen, Q., & Mondelli, M. (2020). Global convergence of deep networks with one wide layer followed by pyramidal topology. In 34th Conference on Neural Information Processing Systems (Vol. 33, pp. 11961–11972). Vancouver, Canada: Curran Associates.","short":"Q. Nguyen, M. 
Mondelli, in:, 34th Conference on Neural Information Processing Systems, Curran Associates, 2020, pp. 11961–11972.","ieee":"Q. Nguyen and M. Mondelli, “Global convergence of deep networks with one wide layer followed by pyramidal topology,” in 34th Conference on Neural Information Processing Systems, Vancouver, Canada, 2020, vol. 33, pp. 11961–11972.","chicago":"Nguyen, Quynh, and Marco Mondelli. “Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology.” In 34th Conference on Neural Information Processing Systems, 33:11961–11972. Curran Associates, 2020.","ista":"Nguyen Q, Mondelli M. 2020. Global convergence of deep networks with one wide layer followed by pyramidal topology. 34th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems vol. 33, 11961–11972."},"user_id":"8b945eb4-e2f2-11eb-945a-df72226e66a9","author":[{"first_name":"Quynh","full_name":"Nguyen, Quynh","last_name":"Nguyen"},{"id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco","last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020"}],"external_id":{"arxiv":["2002.07867"]},"article_processing_charge":"No","title":"Global convergence of deep networks with one wide layer followed by pyramidal topology","project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}]},{"department":[{"_id":"MaMo"}],"date_updated":"2023-08-07T13:36:24Z","status":"public","type":"conference","conference":{"name":"ISIT: International Symposium on Information Theory","start_date":"2020-06-21","end_date":"2020-06-26","location":"Los Angeles, CA, United States"},"_id":"8536","volume":"2020-June","related_material":{"record":[{"relation":"later_version","id":"9047","status":"public"}]},"language":[{"iso":"eng"}],"publication_identifier":{"isbn":["9781728164328"],"issn":["21578095"]},"publication_status":"published","month":"06","scopus_import":"1","main_file_link":[{"url":"https://arxiv.org/abs/1909.04892","open_access":"1"}],"oa_version":"Preprint","abstract":[{"text":"This work analyzes the latency of the simplified successive cancellation (SSC) decoding scheme for polar codes proposed by Alamdar-Yazdi and Kschischang. It is shown that, unlike conventional successive cancellation decoding, where latency is linear in the block length, the latency of SSC decoding is sublinear. More specifically, the latency of SSC decoding is O(N^(1−1/µ)), where N is the block length and µ is the scaling exponent of the channel, which captures the speed of convergence of the rate to capacity. Numerical results demonstrate the tightness of the bound and show that most of the latency reduction arises from the parallel decoding of subcodes of rate 0 and 1.","lang":"eng"}],"title":"Simplified successive cancellation decoding of polar codes has sublinear latency","author":[{"first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425","last_name":"Mondelli","orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco"},{"last_name":"Hashemi","full_name":"Hashemi, Seyyed Ali","first_name":"Seyyed Ali"},{"first_name":"John","last_name":"Cioffi","full_name":"Cioffi, John"},{"last_name":"Goldsmith","full_name":"Goldsmith, Andrea","first_name":"Andrea"}],"external_id":{"arxiv":["1909.04892"]},"article_processing_charge":"No","user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"chicago":"Mondelli, Marco, Seyyed Ali Hashemi, John Cioffi, and Andrea Goldsmith. 
“Simplified Successive Cancellation Decoding of Polar Codes Has Sublinear Latency.” In IEEE International Symposium on Information Theory - Proceedings, Vol. 2020–June. IEEE, 2020. https://doi.org/10.1109/ISIT44484.2020.9174141.","ista":"Mondelli M, Hashemi SA, Cioffi J, Goldsmith A. 2020. Simplified successive cancellation decoding of polar codes has sublinear latency. IEEE International Symposium on Information Theory - Proceedings. ISIT: International Symposium on Information Theory vol. 2020–June, 401–406.","mla":"Mondelli, Marco, et al. “Simplified Successive Cancellation Decoding of Polar Codes Has Sublinear Latency.” IEEE International Symposium on Information Theory - Proceedings, vol. 2020–June, 401–406, IEEE, 2020, doi:10.1109/ISIT44484.2020.9174141.","ieee":"M. Mondelli, S. A. Hashemi, J. Cioffi, and A. Goldsmith, “Simplified successive cancellation decoding of polar codes has sublinear latency,” in IEEE International Symposium on Information Theory - Proceedings, Los Angeles, CA, United States, 2020, vol. 2020–June.","short":"M. Mondelli, S.A. Hashemi, J. Cioffi, A. Goldsmith, in:, IEEE International Symposium on Information Theory - Proceedings, IEEE, 2020.","apa":"Mondelli, M., Hashemi, S. A., Cioffi, J., & Goldsmith, A. (2020). Simplified successive cancellation decoding of polar codes has sublinear latency. In IEEE International Symposium on Information Theory - Proceedings (Vol. 2020–June). Los Angeles, CA, United States: IEEE. https://doi.org/10.1109/ISIT44484.2020.9174141","ama":"Mondelli M, Hashemi SA, Cioffi J, Goldsmith A. Simplified successive cancellation decoding of polar codes has sublinear latency. In: IEEE International Symposium on Information Theory - Proceedings. Vol 2020-June. IEEE; 2020. doi:10.1109/ISIT44484.2020.9174141"},"article_number":"401-406","doi":"10.1109/ISIT44484.2020.9174141","date_published":"2020-06-01T00:00:00Z","date_created":"2020-09-20T22:01:37Z","day":"01","publication":"IEEE International Symposium on Information Theory - Proceedings","year":"2020","quality_controlled":"1","publisher":"IEEE","oa":1,"acknowledgement":"M. Mondelli was partially supported by grants NSF DMS-1613091, CCF-1714305, IIS-1741162 and ONR N00014-18-1-2729. S. A. Hashemi is supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC) and by Huawei."},{"volume":119,"language":[{"iso":"eng"}],"file":[{"file_name":"2020_PMLR_Shevchenko.pdf","date_created":"2021-03-02T15:38:14Z","file_size":5336380,"date_updated":"2021-03-02T15:38:14Z","creator":"dernst","success":1,"checksum":"f042c8d4316bd87c6361aa76f1fbdbbe","file_id":"9217","content_type":"application/pdf","relation":"main_file","access_level":"open_access"}],"publication_status":"published","intvolume":" 119","month":"07","oa_version":"Published Version","abstract":[{"text":"The optimization of multilayer neural networks typically leads to a solution\r\nwith zero training error, yet the landscape can exhibit spurious local minima\r\nand the minima can be disconnected. In this paper, we shed light on this\r\nphenomenon: we show that the combination of stochastic gradient descent (SGD)\r\nand over-parameterization makes the landscape of multilayer neural networks\r\napproximately connected and thus more favorable to optimization. More\r\nspecifically, we prove that SGD solutions are connected via a piecewise linear\r\npath, and the increase in loss along this path vanishes as the number of\r\nneurons grows large. 
This result is a consequence of the fact that the\r\nparameters found by SGD are increasingly dropout stable as the network becomes\r\nwider. We show that, if we remove part of the neurons (and suitably rescale the\r\nremaining ones), the change in loss is independent of the total number of\r\nneurons, and it depends only on how many neurons are left. Our results exhibit\r\na mild dependence on the input dimension: they are dimension-free for two-layer\r\nnetworks and depend linearly on the dimension for multilayer networks. We\r\nvalidate our theoretical findings with numerical experiments for different\r\narchitectures and classification tasks.","lang":"eng"}],"file_date_updated":"2021-03-02T15:38:14Z","department":[{"_id":"MaMo"}],"ddc":["000"],"date_updated":"2023-10-17T12:43:19Z","status":"public","type":"conference","_id":"9198","date_created":"2021-02-25T09:36:22Z","date_published":"2020-07-13T00:00:00Z","page":"8773-8784","publication":"Proceedings of the 37th International Conference on Machine Learning","day":"13","year":"2020","has_accepted_license":"1","oa":1,"publisher":"ML Research Press","quality_controlled":"1","acknowledgement":"M. Mondelli was partially supported by the 2019 LopezLoreta Prize. The authors thank Phan-Minh Nguyen for helpful discussions and the IST Distributed Algorithms and Systems Lab for providing computational resources.","title":"Landscape connectivity and dropout stability of SGD solutions for over-parameterized neural networks","external_id":{"arxiv":["1912.10095"]},"article_processing_charge":"No","author":[{"first_name":"Alexander","last_name":"Shevchenko","full_name":"Shevchenko, Alexander"},{"full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","last_name":"Mondelli","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"ista":"Shevchenko A, Mondelli M. 2020. Landscape connectivity and dropout stability of SGD solutions for over-parameterized neural networks. Proceedings of the 37th International Conference on Machine Learning. vol. 119, 8773–8784.","chicago":"Shevchenko, Alexander, and Marco Mondelli. “Landscape Connectivity and Dropout Stability of SGD Solutions for Over-Parameterized Neural Networks.” In Proceedings of the 37th International Conference on Machine Learning, 119:8773–84. ML Research Press, 2020.","ama":"Shevchenko A, Mondelli M. Landscape connectivity and dropout stability of SGD solutions for over-parameterized neural networks. In: Proceedings of the 37th International Conference on Machine Learning. Vol 119. ML Research Press; 2020:8773-8784.","apa":"Shevchenko, A., & Mondelli, M. (2020). Landscape connectivity and dropout stability of SGD solutions for over-parameterized neural networks. In Proceedings of the 37th International Conference on Machine Learning (Vol. 119, pp. 8773–8784). ML Research Press.","short":"A. Shevchenko, M. Mondelli, in:, Proceedings of the 37th International Conference on Machine Learning, ML Research Press, 2020, pp. 8773–8784.","ieee":"A. Shevchenko and M. Mondelli, “Landscape connectivity and dropout stability of SGD solutions for over-parameterized neural networks,” in Proceedings of the 37th International Conference on Machine Learning, 2020, vol. 119, pp. 8773–8784.","mla":"Shevchenko, Alexander, and Marco Mondelli. “Landscape Connectivity and Dropout Stability of SGD Solutions for Over-Parameterized Neural Networks.” Proceedings of the 37th International Conference on Machine Learning, vol. 
119, ML Research Press, 2020, pp. 8773–84."},"project":[{"_id":"059876FA-7A3F-11EA-A408-12923DDC885E","name":"Prix Lopez-Loretta 2019 - Marco Mondelli"}]},{"intvolume":" 48","month":"12","main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/1901.01375"}],"oa_version":"Preprint","abstract":[{"text":"Fitting a function by using linear combinations of a large number N of 'simple' components is one of the most fruitful ideas in statistical learning. This idea lies at the core of a variety of methods, from two-layer neural networks to kernel regression, to boosting. In general, the resulting risk minimization problem is non-convex and is solved by gradient descent or its variants. Unfortunately, little is known about global convergence properties of these approaches.\r\nHere we consider the problem of learning a concave function f on a compact convex domain Ω ⊆ ℝ^d, using linear combinations of 'bump-like' components (neurons). The parameters to be fitted are the centers of N bumps, and the resulting empirical risk minimization problem is highly non-convex. We prove that, in the limit in which the number of neurons diverges, the evolution of gradient descent converges to a Wasserstein gradient flow in the space of probability distributions over Ω. Further, when the bump width δ tends to 0, this gradient flow has a limit which is a viscous porous medium equation. Remarkably, the cost function optimized by this gradient flow exhibits a special property known as displacement convexity, which implies exponential convergence rates for N→∞, δ→0. Surprisingly, this asymptotic theory appears to capture well the behavior for moderate values of δ,N. Explaining this phenomenon, and understanding the dependence on δ,N in a quantitative manner remains an outstanding challenge.","lang":"eng"}],"issue":"6","volume":48,"language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"issn":["1932-6157"],"eissn":["1941-7330"]},"status":"public","type":"journal_article","article_type":"original","_id":"6748","department":[{"_id":"MaMo"}],"date_updated":"2024-03-06T08:28:50Z","oa":1,"publisher":"Institute of Mathematical Statistics","quality_controlled":"1","date_created":"2019-07-31T09:39:42Z","doi":"10.1214/20-AOS1945","date_published":"2020-12-11T00:00:00Z","page":"3619-3642","publication":"Annals of Statistics","day":"11","year":"2020","isi":1,"title":"Analysis of a two-layer neural network via displacement convexity","external_id":{"arxiv":["1901.01375"],"isi":["000598369200021"]},"article_processing_charge":"No","author":[{"first_name":"Adel","last_name":"Javanmard","full_name":"Javanmard, Adel"},{"last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425"},{"first_name":"Andrea","full_name":"Montanari, Andrea","last_name":"Montanari"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","citation":{"mla":"Javanmard, Adel, et al. “Analysis of a Two-Layer Neural Network via Displacement Convexity.” Annals of Statistics, vol. 48, no. 6, Institute of Mathematical Statistics, 2020, pp. 3619–42, doi:10.1214/20-AOS1945.","ama":"Javanmard A, Mondelli M, Montanari A. Analysis of a two-layer neural network via displacement convexity. Annals of Statistics. 2020;48(6):3619-3642. doi:10.1214/20-AOS1945","apa":"Javanmard, A., Mondelli, M., & Montanari, A. (2020). Analysis of a two-layer neural network via displacement convexity. Annals of Statistics. Institute of Mathematical Statistics. 
https://doi.org/10.1214/20-AOS1945","ieee":"A. Javanmard, M. Mondelli, and A. Montanari, “Analysis of a two-layer neural network via displacement convexity,” Annals of Statistics, vol. 48, no. 6. Institute of Mathematical Statistics, pp. 3619–3642, 2020.","short":"A. Javanmard, M. Mondelli, A. Montanari, Annals of Statistics 48 (2020) 3619–3642.","chicago":"Javanmard, Adel, Marco Mondelli, and Andrea Montanari. “Analysis of a Two-Layer Neural Network via Displacement Convexity.” Annals of Statistics. Institute of Mathematical Statistics, 2020. https://doi.org/10.1214/20-AOS1945.","ista":"Javanmard A, Mondelli M, Montanari A. 2020. Analysis of a two-layer neural network via displacement convexity. Annals of Statistics. 48(6), 3619–3642."}},{"oa":1,"publisher":"IEEE","quality_controlled":"1","publication":"IEEE Transactions on Signal Processing","day":"15","year":"2019","date_created":"2019-07-31T09:51:14Z","doi":"10.1109/TSP.2019.2944738","date_published":"2019-11-15T00:00:00Z","article_number":"8854897","user_id":"D865714E-FA4E-11E9-B85B-F5C5E5697425","citation":{"mla":"Hashemi, Seyyed Ali, et al. “Rate-Flexible Fast Polar Decoders.” IEEE Transactions on Signal Processing, vol. 67, no. 22, 8854897, IEEE, 2019, doi:10.1109/TSP.2019.2944738.","ieee":"S. A. Hashemi, C. Condo, M. Mondelli, and W. J. Gross, “Rate-flexible fast polar decoders,” IEEE Transactions on Signal Processing, vol. 67, no. 22. IEEE, 2019.","short":"S.A. Hashemi, C. Condo, M. Mondelli, W.J. Gross, IEEE Transactions on Signal Processing 67 (2019).","ama":"Hashemi SA, Condo C, Mondelli M, Gross WJ. Rate-flexible fast polar decoders. IEEE Transactions on Signal Processing. 2019;67(22). doi:10.1109/TSP.2019.2944738","apa":"Hashemi, S. A., Condo, C., Mondelli, M., & Gross, W. J. (2019). Rate-flexible fast polar decoders. IEEE Transactions on Signal Processing. IEEE. https://doi.org/10.1109/TSP.2019.2944738","chicago":"Hashemi, Seyyed Ali, Carlo Condo, Marco Mondelli, and Warren J Gross. “Rate-Flexible Fast Polar Decoders.” IEEE Transactions on Signal Processing. IEEE, 2019. https://doi.org/10.1109/TSP.2019.2944738.","ista":"Hashemi SA, Condo C, Mondelli M, Gross WJ. 2019. Rate-flexible fast polar decoders. IEEE Transactions on Signal Processing. 67(22), 8854897."},"title":"Rate-flexible fast polar decoders","external_id":{"arxiv":["1903.09203"]},"article_processing_charge":"No","author":[{"last_name":"Hashemi","full_name":"Hashemi, Seyyed Ali","first_name":"Seyyed Ali"},{"full_name":"Condo, Carlo","last_name":"Condo","first_name":"Carlo"},{"last_name":"Mondelli","full_name":"Mondelli, Marco","orcid":"0000-0002-3242-7020","id":"27EB676C-8706-11E9-9510-7717E6697425","first_name":"Marco"},{"full_name":"Gross, Warren J","last_name":"Gross","first_name":"Warren J"}],"oa_version":"Preprint","abstract":[{"text":"Polar codes have gained extensive attention during the past few years and recently they have been selected for the next generation of wireless communications standards (5G). Successive-cancellation-based (SC-based) decoders, such as SC list (SCL) and SC flip (SCF), provide a reasonable error performance for polar codes at the cost of low decoding speed. Fast SC-based decoders, such as Fast-SSC, Fast-SSCL, and Fast-SSCF, identify the special constituent codes in a polar code graph off-line, produce a list of operations, store the list in memory, and feed the list to the decoder to decode the constituent codes in order efficiently, thus increasing the decoding speed. 
However, the list of operations is dependent on the code rate and, as the rate changes, a new list is produced, making fast SC-based decoders not rate-flexible. In this paper, we propose a completely rate-flexible fast SC-based decoder by creating the list of operations directly in hardware, with low implementation complexity. We further propose a hardware architecture implementing the proposed method and show that the area occupation of the rate-flexible fast SC-based decoder in this paper is only 38% of the total area of the memory-based baseline decoder when 5G code rates are supported. ","lang":"eng"}],"intvolume":" 67","month":"11","main_file_link":[{"open_access":"1","url":"https://arxiv.org/abs/1903.09203"}],"scopus_import":1,"language":[{"iso":"eng"}],"publication_status":"published","publication_identifier":{"issn":["1053587X"]},"issue":"22","volume":67,"_id":"6750","status":"public","type":"journal_article","article_type":"original","date_updated":"2021-01-12T08:08:51Z","department":[{"_id":"MaMo"}]},{"date_published":"2019-10-18T00:00:00Z","doi":"10.3390/a12100218","date_created":"2019-11-12T14:46:19Z","has_accepted_license":"1","year":"2019","day":"18","publication":"Algorithms","quality_controlled":"1","publisher":"MDPI","oa":1,"author":[{"orcid":"0000-0002-3242-7020","full_name":"Mondelli, Marco","last_name":"Mondelli","first_name":"Marco","id":"27EB676C-8706-11E9-9510-7717E6697425"},{"full_name":"Hassani, S. Hamed","last_name":"Hassani","first_name":"S. Hamed"},{"last_name":"Urbanke","full_name":"Urbanke, Rüdiger","first_name":"Rüdiger"}],"external_id":{"arxiv":["1801.03153"]},"title":"A new coding paradigm for the primitive relay channel","citation":{"ista":"Mondelli M, Hassani SH, Urbanke R. 2019. A new coding paradigm for the primitive relay channel. Algorithms. 12(10), 218.","chicago":"Mondelli, Marco, S. Hamed Hassani, and Rüdiger Urbanke. “A New Coding Paradigm for the Primitive Relay Channel.” Algorithms. MDPI, 2019. https://doi.org/10.3390/a12100218.","apa":"Mondelli, M., Hassani, S. H., & Urbanke, R. (2019). A new coding paradigm for the primitive relay channel. Algorithms. MDPI. https://doi.org/10.3390/a12100218","ama":"Mondelli M, Hassani SH, Urbanke R. A new coding paradigm for the primitive relay channel. Algorithms. 2019;12(10). doi:10.3390/a12100218","ieee":"M. Mondelli, S. H. Hassani, and R. Urbanke, “A new coding paradigm for the primitive relay channel,” Algorithms, vol. 12, no. 10. MDPI, 2019.","short":"M. Mondelli, S.H. Hassani, R. Urbanke, Algorithms 12 (2019).","mla":"Mondelli, Marco, et al. “A New Coding Paradigm for the Primitive Relay Channel.” Algorithms, vol. 12, no. 
10, 218, MDPI, 2019, doi:10.3390/a12100218."},"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","article_number":"218","issue":"10","volume":12,"related_material":{"record":[{"id":"6675","status":"public","relation":"earlier_version"}]},"publication_identifier":{"issn":["1999-4893"]},"publication_status":"published","file":[{"content_type":"application/pdf","access_level":"open_access","relation":"main_file","checksum":"267756d8f9db572f496cd1663c89d59a","file_id":"7008","date_updated":"2020-07-14T12:47:47Z","file_size":696791,"creator":"dernst","date_created":"2019-11-12T14:48:45Z","file_name":"2019_Algorithms_Mondelli.pdf"}],"language":[{"iso":"eng"}],"scopus_import":1,"month":"10","intvolume":" 12","abstract":[{"lang":"eng","text":"We consider the primitive relay channel, where the source sends a message to the relay and to the destination, and the relay helps the communication by transmitting an additional message to the destination via a separate channel. Two well-known coding techniques have been introduced for this setting: decode-and-forward and compress-and-forward. In decode-and-forward, the relay completely decodes the message and sends some information to the destination; in compress-and-forward, the relay does not decode, and it sends a compressed version of the received signal to the destination using Wyner–Ziv coding. In this paper, we present a novel coding paradigm that provides an improved achievable rate for the primitive relay channel. The idea is to combine compress-and-forward and decode-and-forward via a chaining construction. We transmit over pairs of blocks: in the first block, we use compress-and-forward; and, in the second block, we use decode-and-forward. More specifically, in the first block, the relay does not decode, it compresses the received signal via Wyner–Ziv, and it sends only part of the compression to the destination. In the second block, the relay completely decodes the message, it sends some information to the destination, and it also sends the remaining part of the compression coming from the first block. By doing so, we are able to strictly outperform both compress-and-forward and decode-and-forward. Note that the proposed coding scheme can be implemented with polar codes. As such, it has the typical attractive properties of polar coding schemes, namely, quasi-linear encoding and decoding complexity, and error probability that decays at super-polynomial speed. As a running example, we take into account the special case of the erasure relay channel, and we provide a comparison between the rates achievable by our proposed scheme and the existing upper and lower bounds."}],"oa_version":"Published Version","file_date_updated":"2020-07-14T12:47:47Z","department":[{"_id":"MaMo"}],"date_updated":"2023-02-23T12:49:28Z","ddc":["510"],"type":"journal_article","article_type":"original","tmp":{"legal_code_url":"https://creativecommons.org/licenses/by/4.0/legalcode","image":"/images/cc_by.png","name":"Creative Commons Attribution 4.0 International Public License (CC-BY 4.0)","short":"CC BY (4.0)"},"status":"public","_id":"7007"}]