C2M3: Cycle-consistent multi-model merging
Crisostomi D, Fumero M, Baieri D, Bernard F, Rodolà E. 2024. C2M3: Cycle-consistent multi-model merging. 38th Conference on Neural Information Processing Systems. NeurIPS: Neural Information Processing Systems, Advances in Neural Information Processing Systems, vol. 37.
Conference Paper | Published | English
Scopus indexed
Author
Crisostomi, Donato;
Fumero, Marco (ISTA);
Baieri, Daniele;
Bernard, Florian;
Rodolà, Emanuele
Corresponding author has ISTA affiliation
Series Title
Advances in Neural Information Processing Systems
Abstract
In this paper, we present a novel data-free method for merging neural networks in weight space. Unlike most existing works, our method optimizes for the permutations of network neurons globally across all layers. This allows us to enforce cycle consistency of the permutations when merging n ≥ 3 models, so that circular compositions of permutations can be computed without accumulating error along the path. We qualitatively and quantitatively motivate the need for such a constraint, showing its benefits when merging sets of models in scenarios spanning varying architectures and datasets. Finally, we show that, when coupled with activation renormalization, our approach yields the best results in the task.
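The cycle-consistency property mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it only assumes, for illustration, that each pairwise permutation is obtained by routing through a shared reference ordering, which is one simple way to make the round trip A → B → C → A return every neuron to its own index. All function and variable names below are hypothetical.

```python
import random

def compose(p, q):
    """Return the permutation that applies q first, then p (index lists)."""
    return [p[q[i]] for i in range(len(q))]

def inverse(p):
    """Return the inverse of permutation p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return inv

def cycle_error(p_ab, p_bc, p_ca):
    """Count neurons that do not return to their own index after A -> B -> C -> A."""
    round_trip = compose(p_ca, compose(p_bc, p_ab))
    return sum(1 for i, j in enumerate(round_trip) if i != j)

rng = random.Random(0)
n = 8  # number of neurons in one layer (toy size)

def random_perm():
    p = list(range(n))
    rng.shuffle(p)
    return p

# Pairwise permutations chosen independently: nothing ties them together,
# so the round trip is generally not the identity.
independent_error = cycle_error(random_perm(), random_perm(), random_perm())

# Assumption for this sketch: each model has one map into a shared reference
# ordering, and every pairwise map is "source -> reference -> target".
u_a, u_b, u_c = random_perm(), random_perm(), random_perm()
p_ab = compose(inverse(u_b), u_a)
p_bc = compose(inverse(u_c), u_b)
p_ca = compose(inverse(u_a), u_c)
consistent_error = cycle_error(p_ab, p_bc, p_ca)

print("independent pairwise maps, mismatched neurons:", independent_error)
print("maps factored through a reference, mismatched neurons:", consistent_error)
```

With the factored construction the intermediate maps cancel in the composition, so the round trip is the identity by construction, whereas independently chosen pairwise maps carry no such guarantee.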
Publishing Year
2024
Date Published
2024-12-20
Proceedings Title
38th Conference on Neural Information Processing Systems
Publisher
Neural Information Processing Systems Foundation
Acknowledgement
This work is supported by the ERC grant no. 802554 (SPECGEO), PRIN 2020 project no. 2020TA3K9N (LEGO.AI), and PNRR MUR project PE0000013-FAIR. Marco Fumero is supported by the MSCA IST-Bridge fellowship, which has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 101034413. We thank Simone Scardapane for the helpful feedback on the paper.
Volume
37
Conference
NeurIPS: Neural Information Processing Systems
Conference Location
Vancouver, Canada
Conference Date
2024-12-09 – 2024-12-15
Cite this
Crisostomi D, Fumero M, Baieri D, Bernard F, Rodolà E. C2M3: Cycle-consistent multi-model merging. In: 38th Conference on Neural Information Processing Systems. Vol 37. Neural Information Processing Systems Foundation; 2024.
Crisostomi, D., Fumero, M., Baieri, D., Bernard, F., & Rodolà, E. (2024). C2M3: Cycle-consistent multi-model merging. In 38th Conference on Neural Information Processing Systems (Vol. 37). Vancouver, Canada: Neural Information Processing Systems Foundation.
Crisostomi, Donato, Marco Fumero, Daniele Baieri, Florian Bernard, and Emanuele Rodolà. “C2M3: Cycle-Consistent Multi-Model Merging.” In 38th Conference on Neural Information Processing Systems, Vol. 37. Neural Information Processing Systems Foundation, 2024.
D. Crisostomi, M. Fumero, D. Baieri, F. Bernard, and E. Rodolà, “C2M3: Cycle-consistent multi-model merging,” in 38th Conference on Neural Information Processing Systems, Vancouver, Canada, 2024, vol. 37.
Crisostomi, Donato, et al. “C2M3: Cycle-Consistent Multi-Model Merging.” 38th Conference on Neural Information Processing Systems, vol. 37, Neural Information Processing Systems Foundation, 2024.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Sources
arXiv 2405.17897