{"publication":"arXiv","date_updated":"2024-02-12T09:40:23Z","article_number":"2311.00664","abstract":[{"text":"While different neural models often exhibit latent spaces that are alike when exposed to semantically related data, this intrinsic similarity is not always immediately discernible. Towards a better understanding of this phenomenon, our work shows how representations learned from these neural modules can be translated between different pre-trained networks via simpler transformations than previously thought. An advantage of this approach is the ability to estimate these transformations using standard, well-understood algebraic procedures that have closed-form solutions. Our method directly estimates a transformation between two given latent spaces, thereby enabling effective stitching of encoders and decoders without additional training. We extensively validate the adaptability of this translation procedure in different experimental settings: across various trainings, domains, architectures (e.g., ResNet, CNN, ViT), and in multiple downstream tasks (classification, reconstruction). Notably, we show how it is possible to zero-shot stitch text encoders and vision decoders, or vice-versa, yielding surprisingly good classification performance in this multimodal setting.","lang":"eng"}],"status":"public","date_created":"2024-02-07T15:08:55Z","year":"2023","day":"01","article_processing_charge":"No","date_published":"2023-11-01T00:00:00Z","acknowledgement":"This work is supported by the ERC grant no. 802554 (SPECGEO), PRIN 2020 project no. 2020TA3K9N (LEGO.AI), and PNRR MUR project PE0000013-FAIR. 
Francesco Locatello did not contribute to this work at Amazon.","language":[{"iso":"eng"}],"author":[{"first_name":"Valentino","full_name":"Maiorca, Valentino","last_name":"Maiorca"},{"last_name":"Moschella","first_name":"Luca","full_name":"Moschella, Luca"},{"last_name":"Norelli","first_name":"Antonio","full_name":"Norelli, Antonio"},{"last_name":"Fumero","full_name":"Fumero, Marco","first_name":"Marco"},{"last_name":"Locatello","id":"26cfd52f-2483-11ee-8040-88983bcc06d4","full_name":"Locatello, Francesco","orcid":"0000-0002-4850-0683","first_name":"Francesco"},{"last_name":"Rodolà","full_name":"Rodolà, Emanuele","first_name":"Emanuele"}],"main_file_link":[{"url":"https://doi.org/10.48550/arXiv.2311.00664","open_access":"1"}],"user_id":"2DF688A6-F248-11E8-B48F-1D18A9856A87","department":[{"_id":"FrLo"}],"oa":1,"publication_status":"submitted","_id":"14952","doi":"10.48550/arXiv.2311.00664","external_id":{"arxiv":["2311.00664"]},"month":"11","title":"Latent space translation via semantic alignment","oa_version":"Preprint","citation":{"apa":"Maiorca, V., Moschella, L., Norelli, A., Fumero, M., Locatello, F., &#38; Rodolà, E. (n.d.). Latent space translation via semantic alignment. <i>arXiv</i>. <a href=\"https://doi.org/10.48550/arXiv.2311.00664\">https://doi.org/10.48550/arXiv.2311.00664</a>","short":"V. Maiorca, L. Moschella, A. Norelli, M. Fumero, F. Locatello, E. Rodolà, ArXiv (n.d.).","chicago":"Maiorca, Valentino, Luca Moschella, Antonio Norelli, Marco Fumero, Francesco Locatello, and Emanuele Rodolà. “Latent Space Translation via Semantic Alignment.” <i>ArXiv</i>, n.d. <a href=\"https://doi.org/10.48550/arXiv.2311.00664\">https://doi.org/10.48550/arXiv.2311.00664</a>.","mla":"Maiorca, Valentino, et al. “Latent Space Translation via Semantic Alignment.” <i>ArXiv</i>, 2311.00664, doi:<a href=\"https://doi.org/10.48550/arXiv.2311.00664\">10.48550/arXiv.2311.00664</a>.","ieee":"V. Maiorca, L. Moschella, A. Norelli, M. Fumero, F. Locatello, and E. 
Rodolà, “Latent space translation via semantic alignment,” <i>arXiv</i>.","ama":"Maiorca V, Moschella L, Norelli A, Fumero M, Locatello F, Rodolà E. Latent space translation via semantic alignment. <i>arXiv</i>. doi:<a href=\"https://doi.org/10.48550/arXiv.2311.00664\">10.48550/arXiv.2311.00664</a>","ista":"Maiorca V, Moschella L, Norelli A, Fumero M, Locatello F, Rodolà E. Latent space translation via semantic alignment. arXiv, 2311.00664."},"type":"preprint"}