Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalization of these findings is limited by the scarcity of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design – instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work provides insights for the practical application of fitness landscape heterogeneity in protein engineering.
We thank Ondřej Draganov, Rodrigo Redondo, Bor Kavčič, Mia Juračić and Andrea Pauli for discussion and technical advice. We thank Anita Testa Salmazo for advice on resin protein purification, Dmitry Bolotin and the Milaboratory (milaboratory.com) for access to computing and storage infrastructure, and Josef Houser and Eva Fujdiarova for technical assistance and data interpretation. The Core Facility Biomolecular Interactions and Crystallization of CEITEC Masaryk University is gratefully acknowledged for obtaining the scientific data presented in this paper. This research was supported by the Scientific Service Units (SSU) of IST-Austria through resources provided by the Bioimaging Facility (BIF) and the Life Science Facility (LSF). MiSeq and HiSeq NGS sequencing was performed by the Next Generation Sequencing Facility at Vienna BioCenter Core Facilities (VBCF), member of the Vienna BioCenter (VBC), Austria. FACS was performed at the BioOptics Facility of the Institute of Molecular Pathology (IMP), Austria. We also thank the Biomolecular Crystallography Facility in the Vanderbilt University Center for Structural Biology. We are grateful to Joel M Harp for help with X-ray data collection. This work was supported by the ERC Consolidator grant to FAK (771209-CharFL). KSS acknowledges support by President’s Grant МК–5405.2021.1.4, the Imperial College Research Fellowship and the MRC London Institute of Medical Sciences (UKRI MC-A658-5QEA0). AF is supported by the Marie Skłodowska-Curie Fellowship (H2020-MSCA-IF-2019, Grant Agreement No. 898203, Project acronym "FLINDIP"). Experiments were partially carried out using equipment provided by the Institute of Bioorganic Chemistry of the Russian Academy of Sciences Core Facility (CKP IBCH). This work was supported by Russian Science Foundation grant 19-74-10102. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 665385.
Gonzalez Somermeyer L, Fleiss A, Mishin AS, et al. Heterogeneity of the GFP fitness landscape and data-driven protein design. eLife. 2022;11. doi:10.7554/elife.75842
Gonzalez Somermeyer, L., Fleiss, A., Mishin, A. S., Bozhanova, N. G., Igolkina, A. A., Meiler, J., … Kondrashov, F. (2022). Heterogeneity of the GFP fitness landscape and data-driven protein design. eLife. eLife Sciences Publications. https://doi.org/10.7554/elife.75842
Gonzalez Somermeyer, Louisa, Aubin Fleiss, Alexander S Mishin, Nina G Bozhanova, Anna A Igolkina, Jens Meiler, Maria-Elisenda Alaball Pujol, Ekaterina V Putintseva, Karen S Sarkisyan, and Fyodor Kondrashov. “Heterogeneity of the GFP Fitness Landscape and Data-Driven Protein Design.” eLife. eLife Sciences Publications, 2022. https://doi.org/10.7554/elife.75842.
L. Gonzalez Somermeyer et al., “Heterogeneity of the GFP fitness landscape and data-driven protein design,” eLife, vol. 11. eLife Sciences Publications, 2022.
Gonzalez Somermeyer L, Fleiss A, Mishin AS, Bozhanova NG, Igolkina AA, Meiler J, Alaball Pujol M-E, Putintseva EV, Sarkisyan KS, Kondrashov F. 2022. Heterogeneity of the GFP fitness landscape and data-driven protein design. eLife. 11, 75842.
Gonzalez Somermeyer, Louisa, et al. “Heterogeneity of the GFP Fitness Landscape and Data-Driven Protein Design.” eLife, vol. 11, 75842, eLife Sciences Publications, 2022, doi:10.7554/elife.75842.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC-BY 4.0):
2022_eLife_Somermeyer.pdf 5.30 MB