---
OA_place: publisher
OA_type: hybrid
_id: '21704'
abstract:
- lang: eng
  text: How functional protein sequences are distributed in sequence space is fundamentally
    important for evolutionary theory and protein design, particularly if a large
    diversity of protein functions are hidden in evolutionarily unexplored areas of
    the sequence space. However, this question is understudied in part because experimental
    and computational studies use extant sequences as a starting point to study sequence
    space. Here, we study whether extant sequences are representative of the entire
    functional sequence space. Across thousands of protein families from vertebrates
    and bacteria we calculate the dimensionality and the volume of sequence space
    occupied by extant homologs. We find that the observed dimensionality and volume
    of extant sequence space are minuscule, many orders of magnitude smaller than
    what we estimated using a model of protein evolution. Simulating sequence evolution
    we then quantify the impact of phylogeny, selection, and epistasis on restricting
    the evolutionary exploration of sequence space. We find that sequence evolution
    from a single common ancestor, or a single point of origin in sequence space,
    is by far the largest limiting factor that reduces the dimensionality and volume
    of extant sequence space. These results indicate that there are vast areas of
    functional sequence space that have not been explored in evolution because of
    the excessive restrictions on natural exploration of the protein sequence space
    imposed by the point of origin effect. We suggest that protein design methods
    that rely on extant sequences may be limited in their ability to discover truly
    novel functions.
acknowledgement: We thank Olga Kalinina for feedback on our manuscript, Vsevolod Kuksin
  for fruitful discussions and Lev Tsarin for participation in the design of our models.
  This work was supported by Japan Science and Technology Agency as part of Adopting
  Sustainable Partnerships for Innovative Research Ecosystem, Grant No. JPMJAP24B2
  (F.A.K. and L.H.I.), and Fonds Zur Förderung der Wissenschaftlichen Forschung Grant
  ESP253-B (O.O.B.).
article_processing_charge: Yes (in subscription journal)
article_type: original
author:
- first_name: Lada H.
  full_name: Isakova, Lada H.
  last_name: Isakova
- first_name: Elizaveta
  full_name: Streltsova, Elizaveta
  id: 57a170da-dc96-11ea-b7c8-ab3565071bf7
  last_name: Streltsova
- first_name: Olga
  full_name: Bochkareva, Olga
  id: C4558D3C-6102-11E9-A62E-F418E6697425
  last_name: Bochkareva
  orcid: 0000-0003-1006-6639
- first_name: Peter K.
  full_name: Vlasov, Peter K.
  last_name: Vlasov
- first_name: Fyodor
  full_name: Kondrashov, Fyodor
  id: 44FDEF62-F248-11E8-B48F-1D18A9856A87
  last_name: Kondrashov
  orcid: 0000-0001-8243-4694
citation:
  ama: Isakova LH, Streltsova E, Bochkareva O, Vlasov PK, Kondrashov F. Descent from
    a common ancestor restricts exploration of protein sequence space. <i>Proceedings
    of the National Academy of Sciences</i>. 2026;123(14):e2532018123. doi:<a href="https://doi.org/10.1073/pnas.2532018123">10.1073/pnas.2532018123</a>
  apa: Isakova, L. H., Streltsova, E., Bochkareva, O., Vlasov, P. K., &#38; Kondrashov,
    F. (2026). Descent from a common ancestor restricts exploration of protein sequence
    space. <i>Proceedings of the National Academy of Sciences</i>. National Academy
    of Sciences. <a href="https://doi.org/10.1073/pnas.2532018123">https://doi.org/10.1073/pnas.2532018123</a>
  chicago: Isakova, Lada H., Elizaveta Streltsova, Olga Bochkareva, Peter K. Vlasov,
    and Fyodor Kondrashov. “Descent from a Common Ancestor Restricts Exploration of
    Protein Sequence Space.” <i>Proceedings of the National Academy of Sciences</i>.
    National Academy of Sciences, 2026. <a href="https://doi.org/10.1073/pnas.2532018123">https://doi.org/10.1073/pnas.2532018123</a>.
  ieee: L. H. Isakova, E. Streltsova, O. Bochkareva, P. K. Vlasov, and F. Kondrashov,
    “Descent from a common ancestor restricts exploration of protein sequence space,”
    <i>Proceedings of the National Academy of Sciences</i>, vol. 123, no. 14. National
    Academy of Sciences, p. e2532018123, 2026.
  ista: Isakova LH, Streltsova E, Bochkareva O, Vlasov PK, Kondrashov F. 2026. Descent
    from a common ancestor restricts exploration of protein sequence space. Proceedings
    of the National Academy of Sciences. 123(14), e2532018123.
  mla: Isakova, Lada H., et al. “Descent from a Common Ancestor Restricts Exploration
    of Protein Sequence Space.” <i>Proceedings of the National Academy of Sciences</i>,
    vol. 123, no. 14, National Academy of Sciences, 2026, p. e2532018123, doi:<a href="https://doi.org/10.1073/pnas.2532018123">10.1073/pnas.2532018123</a>.
  short: L.H. Isakova, E. Streltsova, O. Bochkareva, P.K. Vlasov, F. Kondrashov, Proceedings
    of the National Academy of Sciences 123 (2026) e2532018123.
date_created: 2026-04-12T22:01:47Z
date_published: 2026-04-07T00:00:00Z
date_updated: 2026-05-04T06:57:31Z
day: '07'
ddc:
- '570'
department:
- _id: UlWa
doi: 10.1073/pnas.2532018123
external_id:
  pmid:
  - '41915737'
file:
- access_level: open_access
  checksum: 11b7a13a359e302498b2367906093a6b
  content_type: application/pdf
  creator: dernst
  date_created: 2026-05-04T06:46:31Z
  date_updated: 2026-05-04T06:46:31Z
  file_id: '21783'
  file_name: 2026_PNAS_Isakova.pdf
  file_size: 3355016
  relation: main_file
  success: 1
file_date_updated: 2026-05-04T06:46:31Z
has_accepted_license: '1'
intvolume: '       123'
issue: '14'
language:
- iso: eng
month: '04'
oa: 1
oa_version: Published Version
page: e2532018123
pmid: 1
publication: Proceedings of the National Academy of Sciences
publication_identifier:
  eissn:
  - 1091-6490
publication_status: published
publisher: National Academy of Sciences
quality_controlled: '1'
scopus_import: '1'
status: public
title: Descent from a common ancestor restricts exploration of protein sequence space
tmp:
  image: /images/cc_by_nc_nd.png
  legal_code_url: https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode
  name: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
    (CC BY-NC-ND 4.0)
  short: CC BY-NC-ND (4.0)
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 123
year: '2026'
...
