HypercubeME: Two hundred million combinatorially complete datasets from a single experiment

Esteban, Laura A; Lonishin, Lyubov R; Bobrovskiy, Daniil M; Leleytner, Gregory; Bogatyreva, Natalya S; Kondrashov, Fyodor; Ivankov, Dmitry N

Esteban LA, Lonishin LR, Bobrovskiy DM, Leleytner G, Bogatyreva NS, Kondrashov F, Ivankov DN. 2020. HypercubeME: Two hundred million combinatorially complete datasets from a single experiment. Bioinformatics. 36(6), 1960–1962.

Download

2020_Bioinformatics_Esteban.pdf 308.34 KB [Published Version]

DOI

10.1093/bioinformatics/btz841

Journal Article | Published | English

Scopus indexed

Author

Esteban, Laura A; Lonishin, Lyubov R; Bobrovskiy, Daniil M; Leleytner, Gregory; Bogatyreva, Natalya S; Kondrashov, Fyodor (ISTA); Ivankov, Dmitry N

Department

Kondrashov Group (Alumni)

Grant

Systematic investigation of epistasis in molecular evolution

Abstract

Epistasis, the context-dependence of the contribution of an amino acid substitution to fitness, is common in evolution. To detect epistasis, fitness must be measured for at least four genotypes: the reference genotype, two different single mutants and a double mutant with both of the single mutations. For higher-order epistasis of the order n, fitness has to be measured for all 2^n genotypes of an n-dimensional hypercube in genotype space forming a ‘combinatorially complete dataset’. So far, only a handful of such datasets have been produced by manual curation. Concurrently, random mutagenesis experiments have produced measurements of fitness and other phenotypes in a high-throughput manner, potentially containing a number of combinatorially complete datasets. We present an effective recursive algorithm for finding all hypercube structures in random mutagenesis experimental data. To test the algorithm, we applied it to the data from a recent HIS3 protein dataset and found all 199 847 053 unique combinatorially complete genotype combinations of dimensionality ranging from 2 to 12. The algorithm may be useful for researchers looking for higher-order epistasis in their high-throughput experimental data.
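The recursive idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HypercubeME implementation. It assumes genotypes are equal-length strings, uses naive pairwise comparison, and the names `single_difference`, `find_hypercubes` and `vertices` are hypothetical. It relies on the observation that an (n+1)-dimensional hypercube is complete exactly when two parallel n-dimensional hypercubes, whose origins differ by one substitution, are both complete.

```python
from collections import defaultdict
from itertools import combinations


def single_difference(a, b):
    """Return (pos, x, y) if two equal-length strings differ at exactly one site, else None."""
    diff = [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diff[0] if len(diff) == 1 else None


def find_hypercubes(genotypes):
    """Map dimension -> set of (origin, diffs) hypercubes fully contained in `genotypes`.

    `origin` is the vertex carrying the lexicographically smaller allele at
    every varying site; `diffs` is a position-sorted tuple of (pos, ref, alt).
    """
    genotypes = sorted(set(genotypes))
    # Dimension 1: every pair of measured genotypes one substitution apart.
    cubes = set()
    for a, b in combinations(genotypes, 2):  # a < b, so a holds the smaller allele
        d = single_difference(a, b)
        if d is not None:
            cubes.add((a, (d,)))

    result = {}
    dim = 1
    while cubes:
        result[dim] = cubes
        # Group parallel hypercubes, i.e. those with the same set of substitutions.
        by_diffs = defaultdict(list)
        for origin, diffs in cubes:
            by_diffs[diffs].append(origin)
        bigger = set()
        for diffs, origins in by_diffs.items():
            for o1, o2 in combinations(sorted(origins), 2):
                d = single_difference(o1, o2)
                if d is not None:
                    # Two parallel n-cubes one substitution apart form an (n+1)-cube.
                    # The set removes the duplicates found via different last sites.
                    bigger.add((o1, tuple(sorted(diffs + (d,)))))
        cubes = bigger
        dim += 1
    return result


def vertices(origin, diffs):
    """List all 2**n genotypes of the hypercube defined by `origin` and `diffs`."""
    verts = [origin]
    for pos, _, alt in diffs:
        verts += [v[:pos] + alt + v[pos + 1:] for v in verts]
    return verts
```

For example, `find_hypercubes(["AA", "AB", "BA", "BB", "CA"])` contains a single two-dimensional hypercube with origin `"AA"`, whose four vertices are the reference, the two single mutants and the double mutant needed to detect pairwise epistasis. The pairwise origin comparison is quadratic within each group, so a dataset of the size mentioned in the abstract would need a more efficient lookup.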

Publishing Year

2020

Date Published

2020-03-15

Journal Title

Bioinformatics

Publisher

Oxford University Press

Acknowledgement

This work was supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013, ERC grant agreement 335980_EinME) and by a Startup package to the Ivankov laboratory at Skolkovo Institute of Science and Technology. The work was started at the School of Molecular and Theoretical Biology 2017, supported by the Zimin Foundation. N.S.B. was supported by the Woman Scientists Support Grant at the Centre for Genomic Regulation (CRG).

Volume

36

Issue

6

Page

1960-1962

ISSN

1367-4803

eISSN

1460-2059

IST-REx-ID

8645

Cite this

Esteban LA, Lonishin LR, Bobrovskiy DM, et al. HypercubeME: Two hundred million combinatorially complete datasets from a single experiment. Bioinformatics. 2020;36(6):1960-1962. doi:10.1093/bioinformatics/btz841

Esteban, L. A., Lonishin, L. R., Bobrovskiy, D. M., Leleytner, G., Bogatyreva, N. S., Kondrashov, F., & Ivankov, D. N. (2020). HypercubeME: Two hundred million combinatorially complete datasets from a single experiment. Bioinformatics. Oxford University Press. https://doi.org/10.1093/bioinformatics/btz841

Esteban, Laura A, Lyubov R Lonishin, Daniil M Bobrovskiy, Gregory Leleytner, Natalya S Bogatyreva, Fyodor Kondrashov, and Dmitry N Ivankov. “HypercubeME: Two Hundred Million Combinatorially Complete Datasets from a Single Experiment.” Bioinformatics. Oxford University Press, 2020. https://doi.org/10.1093/bioinformatics/btz841.

L. A. Esteban et al., “HypercubeME: Two hundred million combinatorially complete datasets from a single experiment,” Bioinformatics, vol. 36, no. 6. Oxford University Press, pp. 1960–1962, 2020.

Esteban, Laura A., et al. “HypercubeME: Two Hundred Million Combinatorially Complete Datasets from a Single Experiment.” Bioinformatics, vol. 36, no. 6, Oxford University Press, 2020, pp. 1960–62, doi:10.1093/bioinformatics/btz841.

All files available under the following license(s):

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)