The term “haplotype block” is commonly used in the developing field of haplotype-based inference methods. We argue that the term should be defined based on the structure of the Ancestral Recombination Graph (ARG), which contains complete information on the ancestry of a sample. We use simulated examples to demonstrate key features of the relationship between haplotype blocks and ancestral structure, emphasizing the stochasticity of the processes that generate them. Even the simplest cases of neutrality or of a “hard” selective sweep produce a rich structure, often missed by commonly used statistics. We highlight a number of novel methods for inferring haplotype structure, based on the full ARG, or on a sequence of trees, and illustrate how they can be used to define haplotype blocks using an empirical data set. While the advent of new, computationally efficient methods makes it possible to apply these concepts broadly, they (and additional new methods) could benefit from adding features to explore haplotype blocks, as we define them. Understanding and applying the concept of the haplotype block will be essential to fully exploit long- and linked-read sequencing technologies.
We thank the Barton group for useful discussion and feedback during the writing of this article. Comments from Roger Butlin, Molly Schumer's Group, the tskit development team, editors and three reviewers greatly improved the manuscript. Funding was provided by SCAS (Natural Sciences Programme, Knut and Alice Wallenberg Foundation), an FWF Wittgenstein grant (PT1001Z211), an FWF standalone grant (grant P 32166), and an ERC Advanced Grant. YFC was supported by the Max Planck Society and an ERC Proof of Concept Grant #101069216 (HAPLOTAGGING).
Shipilina D, Pal A, Stankowski S, Chan YF, Barton NH. On the origin and structure of haplotype blocks. Molecular Ecology. 2023;32(6):1441-1457. doi:10.1111/mec.16793
All files are available under the following license: Creative Commons Attribution 4.0 International Public License (CC BY 4.0).