DOI: 10.1093/molbev/msae251 ISSN: 0737-4038

A graph-based goat pangenome reveals structural variations involved in domestication and adaptation

Peipei Bian, Jiaxin Li, Shishuo Zhou, Xingquan Wang, Mian Gong, Xi Guo, Yudong Cai, Qimeng Yang, Jiaqi Fu, Rongrong Li, Shuhong Huang, Funong Luo, Ali Mujtaba Shah, Johannes A Lenstra, Joram M Mwacharo, Ran Li, Gang Ren, Xiaolong Wang, Cong Li, Wenxin Zheng, Yu Jiang, Xihong Wang

Abstract

Pangenomes can facilitate a deeper understanding of genome complexity. Using de novo phased long-read assemblies of eight representative goat breeds, we constructed a graph-based pangenome of goats (Capra hircus) and discovered 113 Mb of novel autosomal sequence. Combining this multi-assembly pangenome with low-coverage PacBio HiFi sequences, we constructed a long-read structural variation (SV) database containing 59,325 SV deletions, 84,910 SV insertions and 24,954 other complex SV alleles. This resource allowed reliable graph-based genotyping from short reads of 79 wild and 1,148 worldwide domestic goats. Selection signal analysis of SVs captured a novel immune-related domestication locus containing the galectin-9 gene and extra copies of the ruminant-specific galectin-9-like genes (LGALS9L), which have high tissue specificity. A segmental duplication in domestic goats generates three additional LGALS9L copies. Ancient goat genome sequences show a gradual increase in the frequency of this duplication from the Neolithic to the present. Two other newly detected SVs also have higher selection signals than adjacent SNPs: a truncated-LINE1 deletion in EDAR2 associated with cashmere production and a VNTR-related insertion in PAPSS2 linked to high-altitude adaptation. In summary, the multi-assembly goat pangenome and long-read SV database facilitate the detection of complex variations that are important in evolution and selection.
