DOI: 10.1093/gbe/evag145 ISSN: 1759-6653

High rate of mutation and efficient removal by selection of structural variants from natural populations of Caenorhabditis elegans

Ayush Shekhar Saxena, Charles F Baer

Abstract

The importance of genomic structural variants (SVs) is well-appreciated, but less is known about their mutational properties than about those of single nucleotide variants (SNVs) and short indels. The reason is simple: the longer the variant, the less likely it is to be covered by a single sequencing read, and thus the harder it is to map unambiguously to a unique genomic location.

Here we report SV mutation rate estimates from six mutation accumulation (MA) lines from two strains of C. elegans using long-read (PacBio) sequencing. The inferred SV mutation rate is ∼0.03/genome/generation, about 1/10 the SNV rate and 1/4 the short indel rate. We identified 40 SV mutations (12 insertions, 28 deletions, 0 inversions) and 52 false positive (FP) variants by manual inspection. Excluding one atypical line (5 mutations, 35 FPs), the signal (mutant) to noise (FP) ratio is approximately 2:1. False negative rates were determined by simulating variants in the reference genome and measuring 'recall', the fraction of simulated variants recovered. Recall exceeds 90% for short indels and declines as SV length increases. Small deletions have nearly the same recall rate as small insertions (∼100 bp), but deletions have higher recall rates than insertions as size increases. Because recall is incomplete, the reported SV mutation rate is likely a lower bound.
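The signal-to-noise figure above follows directly from the reported counts. As a minimal sketch of that arithmetic (the variable and function names here are illustrative, not taken from the paper), the ratio can be recomputed with and without the atypical line:

```python
from fractions import Fraction

# Counts as reported in the abstract.
total_mutations = 40          # 12 insertions + 28 deletions + 0 inversions
total_false_positives = 52
atypical_mutations = 5        # the one atypical MA line
atypical_false_positives = 35


def signal_to_noise(mutations, false_positives):
    """Ratio of true SV mutations (signal) to false positives (noise)."""
    return Fraction(mutations, false_positives)


# Ratio across all six lines, and across the five remaining lines
# once the atypical line is set aside.
all_lines = signal_to_noise(total_mutations, total_false_positives)
typical_lines = signal_to_noise(
    total_mutations - atypical_mutations,
    total_false_positives - atypical_false_positives,
)

print(f"all lines:          {float(all_lines):.2f}")
print(f"excluding atypical: {float(typical_lines):.2f}")
```

Setting aside the atypical line leaves 35 mutations against 17 false positives, which is where the approximate 2:1 ratio comes from; with that line included, false positives outnumber mutations.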

A quarter of identified SV mutations occur in SV hotspots that harbor pre-existing low-complexity repeat variation. Comparison of the spectrum of spontaneous SVs to that of wild isolates implies that natural selection is not only efficient at removing SVs in exons but also effectively removes SVs in intergenic regions.
