DOI: 10.1099/mgen.0.001743 ISSN: 2057-5858

NanoPop: Nanopore amplicon sequencing of Salmonella virulence genes to characterize complex mixed serovar populations using k-mers to overcome high sequencing error rates

David Ayala-Velastegui, Tatum D. Mortimer, Amy T. Siceloff, Nikki W. Shariat

Salmonella enterica comprises over 2,600 genetically distinct serovars, of which 20 are associated with 70% of human illnesses. In food animal production systems, including poultry, multiple serovars commonly coexist. Therefore, culture-based approaches that rely on selecting colonies can underestimate serovar complexity in a sample. Recent advances in molecular deep serotyping methods like CRISPR-SeroSeq provide high-resolution profiles of Salmonella populations in a single sample but are still time-consuming. This project aimed to develop a faster serotyping approach using the Oxford Nanopore Technologies (ONT) platform. A database was built containing the alleles of two conserved Salmonella virulence genes, fimH (n=91 sequences) and sseL (n=91 sequences), from a total of 1,160 genomes that represent 36 serovars (including polyphyletic lineages for 11 of these serovars). To test the sensitivity of the assay, bacterial cultures of three serovars of concern (Typhimurium, Enteritidis and Infantis) were mixed in tenfold ratios with serovar Kentucky. Both genes were amplified, barcoded with the ONT native barcoding kit, pooled and sequenced with a MinION device, followed by basecalling, demultiplexing and filtering for length and quality. To overcome high sequencing error rates and still resolve serovars with a low relative abundance, we used two tools: Emu, a published alignment and sequence likelihood algorithm that has been applied to ONT 16S rRNA gene sequencing, and NanoPop, a method developed here. NanoPop builds a database of sliding-window 15-mer sequences (n=6,322) spanning SNPs in all fimH and sseL alleles and parses ONT sequencing reads against these 15-mers. Emu and NanoPop each detected background serovars present at 1% or higher. Finally, these methods were applied to real-world pre- and post-harvest poultry samples and compared to CRISPR-SeroSeq outputs. NanoPop has the advantage that it can call unknown or untypeable salmonellae, whereas Emu forces a match with an allele in the database, potentially calling incorrect serovars. While Salmonella was used as a bacterial model for this study, this method has potential application to other micro-organisms.
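
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of SNP-spanning k-mer matching, not the NanoPop code itself. It assumes the alleles are already aligned and of equal length, uses exact 15-mer matching, and treats only 15-mers found in a single allele as informative. All names (`ALLELES`, `build_kmer_db`, `classify_read`, `tally`, `min_hits`) and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict

K = 15  # k-mer length stated in the abstract

# Toy, made-up aligned alleles that differ at one position (index 20).
_LEFT, _RIGHT = "ACGTTGCAAGCTTAGGCTAA", "CGTATGCAATCGGATCCTA"
ALLELES = {"a": _LEFT + "C" + _RIGHT, "b": _LEFT + "T" + _RIGHT}


def snp_positions(alleles):
    """Return alignment columns where the aligned alleles differ."""
    seqs = list(alleles.values())
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]


def build_kmer_db(alleles, k=K):
    """Map every k-mer window that covers a SNP to the alleles containing it."""
    db = defaultdict(set)
    length = len(next(iter(alleles.values())))
    for pos in snp_positions(alleles):
        # Slide the window over every start position that still covers `pos`.
        for start in range(max(0, pos - k + 1), min(pos, length - k) + 1):
            for name, seq in alleles.items():
                db[seq[start:start + k]].add(name)
    return db


def classify_read(read, db, k=K, min_hits=1):
    """Return the allele with the most allele-specific k-mer hits, or None.

    `min_hits` is an assumed threshold; reads below it are left unassigned
    rather than forced onto the nearest allele.
    """
    hits = Counter()
    for i in range(len(read) - k + 1):
        owners = db.get(read[i:i + k])
        if owners and len(owners) == 1:  # informative only if allele-specific
            hits[next(iter(owners))] += 1
    if not hits:
        return None
    best, count = hits.most_common(1)[0]
    return best if count >= min_hits else None


def tally(reads, db):
    """Count reads per allele; unassigned reads are reported as 'unknown'."""
    return Counter(classify_read(r, db) or "unknown" for r in reads)


db = build_kmer_db(ALLELES)
reads = [ALLELES["a"][3:35], ALLELES["b"], "G" * 40]
print(tally(reads, db))
```

Because a read is assigned only when it contains allele-specific 15-mers, reads with no match fall into an "unknown" bin instead of being forced onto a database allele. Exact matching is a simplification: a sequencing error inside a window removes that hit, but the remaining overlapping windows around the same SNP can still support the call.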
