Primer Design through Submodular Function Estimation
Yixin Chen, Yunheng Han, Ao Wang, Aaron Hong, Adam R Rivers, Alan Kuhnle, Christina BoucherAbstract
Motivation
Multiplex PCR-based enrichment is widely used in viral genome sequencing and pathogen surveillance. However, designing large sets of primers that maximize genome coverage while minimizing primer–primer interactions remains a major computational challenge. Existing methods such as SADDLE and Olivar use heuristics to optimize a Badness score for primer dimers but lack theoretical guarantees on solution quality.
Results
We introduce PRISM, a new framework that formulates multiplex primer design as a constrained submodular maximization problem. Our method defines an objective that balances genome coverage and dimer risk, and applies a local search algorithm with a constant-factor approximation guarantee. Evaluations on viral genome datasets demonstrate that PRISM consistently achieves lower Badness scores compared to PrimalScheme, Olivar, and primerJinn. These results highlight the scalability and theoretical rigor of submodular optimization in primer design.
Availability
PRISM is open-source and available at https://github.com/yhhan19/PRISM-new. The experimental data, scripts, and results used in this paper are archived on Figshare at https://doi.org/10.6084/m9.figshare.32806499.
Supplementary information
Supplementary information: Supplementary material, including proofs and figures, is available at Bioinformatics online and on Figshare at https://doi.org/10.6084/m9.figshare.32806499.