Automation of OncoKB annotation to enhance efficiency and consistency of molecular tumor board workflows.
Zit Liang Chan, Kenneth Shiu Kahn Chow, Su Fen Ang, Christine Carolyn, Sharmaine Jia Xin Tan, Yi Zhi Ong, Anders Skanderup, David Tai, Tira J. Tan, Daniel Shao-Weng Tan
Background:
Molecular Tumor Boards (MTBs) play a critical role in translating next-generation sequencing (NGS) results into actionable treatment recommendations. Variant annotation is currently performed manually, which is labour-intensive, difficult to scale, and subject to variability. We evaluated whether automated extraction and annotation could improve efficiency while maintaining concordance with expert-curated MTB annotations.
Methods:
A rule-based pipeline was developed to extract genomic events from heterogeneous clinical NGS reports, including single-nucleotide variants, copy-number alterations, gene fusions, and mutational signatures. Extracted events were normalized into a standardized data model and annotated using the OncoKB Application Programming Interface (API) to retrieve oncogenicity classifications and therapeutic levels of evidence. The pipeline was iteratively refined across four development versions and retrospectively evaluated on 32 reports comprising 497 unique genomic events, including variants of unknown significance (VUS). Performance metrics (F1 and balanced accuracy scores) for the extraction and actionability assignment steps were calculated using the manually validated final version as the reference.
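The two metrics named above can be illustrated with a minimal Python sketch. It assumes each genomic event carries a binary label (for example, actionable or not) and that the labels of a pipeline version are compared against the validated reference. The abstract does not describe how labels were encoded, so the event keys, the toy labels, and all function names here are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for a binary per-event label."""
    tp: int
    fp: int
    fn: int
    tn: int


def confusion(reference: dict[str, bool], predicted: dict[str, bool]) -> Counts:
    """Compare predicted labels against the reference, event by event.

    Assumption: an event missing from `predicted` (e.g. not extracted)
    is treated as a negative prediction.
    """
    tp = fp = fn = tn = 0
    for event, truth in reference.items():
        guess = predicted.get(event, False)
        if truth and guess:
            tp += 1
        elif truth and not guess:
            fn += 1
        elif not truth and guess:
            fp += 1
        else:
            tn += 1
    return Counts(tp, fp, fn, tn)


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def balanced_accuracy(c: Counts) -> float:
    """Mean of sensitivity and specificity."""
    sensitivity = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    specificity = c.tn / (c.tn + c.fp) if (c.tn + c.fp) else 0.0
    return (sensitivity + specificity) / 2


# Toy labels (made up for illustration, not data from the study).
reference = {"event_1": True, "event_2": True, "event_3": False, "event_4": True}
predicted_v1 = {"event_1": True, "event_2": False, "event_3": False, "event_4": True}

counts = confusion(reference, predicted_v1)
f1 = f1_score(counts)
bal_acc = balanced_accuracy(counts)
```

Balanced accuracy is a reasonable companion to F1 in this setting because most events in a report (such as VUS) are typically non-actionable, so plain accuracy would be dominated by the negative class.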
Results:
Iterative refinement resulted in progressive improvements in extraction completeness and therapeutic annotation accuracy (Table 1). Manual review of a monthly MTB cohort requires up to 30 hours of expert effort, whereas automated processing completes in under 3 minutes.
Conclusions:
Automated extraction and OncoKB annotation can achieve performance comparable to expert manual curation when supported by iterative refinement and normalization rules. By version 4, the pipeline achieved complete concordance with validated MTB annotations and reduced processing time by up to 600-fold compared to manual review (up to 30 hours versus under 3 minutes per monthly cohort). This scalable approach enhances efficiency, consistency, and reproducibility in routine MTB workflows. Although automated processing achieved complete overall concordance, fusions and rearrangements may still benefit from expert review because of format heterogeneity in clinical reports.
Table 1. Iterative pipeline performance across development versions.