Simulating expert oncologist-to-oncologist thinking and behavior: Multi-agent AI for precision oncology collaboration.
Deva Datta Reddy Jyothi, Samyukta Jyothi, Devarshini Jyothi
Background: Collaborative oncologist decision-making (tumor boards, subspecialty consultation, and second opinions) alters treatment recommendations, often in 20–40% of cases. However, maintaining expert synchrony is increasingly difficult because of rapid guideline updates, expanding therapeutic options, and workforce constraints, particularly in resource-limited settings. Single large language models (LLMs) produce guideline-discordant recommendations at substantial rates in oncology. We developed a multi-agent AI system in which agents act on behalf of organ- and modality-specific oncologists to simulate expert cognitive reasoning and behavioral collaboration.
Methods: The architecture comprises specialized agents representing medical, surgical, and radiation oncology, operating through synchronized perceive–reflect–plan loops. A live knowledge layer integrates continuously updated clinical guidelines and regulatory approvals. A structured simulation engine governs deterministic inter-agent deliberation. Training combines behavior cloning with reinforcement learning via multi-agent proximal policy optimization (MAPPO) to promote stable collaborative dynamics. The system was evaluated on 102 multi-institutional real-world breast cancer cases and 60 structured MEREDITH/AMIE-style benchmark tasks (total N = 162). Primary endpoints were guideline concordance, safety (major discordance), and comparative performance versus single-model baselines.
Results: The multi-agent system achieved 95.1% guideline concordance (95% CI, 91.2%–98.4%) on structured benchmarks and 93.1% (95% CI, 89.1%–96.5%) on real-world cases. No major guideline-discordant recommendations were observed. Compared with single-model baselines, absolute improvements ranged from 15 to 47 percentage points (all P < .001). Inter-agent verification eliminated the expert-rejected outputs observed in comparator systems.
Conclusions: A multi-agent AI framework acting on behalf of organ-specific oncologists to simulate expert thinking and behavioral synchrony significantly improves guideline adherence while eliminating major discordant outputs. This architecture provides a scalable approach for precision oncology decision support and structured training in rapidly evolving clinical environments.
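Illustrative sketch: The synchronized perceive–reflect–plan loop and deterministic deliberation described in Methods can be pictured with a small toy model. This is not the authors' implementation: the abstract does not specify agent internals, so the class `SpecialistAgent`, the function `deliberate`, the majority-support rule, the `max_rounds` limit, and the option labels are all hypothetical choices made for illustration. The agents here are simple rule-based stand-ins with no LLM, knowledge layer, or learned policy.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpecialistAgent:
    """Toy rule-based stand-in for one specialty agent (hypothetical)."""
    name: str
    ranked_options: list[str]                      # acceptable options, best first
    seen: list[dict[str, str]] = field(default_factory=list)

    def perceive(self, board: dict[str, str]) -> None:
        # Store a snapshot of every agent's current proposal.
        self.seen.append(dict(board))

    def reflect(self) -> Counter:
        # Count how many agents back each option in the latest snapshot.
        return Counter(self.seen[-1].values())

    def plan(self) -> str:
        # Choose the acceptable option with the most support;
        # ties fall back to this agent's own ranking.
        support = self.reflect()
        return min(
            self.ranked_options,
            key=lambda opt: (-support[opt], self.ranked_options.index(opt)),
        )

    def verify(self, recommendation: str) -> bool:
        # Placeholder for inter-agent verification. In this toy it always
        # passes, because agents only ever propose options they accept; it
        # marks where an independent check would sit.
        return recommendation in self.ranked_options


def deliberate(agents: list[SpecialistAgent], max_rounds: int = 5) -> Optional[str]:
    """Run synchronized rounds; return the consensus option or None."""
    ordered = sorted(agents, key=lambda a: a.name)   # fixed order, no randomness
    board = {a.name: a.ranked_options[0] for a in ordered}
    for _ in range(max_rounds):
        for agent in ordered:                        # everyone reads the same snapshot
            agent.perceive(board)
        board = {a.name: a.plan() for a in ordered}  # then everyone updates together
        proposals = set(board.values())
        if len(proposals) == 1:
            choice = proposals.pop()
            if all(a.verify(choice) for a in ordered):
                return choice
    return None                                      # unresolved: escalate to humans


def example_agents() -> list[SpecialistAgent]:
    # Hypothetical option labels, used only to exercise the loop.
    return [
        SpecialistAgent("medical", ["neoadjuvant_first", "surgery_first"]),
        SpecialistAgent("surgical", ["surgery_first", "neoadjuvant_first"]),
        SpecialistAgent("radiation", ["neoadjuvant_first"]),
    ]


print(deliberate(example_agents()))  # neoadjuvant_first
```

Because every agent reads the same snapshot before any agent updates, and agents are processed in a fixed order without random sampling, the same inputs always produce the same outcome. Returning `None` after `max_rounds` is one possible way to model an unresolved case that would be referred to human review.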