DOI: 10.1001/jamanetworkopen.2026.11885 ISSN: 2574-3805

Medical Record Abstraction for Quality Improvement in Sepsis Care Using Artificial Intelligence

Aaron Boussina, Claire Allison, Kimberly Quintero, Sonia Jain, Chad VanDenBerg, Michael Hogarth, Amy M. Sitapati, Karandeep Singh, Atul Malhotra, Michael T. McCurdy, Christopher A. Longhurst, James S. Ford, Theodore Chan, Paul Ishimine, Richard Childers, Shamim Nemati, Gabriel Wardi

Importance

Hospital quality reporting remains a manual, costly process with critical limitations as a mechanism to improve care outcomes.

Objective

To assess whether near-real-time quality measurement, enabled by large language models (LLMs), can improve quality performance as measured by the Centers for Medicare & Medicaid Services (CMS) Severe Sepsis and Septic Shock Management Bundle (SEP-1) quality metric.

Design, Setting, and Participants

This single-blind, unstratified, cluster randomized trial was conducted between December 13, 2024, and July 8, 2025, at 2 academic emergency departments (EDs) within the University of California, San Diego (UCSD) health system. Participants included all 66 attending physicians who practiced in the UCSD EDs and worked more than 3 shifts per month prior to study initiation.

Intervention

Participants were randomized to receive either targeted feedback based on LLM-determined compliance with SEP-1 at the time of patient discharge or the standard process.

Main Outcomes and Measures

The primary outcome was overall compliance with SEP-1. Secondary outcomes included expert agreement with the LLM SEP-1 determination, 30-day mortality, and intensive care unit admissions of patients with severe sepsis and/or septic shock in the ED. Effect sizes were estimated from a mixed-effects logistic regression model with the intervention group as a fixed effect and a random intercept for physician.
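The model described above can be written out as follows. This is a minimal sketch under assumptions rather than the authors' exact specification: the symbols $Y_{ij}$, $X_j$, $u_j$, and $\sigma_u^2$ are introduced here for illustration, and the normal distribution for the random intercept is the conventional choice, not something the abstract states.

```latex
% Y_{ij} = 1 if patient i treated by physician j received SEP-1-compliant care
% X_j    = 1 if physician j was randomized to the intervention group, 0 otherwise
% u_j    = random intercept for physician j (assumed normally distributed)
\begin{align}
  \operatorname{logit} \Pr(Y_{ij} = 1 \mid X_j, u_j) &= \beta_0 + \beta_1 X_j + u_j,
  & u_j &\sim \mathcal{N}(0, \sigma_u^2), \\
  \text{odds ratio} &= \exp(\beta_1).
\end{align}
```

Under this formulation, the intervention effect is a single coefficient $\beta_1$ shared by all physicians, while $u_j$ absorbs the correlation among patients treated by the same physician, which is what the cluster randomized design requires.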

Results

The study population included 66 physicians who treated 301 patients (121 in the control group and 180 in the intervention group; median age, 64.3 [IQR, 51.1-75.7] years; 171 [56.8%] male; 52 [17.3%] with chronic kidney disease; 52 [17.3%] with chronic heart failure) who met CMS inclusion criteria for SEP-1. Physicians in the control group had a SEP-1 compliance rate of 70.1%, while those in the intervention group had a rate of 82.9%. Assignment to the intervention group resulted in a 13.0% absolute improvement in SEP-1 compliance (95% CI, 2.5%-23.4%; odds ratio, 2.10 [95% CI, 1.15-3.81]; P = .02) in the mixed-effects model. The largest difference between the intervention group and control group was in noncompletion of the 30-mL/kg fluid bolus component (3 of 180 [1.7%] vs 16 of 121 [13.2%]), a documentation-sensitive component of the quality measure. Agreement between LLM determination and expert review was 92%. No significant differences existed in intensive care unit admissions or 30-day mortality.
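The reported figures can be sanity-checked with simple arithmetic. The sketch below recomputes the fluid bolus noncompletion percentages from the stated counts and derives crude (unadjusted) contrasts from the two reported compliance rates. The function and variable names are hypothetical. The crude values ignore clustering by physician, so they are expected to sit near the mixed-effects estimates without matching them.

```python
def proportion(events: int, total: int) -> float:
    """Return events / total, guarding against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


# Fluid bolus noncompletion, using the counts reported in the Results.
intervention_noncompletion = proportion(3, 180)
control_noncompletion = proportion(16, 121)
bolus_difference = control_noncompletion - intervention_noncompletion

# Group-level SEP-1 compliance rates as reported (only percentages are given,
# so no patient-level counts are assumed here).
control_compliance = 0.701
intervention_compliance = 0.829

# Crude contrasts that ignore clustering by physician; these are NOT the
# trial's mixed-effects estimates (13.0% and odds ratio 2.10).
crude_difference = intervention_compliance - control_compliance
crude_odds_ratio = odds(intervention_compliance) / odds(control_compliance)

print(f"Bolus noncompletion, intervention: {intervention_noncompletion:.1%}")
print(f"Bolus noncompletion, control:      {control_noncompletion:.1%}")
print(f"Bolus noncompletion difference:    {bolus_difference:.1%}")
print(f"Crude compliance difference:       {crude_difference:.1%}")
print(f"Crude odds ratio:                  {crude_odds_ratio:.2f}")
```

The crude difference of about 12.8 percentage points and the crude odds ratio of about 2.07 are close to the model-based estimates, as one would expect when the random intercept variance is modest.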

Conclusions and Relevance

In this cluster randomized trial of artificial intelligence (AI)-enabled medical record abstraction for sepsis care, rapid assessment of SEP-1 performance and targeted feedback improved overall compliance with the measure. Integrating AI-driven quality measurement into clinical workflows may address limitations in existing hospital quality reporting and better support a learning health system.

Trial Registration

ClinicalTrials.gov Identifier: NCT07581340
