DOI: 10.1093/ajrccm/aamag286.308 ISSN: 1073-449X

A103-20 Leveraging Observational and Randomized Data to Enhance Prediction of Individualized Treatment Effects for Antibiotic Selection in Hospitalized Patients

A Spicer, E J Graham Linck, E T Qian, K A Carey, J D Casey, K P Seitz, A Stey, G Chen, T W Rice, M W Semler, M M Churpek

Abstract

Rationale

Causal machine learning models that predict individualized treatment effects (ITEs) for patients using baseline characteristics (e.g., age, vital signs) have been increasingly applied to critical care randomized clinical trials (RCTs). These models have successfully identified heterogeneity of treatment effects for several interventions, informing more personalized treatment. However, RCTs are often small and may not be representative of the population of interest. In contrast, observational data from electronic health records (EHR) provide granular data for large numbers of patients, but causal analysis is complicated by measured and unmeasured confounding. We hypothesize that developing ITE models using a debiasing procedure that leverages both randomized and observational data will outperform models developed using either data type alone.

Methods

The randomized data for this analysis came from the single-center Antibiotic Choice on Renal Outcomes (ACORN) trial, which compared survival without incident acute kidney injury (AKI) at 14 days between two antibiotics commonly prescribed for acute infection: cefepime and piperacillin-tazobactam. The observational EHR data were collected from adult patients hospitalized between 2021 and 2024 at Northwestern Memorial Hospital (NMH), Chicago, IL. Observational data were processed for target-trial emulation by filtering for patients who met ACORN RCT eligibility criteria and aligning treatment, outcome, and covariate definitions with those in the trial. Baseline covariates included demographics, lab values, and comorbidities. A randomly selected third of the randomized dataset was held out for testing. Six ITE modeling approaches were each trained in three ways: 1) in randomized data only, 2) in observational data only, and 3) in observational data and then debiased for unobserved confounding using a bias model learned in the randomized data (Kallus et al., 2018). Model performance was evaluated using the Qini coefficient in the held-out test set.
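As an illustration of the third training strategy, the following is a minimal sketch of a two-step debiasing procedure in the spirit of Kallus et al. (2018): an ITE estimate is first fit in confounded observational data, and a correction term is then fit in randomized data and added to it. The abstract does not specify the learners, the form of the bias model, or the randomization probability, so the linear models, the linear correction, the 1:1 randomization (e = 0.5), the synthetic data, and all function names below are assumptions made only for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def fit_t_learner(X, t, y):
    """Observational ITE estimate omega(x): difference of per-arm regressions."""
    c1 = fit_linear(X[t == 1], y[t == 1])
    c0 = fit_linear(X[t == 0], y[t == 0])
    return lambda Xn: predict_linear(c1, Xn) - predict_linear(c0, Xn)

def fit_debiased(omega, X_rct, t_rct, y_rct, e=0.5):
    """Fit a linear correction eta(x) in randomized data; return omega + eta."""
    # Under randomization with known probability e, the transformed outcome
    # q * y has conditional mean equal to the true treatment effect.
    q = t_rct / e - (1 - t_rct) / (1 - e)
    residual = q * y_rct - omega(X_rct)
    eta = fit_linear(X_rct, residual)
    return lambda Xn: omega(Xn) + predict_linear(eta, Xn)

def simulate(n, randomized, rng):
    """Synthetic data (assumption): true effect is 1 + x, u is unobserved."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    # In observational data, the hidden variable u drives treatment choice.
    p_treat = np.full(n, 0.5) if randomized else 1 / (1 + np.exp(-2 * u))
    t = (rng.random(n) < p_treat).astype(float)
    y = t * (1 + x) + u * (1 + 0.5 * x) + 0.5 * rng.normal(size=n)
    return x.reshape(-1, 1), t, y

rng = np.random.default_rng(0)
X_obs, t_obs, y_obs = simulate(20000, randomized=False, rng=rng)
X_rct, t_rct, y_rct = simulate(5000, randomized=True, rng=rng)

omega = fit_t_learner(X_obs, t_obs, y_obs)             # confounded estimate
tau_hat = fit_debiased(omega, X_rct, t_rct, y_rct)     # debiased estimate

X_test = rng.normal(size=(2000, 1))
tau_true = 1 + X_test[:, 0]
mse_obs = np.mean((omega(X_test) - tau_true) ** 2)
mse_debiased = np.mean((tau_hat(X_test) - tau_true) ** 2)
print(f"MSE observational only: {mse_obs:.3f}")
print(f"MSE after debiasing:    {mse_debiased:.3f}")
```

In this toy setting the hidden variable inflates the observational estimate, and the correction learned in the randomized sample removes most of that bias. With flexible learners in place of the linear models, the correction term would typically be kept simple so that it can be estimated from the smaller randomized sample.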

Results

The randomized dataset was smaller than the observational dataset (2511 versus 6039 patients), and in both datasets, the majority of patients survived without incident AKI (74.2% versus 62.1%). None of the ITE models trained in either dataset alone had a statistically significant Qini coefficient. In contrast, incorporating both observational EHR and RCT data yielded a statistically significant Qini coefficient for all but one of the ITE models (Figure 1). Furthermore, for all six ITE modeling approaches, the debiased model achieved a higher Qini coefficient than the corresponding model trained in either data type alone.
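For readers unfamiliar with the metric, the sketch below computes one common form of the Qini coefficient: patients in a randomized test set are ranked by predicted ITE, a cumulative incremental-outcome curve is built, and the area between that curve and the straight line expected under random ranking is taken. Several variants of the Qini coefficient exist, and the abstract does not state which variant, normalization, or significance test was used, so this definition, the synthetic data, and the function names are assumptions for illustration only.

```python
import numpy as np

def qini_curve(y, t, score):
    """Cumulative incremental outcomes when ranking by score (high to low)."""
    order = np.argsort(-score, kind="stable")
    y, t = y[order], t[order]
    n_t = np.cumsum(t)                 # treated patients among the top k
    n_c = np.cumsum(1 - t)             # control patients among the top k
    y_t = np.cumsum(y * t)             # treated responders among the top k
    y_c = np.cumsum(y * (1 - t))       # control responders among the top k
    # Rescale control responders to the treated group size; use 0 while no
    # control patient has been seen yet.
    ratio = np.divide(n_t, n_c, out=np.zeros_like(n_t, dtype=float), where=n_c > 0)
    return np.concatenate([[0.0], y_t - y_c * ratio])

def qini_coefficient(y, t, score):
    """Area between the Qini curve and the random-ranking line."""
    curve = qini_curve(y, t, score)
    frac = np.linspace(0, 1, len(curve))
    diff = curve - frac * curve[-1]
    # Trapezoidal rule, written out to avoid NumPy version differences.
    return float(np.sum((diff[1:] + diff[:-1]) / 2 * np.diff(frac)))

# Synthetic randomized test set (assumption): binary outcome, effect 0.3 * x.
rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-1, 1, size=n)
t = rng.integers(0, 2, size=n).astype(float)
tau = 0.3 * x
y = (rng.random(n) < 0.5 + tau * (t - 0.5)).astype(float)

q_good = qini_coefficient(y, t, score=tau)    # ranking by the true effect
q_bad = qini_coefficient(y, t, score=-tau)    # ranking in reverse order
print(f"Qini, correct ranking:  {q_good:.1f}")
print(f"Qini, reversed ranking: {q_bad:.1f}")
```

A model whose ranking carries no information about treatment benefit would be expected to give a value near zero, which is why a significance test against zero is a natural way to compare ITE models.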

Conclusions

Leveraging strengths from randomized and observational data through debiasing yields higher-performing ITE models than either data type alone, increasing the potential utility and generalizability of these predictions for personalizing care.

This abstract is funded by: NIH/NIGMS, R35-GM145330 and NLM 5T15LM007359
