Using a large language model for echocardiographic identification of right ventricular dysfunction in patients with pulmonary hypertension
L Ekambarapu, A Pendyal, M Alwakeel, A Rajaratnam
Abstract
Background
Pulmonary hypertension (PH) affects approximately 1% of the global population and carries substantial morbidity and mortality. As pulmonary pressures rise, the right ventricle (RV) undergoes structural and functional changes that progress from right ventricular dysfunction (RVD) to RV failure. Transthoracic echocardiography (TTE) is noninvasive and can identify RVD, though standard RV measurements are only modestly predictive. Unstructured TTE reports can offer richer information on RV function, but extracting the relevant data manually is labor-intensive. Rule-based, regular expression–driven terminology mapping can extract individual variables but cannot represent the multifactorial nature of RVD. Large language models (LLMs) can interpret report text at scale in clinically meaningful terms, enabling extraction of RVD features to support earlier diagnosis and management.
Purpose
To compare an LLM-based extraction method to a conventional rules-based schema for identifying echocardiographic features associated with PH and RVD in a large TTE dataset.
Methods
MIMIC-III NOTE2NUM echocardiography reports (n = 45,794) were analyzed using GPT-4o–based LLM extraction deployed within a secure health system enclave and were benchmarked against echocardiographic measurements defined in the MIMIC-III dictionary schema. In MIMIC-III, PH was recorded qualitatively (mild/moderate/severe) based on tricuspid regurgitant (TR) jet velocity and then re-coded as present vs. absent. The LLM-based extraction defined RVD as (1) RV structural abnormality (≥1 of hypertrophy, dilation, or wall hypo-/akinesis) or (2) RV pressure/volume overload (≥2 of estimated right atrial pressure > 8 mmHg, TR jet velocity > 2.8 m/s, fractional area change < 35%, tricuspid annular planar systolic excursion < 17 mm, S′ < 9.5 cm/s, or E/e′ > 14). PH was defined as estimated pulmonary artery systolic pressure > 35 mmHg or qualitative documentation of PH.
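The RVD and PH definitions above can be sketched as a small rule set applied to fields that have already been extracted from a report. This is only an illustration under stated assumptions, not the study's pipeline: the `RVFindings` container, its field names, and the choice to treat a missing measurement as not meeting its criterion are all hypothetical. The thresholds are the ones given in the Methods.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RVFindings:
    """Hypothetical container for per-report extracted fields (names are assumptions)."""
    hypertrophy: bool = False
    dilation: bool = False
    wall_hypo_or_akinesis: bool = False
    rap_mmhg: Optional[float] = None         # estimated right atrial pressure
    tr_velocity_m_s: Optional[float] = None  # TR jet velocity
    fac_percent: Optional[float] = None      # fractional area change
    tapse_mm: Optional[float] = None         # tricuspid annular planar systolic excursion
    s_prime_cm_s: Optional[float] = None     # S'
    e_over_e_prime: Optional[float] = None   # E/e'
    pasp_mmhg: Optional[float] = None        # estimated pulmonary artery systolic pressure
    ph_documented: bool = False              # qualitative documentation of PH


def _above(value: Optional[float], threshold: float) -> bool:
    # Assumption: a missing measurement does not count toward a criterion.
    return value is not None and value > threshold


def _below(value: Optional[float], threshold: float) -> bool:
    return value is not None and value < threshold


def has_structural_abnormality(f: RVFindings) -> bool:
    # Criterion (1): at least one of hypertrophy, dilation, wall hypo-/akinesis.
    return f.hypertrophy or f.dilation or f.wall_hypo_or_akinesis


def has_overload(f: RVFindings) -> bool:
    # Criterion (2): at least two of the six pressure/volume overload markers.
    markers = [
        _above(f.rap_mmhg, 8),
        _above(f.tr_velocity_m_s, 2.8),
        _below(f.fac_percent, 35),
        _below(f.tapse_mm, 17),
        _below(f.s_prime_cm_s, 9.5),
        _above(f.e_over_e_prime, 14),
    ]
    return sum(markers) >= 2


def has_rvd(f: RVFindings) -> bool:
    return has_structural_abnormality(f) or has_overload(f)


def has_ph(f: RVFindings) -> bool:
    return _above(f.pasp_mmhg, 35) or f.ph_documented
```

For example, a report with only `rap_mmhg=10` and `tapse_mm=15` would meet the overload criterion (two markers) and therefore RVD, while a report with a single marker and no structural finding would not.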
Results
LLM extraction identified PH in 15,394 (33.6%), RV pressure/volume overload in 14,449 (31.6%), and RV structural abnormalities in 11,955 (26.1%). Co-occurrence was common: overload + structural changes in 9,380 (20.5%), overload + PH in 9,756 (21.3%), structural changes + PH in 6,183 (13.5%), and all three in 5,620 (12.3%). Using the MIMIC-III dictionary schema, PH prevalence was similar (15,371; 33.6%), but RV overload fields were captured less often (pressure overload 1,357 [3.0%], volume overload 1,128 [2.5%], pressure + volume overload 1,093 [2.4%]; any overload field 3,578 [7.8%]), and RV pressure/volume overload with PH was identified in only 731 (1.6%).
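As a consistency check on the figures above, the reported percentages can be recomputed from the counts and the denominator of 45,794 reports. The sketch below is illustrative only; the labels and the `pct` helper are hypothetical names, and the counts are copied from the Results text.

```python
N_REPORTS = 45_794

# (label, count, percentage as reported in the Results)
REPORTED = [
    ("LLM: PH", 15_394, 33.6),
    ("LLM: RV overload", 14_449, 31.6),
    ("LLM: RV structural abnormality", 11_955, 26.1),
    ("LLM: overload + structural", 9_380, 20.5),
    ("LLM: overload + PH", 9_756, 21.3),
    ("LLM: structural + PH", 6_183, 13.5),
    ("LLM: all three", 5_620, 12.3),
    ("Schema: PH", 15_371, 33.6),
    ("Schema: pressure overload", 1_357, 3.0),
    ("Schema: volume overload", 1_128, 2.5),
    ("Schema: pressure + volume overload", 1_093, 2.4),
    ("Schema: any overload field", 3_578, 7.8),
    ("Schema: overload with PH", 731, 1.6),
]


def pct(count: int, total: int = N_REPORTS) -> float:
    """Percentage of reports, rounded to one decimal place."""
    return round(100 * count / total, 1)


def mismatches() -> list:
    """Return the labels whose recomputed percentage differs from the reported one."""
    return [label for label, count, reported in REPORTED if pct(count) != reported]


if __name__ == "__main__":
    for label, count, reported in REPORTED:
        print(f"{label}: {count} / {N_REPORTS} = {pct(count)}% (reported {reported}%)")
```

The three schema overload categories (1,357 + 1,128 + 1,093) also sum to the reported 3,578 for any overload field.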
Conclusions
LLM-based extraction outperforms rules-based schemas for identifying complex disease states not defined by any single variable. By synthesizing multifactorial signals, LLMs can phenotype RVD with higher fidelity and support population-level assessment. Further validation using complementary imaging, invasive hemodynamics, and clinical outcomes is needed to confirm clinical utility.