DOI: 10.1161/circ.148.suppl_1.14422 ISSN: 0009-7322

Abstract 14422: Accuracy of a Popular Online Chat-Based AI Model in Providing Recommendations on the Management of Atrial Fibrillation in Accordance With the 2019 AHA/ACC/HRS Guidelines: A Patient versus Physician-Centered Approach

Joseph Kassab, Abdel Hadi El Hajjar, Joseph El Dahdah, Michel Chedid El Helou, Habib Layoun, Shady Nakhla, Walid I Saliba, Mohamed H Kanj, Samir R Kapadia, Serge C Harb

Introduction: Atrial fibrillation (AF) is associated with reduced quality of life (QOL) and numerous cardiovascular (CV) events. AF patients often seek information beyond what their physician provides, frequently turning to Internet sources for guidance.

Hypothesis: The rise of online chat-based AI interactions raises the question of whether this technology can assist in patient education and provide guideline-based recommendations. We sought to evaluate whether ChatGPT 4.0 could provide accurate patient- and physician-centered recommendations for AF.

Methods: We created 2 sets of 15 questions, one from a patient’s (PA) and another from a physician’s (PH) perspective. PA questions were formulated after querying “Google Trends” for the most-searched terms related to AF. PH questions were formulated by expert cardiologists. Each question was submitted to the AI online model. Each cardiologist received 1 set of 30 answers and graded each answer as “accurate,” “accurate with missing information (MI),” or “inaccurate.” An answer was considered “accurate” if it included appropriate and complete information found in the currently published guidelines, “inaccurate” if false information was provided, and “accurate with MI” if essential information was missing according to the grading physician’s clinical expertise. Discrepancies in the PH gradings were identified and resolved.

Results: 13/15 (87%) PA answers were graded as accurate; the remaining 2/15 were accurate with MI, and no PA answers were inaccurate. 6/15 (40%) PH answers were accurate, 7/15 (46%) were accurate with MI, and 2/15 (13%) were inaccurate. The AI chatbot provided significantly more accurate answers to PA questions than to PH questions (p<0.05, Fisher’s exact test).
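The reported significance test can be reproduced from the counts above. As a sketch, assuming the three grades were dichotomized into “accurate” versus “not fully accurate” (accurate with MI plus inaccurate) — the abstract does not state the dichotomization explicitly — a two-sided Fisher’s exact test on the resulting 2x2 table (PA 13 vs 2, PH 6 vs 9) can be computed with only the standard library:

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1, n = a + c, a + b + c + d

    def p_table(x):
        # Probability of x in the top-left cell, margins fixed
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_table(a)
    lo = max(0, row1 - (n - col1))  # smallest feasible top-left cell
    hi = min(row1, col1)            # largest feasible top-left cell
    # Small tolerance guards against floating-point ties
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# Observed counts, dichotomized as "accurate" vs "not fully accurate"
# (accurate with MI + inaccurate): PA 13 vs 2, PH 6 vs 9 (assumption).
p = fisher_exact_two_sided([[13, 2], [6, 9]])
print(f"p = {p:.4f}")  # p = 0.0209, consistent with the reported p < 0.05
```

Under this assumed dichotomization the exact two-sided p-value is about 0.021, in line with the reported p<0.05.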

Conclusions: ChatGPT 4.0 provided largely accurate answers to PA questions, showing promise as a tool to assist in patient education on AF. However, the accuracy of answers to PH questions was suboptimal, indicating that this tool should not be used by physicians for clinical decision support in AF.
