(165) Assessing Artificial Intelligence Quality in the Evaluation and Treatment of Erectile Dysfunction

DOI: 10.1093/jsxmed/qdae001.156 ISSN: 1743-6095

M Pathuri, O Marciano, P Barrtero Guimaraes, E Kocjancic, O Raheem

Urology
Reproductive Medicine
Endocrinology
Endocrinology, Diabetes and Metabolism
Psychiatry and Mental health

Abstract

Introduction

Since November 2022, Artificial Intelligence (AI) chatbots have grown in popularity, including for questions about urologic conditions. However, their accuracy and quality have not been evaluated systematically. In this study, we sought to assess the accuracy and quality of AI chatbots in the management of Erectile Dysfunction (ED).

Objective

We aim to evaluate the accuracy and quality of open-source language models in fielding common clinical questions pertaining to ED, compared with board-certified urologists.

Methods

Two AI open-source language models, ChatGPT and Google Bard, were asked 15 standard questions related to ED, some of which covered the causes, risk factors, and treatment options of ED. Two board-certified urologists were given the same questions on a standard survey. A third, blinded board-certified urologist served as grader, scoring each response against the AUA guidelines for ED on a Likert scale to assess its accuracy, robustness, and bias. The Likert scores for the urologist and AI responses were then aggregated.
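The grading and aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not specify the Likert range or the aggregation rule, so a 1-5 scale and a simple mean are assumed here, and the function names (`aggregate`, `percent_lower`) and all scores are hypothetical placeholders, not study data.

```python
from statistics import mean


def aggregate(grades_by_criterion):
    """Mean Likert grade for one responder, pooled over all criteria and questions.

    Assumption: simple mean aggregation; the abstract does not state the rule used.
    """
    pooled = [g for grades in grades_by_criterion.values() for g in grades]
    return mean(pooled)


def percent_lower(reference, other):
    """How much lower `other` is than `reference`, as a percentage of `reference`."""
    return 100.0 * (reference - other) / reference


if __name__ == "__main__":
    # Hypothetical placeholder grades on an assumed 1-5 Likert scale.
    # These are NOT the study's data; three questions are shown instead of 15.
    ai_grades = {
        "accuracy": [5, 4, 5],
        "robustness": [4, 5, 4],
        "bias": [5, 5, 4],
    }
    urologist_grades = {
        "accuracy": [3, 3, 2],
        "robustness": [3, 2, 3],
        "bias": [3, 3, 3],
    }

    ai_score = aggregate(ai_grades)
    urologist_score = aggregate(urologist_grades)
    print(f"AI mean grade: {ai_score:.2f}")
    print(f"Urologist mean grade: {urologist_score:.2f}")
    print(f"Urologist score is {percent_lower(ai_score, urologist_score):.0f}% lower")
```

A relative difference of this kind is one plausible reading of a statement such as "X% lower"; the significance testing behind the reported p-values is not described in the abstract and is not modeled here.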

Results

Overall, AI responses were significantly more accurate (p<0.01), more robust (p<0.01), and less biased (p<0.01) than the urologists' responses. Google Bard had the highest scores across all three measures, followed by ChatGPT. The urologists' scores were approximately 38% lower than those of the AI responses.

Conclusions

This study suggests that AI responses were superior to those of urologists in accuracy, robustness, and bias on questions pertaining to the management of ED. Although chatbots have a promising role in urologic conditions including ED, their widespread clinical use and adoption warrant further evaluation in the context of clinical decision-making and enhancing patient care.

Disclosure

No.
