A Survey of Clinicians’ Views of the Utility of Large Language Models
Matthew Spotnitz, Betina Idnay, Emily R Gordon, Rebecca Shyu, Gongbo Zhang, Cong Liu, James J Cimino, Chunhua Weng
- Health Information Management
- Computer Science Applications
- Health Informatics
Objective: Large language models (LLMs) like ChatGPT are powerful algorithms that have been shown to produce human-like text from input data. Biomedical informatics experts have proposed and evaluated a number of potential clinical applications of this technology. However, few studies have surveyed healthcare providers for their opinions about whether the technology is fit for use.
Materials and Methods: We distributed a validated mixed-methods survey to gauge practicing clinicians’ comfort with LLMs for a breadth of tasks in clinical practice, research, and education. The tasks were selected from the literature.
Results: A total of 30 clinicians fully completed the survey. Of the 23 tasks, 16 were rated positively by more than 50% of the respondents. Based on our qualitative analysis, healthcare providers considered LLMs to be efficient and to have excellent synthesis skills. However, our respondents were concerned that LLMs could generate false information and propagate training data bias.
Discussion: Our survey respondents were most comfortable with scenarios that allow LLMs to function in an assistive role, like a physician extender or trainee.
Conclusion: In a uniquely rigorous and comprehensive mixed-methods survey of clinicians about LLM use, healthcare providers supported the use of LLMs in healthcare for many tasks, especially in assistive roles. There is a need for continued human-centered development of both LLMs and artificial intelligence (AI) in general.