DOI: 10.1093/bjd/ljag086.620 ISSN: 0007-0963

PS05 Large language model (LLM) artificial intelligence (AI) patient simulation as a learning tool in psychodermatology: a pilot study with comparative consultation assessment between psychodermatologist, LLM AI and candidate self-assessment

Halim Hussain, Faisal Dubash, Andrew Affleck

Abstract

Delusional infestation (DI) is a challenging condition to treat, requiring skilful engagement, empathic communication and avoidance of collusion. Medical trainees report low confidence in such consultations and traditional teaching may not address niche interpersonal skills. Simulation-based education provides a safe environment for development, and large language model (LLM)-based artificial intelligence (AI) offers scalable simulated patient encounters. We evaluated the feasibility and educational impact of an LLM-based AI-simulated consultation for DI, and compared AI-generated assessment with consultant dermatologist ratings and student self-assessment. Eleven medical students completed a 30-min consultation with an LLM-based AI patient with DI. The AI generated formative feedback incorporating Gibbs’ Reflective Cycle and creative prompts from Eno’s Oblique Strategies. Student performance was rated by AI using a five-point Likert scale across domains: empathy, rapport, management of diagnostic uncertainty, avoidance of delusional validation, and clarity of management. Independently, a consultant psychodermatologist scored performance using the same framework. All students completed the simulation, indicating high feasibility. Strengths were noted in acknowledging distress (mean 4.5/5), maintaining empathy without validating delusions (4.4/5), respectful language (4.3/5) and rapport building (4.3/5). Challenges included managing diagnostic uncertainty (mean 3.4/5), avoiding premature psychiatric labelling (3.2/5) and addressing risk and functional impact (2.5–3.0/5). Overall, 82% of students recommended AI for communication practice and 73% felt more confident managing similar patients. Consultant dermatologist ratings were generally lower than AI scores (mean difference 0.8), particularly in risk assessment and biopsychosocial formulation, although patterns of strengths and weaknesses were concordant.
LLM-based AI patient simulation is feasible and educationally valuable for teaching complex psychodermatological consultations. While AI ratings tend to be higher than expert scores, concordance in performance patterns supports its use as a formative adjunct to specialist-led teaching. The AI-simulated patients were realistic and the AI-generated feedback was appropriate. Future studies are indicated to calibrate AI assessment and evaluate its impact in a larger cohort.
