Can AI-Based Large Language Models Provide Reliable Information on Postpartum Depression? A Systematic Content Analysis

doi:10.62425/esbder.1911961

ISSN: 2687-2110

Hazan Tomar Bozkurt, Abdullah Bozkurt

Objective: This study aimed to systematically evaluate and compare the information quality, scientific reliability, and readability of the responses provided by four widely used large language models (LLMs), namely ChatGPT, Google Gemini, DeepSeek, and Claude, to frequently asked questions regarding postpartum depression (PPD).

Methods: This descriptive cross-sectional study assessed 40 LLM-generated responses concerning PPD. Information quality was assessed using the DISCERN tool, scientific reliability was evaluated using a 5-point Likert scale, and readability was measured using the Flesch Reading Ease Score (FRES) and Flesch–Kincaid Grade Level (FKGL).

Results: All four models met the adequacy thresholds for information quality and scientific reliability. DeepSeek achieved the highest mean DISCERN score, which was significantly higher than that of Claude. No statistically significant difference was observed in scientific reliability scores across the four models. Regarding readability, Claude produced significantly more complex texts than ChatGPT and Google Gemini based on FKGL scores. Mean FRES values for all models fell within the "difficult" to "fairly difficult" range, with no significant between-group difference.

Conclusion: All four LLMs demonstrated adequate information quality and scientific reliability regarding PPD, with DeepSeek exhibiting the highest information quality. However, substantial deficiencies were identified in readability across all models, with Claude producing the most linguistically complex outputs. These findings suggest that while LLMs show promising potential as complementary health information sources for PPD, their outputs require simplification to meet recommended readability standards for patient education, particularly for postpartum populations who may experience cognitive and emotional barriers to comprehending complex health information.
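For readers unfamiliar with the two readability indices named in the Methods, the sketch below computes FRES and FKGL from their standard published formulas. It is illustrative only and is not the tool the authors used, which the abstract does not name. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so its scores may differ from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (heuristic, an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent "e" as non-syllabic, except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (FRES, FKGL) for an English text using the standard formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)

    # Flesch Reading Ease Score: higher values mean easier text.
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Flesch-Kincaid Grade Level: approximate US school grade required.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fres, fkgl


if __name__ == "__main__":
    fres, fkgl = readability("The cat sat. The dog ran.")
    print(f"FRES={fres:.2f}, FKGL={fkgl:.2f}")
```

Both indices depend only on average sentence length and average syllables per word, so shorter sentences and simpler vocabulary raise FRES and lower FKGL.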
