DOI: 10.12688/f1000research.181764.1 ISSN: 2046-1402

Development of Predictive Analytics Model for Early Detection of Depression in Women

Chidi Betrand, Chinna Orish, Oluchukwu Ekwealor, Chinwe Onukwugha, Mercy Benson-Emenike, Nneka Oragba, Douglas Kelechi, Christopher Ofoegbu, Toochi Ewunonu
Introduction: Depression remains a significant global health concern, and women are particularly affected, with gender-specific symptom patterns that conventional predictive models do not address. This study addresses that gap by developing and evaluating a multimodal deep learning system designed for early prediction of depression in women.

Methods: The DAIC-WOZ (Distress Analysis Interview Corpus – Wizard of Oz) dataset from the USC ICT/SAIL Lab, University of Southern California, was used for this work. Access to the dataset was granted after the End-User License Agreement was completed and an approval email was received. The framework employs a late-fusion approach integrating three data streams: linguistic information from interview transcripts, represented as Robustly Optimized BERT Pretraining Approach (RoBERTa) embeddings; acoustic information (Mel-frequency cepstral coefficients, MFCCs); and visual information (facial Action Units). Each modality is processed independently before the resulting embeddings are concatenated for the final classification. Training was closely monitored, and the optimal checkpoint was selected based on validation performance.

Results: The final model achieved an F1-score of 0.50 for the depressed class on the unseen test dataset. Ablation studies indicated the dominance of linguistic features, which achieved an area under the curve (AUC) of 0.77, while acoustic features achieved an AUC of 0.59. Visual features, however, performed poorly (AUC = 0.36), suggesting that they introduced noise rather than informative signal. To further explore model behavior, a case study on a confirmed depressed participant was conducted. For this participant, the framework's prediction was made at 50% confidence, demonstrating greater sensitivity than the more cautious unimodal baselines.

Conclusion: These findings indicate that multimodal deep learning is a promising direction for gender-specific depression prediction. Among the modalities, linguistic cues remain the strongest indicators of depression in women, consistent with their central role in effective predictive modeling.
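
The late-fusion step described in the Methods can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the encoders are toy random linear maps standing in for trained networks, and the names `make_encoder` and `fuse_and_classify` as well as the audio, visual, and embedding dimensions are assumptions chosen for illustration (768 matches the hidden size of RoBERTa-base; the abstract does not state which variant was used).

```python
import math
import random


def make_encoder(in_dim, out_dim, rng):
    """Toy per-modality encoder (hypothetical): fixed random linear map + tanh."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def encode(features):
        if len(features) != in_dim:
            raise ValueError(f"expected {in_dim} features, got {len(features)}")
        return [math.tanh(sum(w * x for w, x in zip(row, features)))
                for row in weights]

    return encode


def fuse_and_classify(embeddings, clf_weights, clf_bias):
    """Late fusion: concatenate modality embeddings, then apply a logistic layer."""
    fused = [value for emb in embeddings for value in emb]
    if len(fused) != len(clf_weights):
        raise ValueError("classifier weights do not match fused embedding size")
    logit = sum(w * x for w, x in zip(clf_weights, fused)) + clf_bias
    return fused, 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Assumed dimensions, for illustration only.
TEXT_DIM, AUDIO_DIM, VISUAL_DIM, EMB_DIM = 768, 13, 17, 8

# Each modality gets its own encoder and is processed independently.
text_encoder = make_encoder(TEXT_DIM, EMB_DIM, rng)
audio_encoder = make_encoder(AUDIO_DIM, EMB_DIM, rng)
visual_encoder = make_encoder(VISUAL_DIM, EMB_DIM, rng)

# Random placeholder inputs standing in for one interview's features.
text_features = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
audio_features = [rng.gauss(0, 1) for _ in range(AUDIO_DIM)]
visual_features = [rng.gauss(0, 1) for _ in range(VISUAL_DIM)]

embeddings = [
    text_encoder(text_features),
    audio_encoder(audio_features),
    visual_encoder(visual_features),
]

# Untrained classifier weights; in practice these would be learned.
clf_weights = [rng.uniform(-0.5, 0.5) for _ in range(3 * EMB_DIM)]
fused, prob = fuse_and_classify(embeddings, clf_weights, clf_bias=0.0)

print(f"fused embedding size: {len(fused)}")
print(f"predicted probability of the depressed class: {prob:.3f}")
```

Because the modalities meet only at the concatenation step, an ablation of the kind reported in the Results can be approximated in such a design by dropping one embedding from the list and resizing the classifier accordingly.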
