DOI: 10.1002/alz.077629 ISSN: 1552-5260

A novel sociodemographically‐neutral speech analysis algorithm to detect cognitive impairment in a Spanish population

Peru Gabirondo, Alberto Coca, Alyssa Kaser, Jeffrey Schaffert, Munro Cullum, Javier Jiménez Raboso, Pablo Guardia, Laura Lacritz
  • Psychiatry and Mental health
  • Cellular and Molecular Neuroscience
  • Geriatrics and Gerontology
  • Neurology (clinical)
  • Developmental Neuroscience
  • Health Policy
  • Epidemiology



Early detection of mild cognitive impairment (MCI) and dementia is crucial for initiation of care. Brief, cost‐effective cognitive screening instruments are needed to identify individuals who require further evaluation. This study presents preliminary data on a new screening technology that uses automated speech‐recording analysis software in a Spanish population.


Data were collected from 148 Spanish speakers in Spain (mean age = 74.4 years, mean education = 12.93 years, 43.3% male), clinically diagnosed as cognitively normal (CN, n = 78) or impaired (MCI, n = 49; all‐cause dementia, n = 21). Participants were recorded performing four language tasks (animal fluency, alternating fluency between sports and fruits, phonemic “F” fluency, and the Cookie Theft description). Recordings were processed with text transcription and digital signal processing techniques to capture neuropsychological variables and audio characteristics. A training sample of 103 subjects with similar demographics (age, sex, education) across groups was used to develop an algorithm to detect cognitive impairment. The features of each task and of speech were used separately to develop five algorithms, each computing a score between 0 and 1, and a final algorithm was constructed from these using repeated cross‐validation. Of the 45 remaining participants, an age‐balanced subset of 28 subjects was used to test the algorithm. General linear modeling was used to quantify demographic biases in the data and in the predicted, logit‐transformed algorithm scores.
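The logit transformation mentioned above maps a score between 0 and 1 onto the whole real line, which suits linear modeling better than bounded scores. The following is a minimal sketch of that transformation only, not the authors' implementation; the function names and the clipping constant `eps` are assumptions introduced for illustration.

```python
import math


def logit(score, eps=1e-6):
    """Map a score in [0, 1] to the real line.

    The eps clipping is an assumption of this sketch: it keeps scores of
    exactly 0 or 1 from producing an infinite value.
    """
    clipped = min(max(score, eps), 1.0 - eps)
    return math.log(clipped / (1.0 - clipped))


def inverse_logit(value):
    """Map a logit value back to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))
```

A score of 0.5 maps to a logit of 0, scores above 0.5 map to positive values, and scores below 0.5 map to negative values.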


Mean logit algorithm scores differed significantly across groups in both the test and training samples (p < 1e‐3). The sex bias in the test dataset (p_sex = 1.3e‐5) was reduced in the predicted logit scores (p_sex = 0.01); the other demographic variables were not significant across samples (p_age = 0.31→0.21; p_education = 0.69→0.58; p_multilingualism = 0.74→0.32). In the test/training samples, the final algorithm obtained an AUC of 0.908/0.956, an overall accuracy of 89%/89%, a sensitivity of 0.85/0.95, and a specificity of 0.92/0.83 in discriminating the CN and impaired groups.
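For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, accuracy, and AUC are conventionally computed from binary labels (1 = impaired, 0 = CN) and algorithm scores. It is an illustration under stated assumptions: the function names, the 0.5 decision threshold, and the eight toy values are hypothetical and are not data or settings from the study.

```python
def screening_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity and accuracy at a given threshold.

    Assumes both classes are present; the 0.5 threshold is hypothetical.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),  # impaired correctly flagged
        "specificity": tn / (tn + fp),  # CN correctly cleared
        "accuracy": (tp + tn) / len(labels),
    }


def auc(labels, scores):
    """AUC as the share of (impaired, CN) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy values for illustration only (not study data).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_metrics = screening_metrics(toy_labels, toy_scores)
toy_auc = auc(toy_labels, toy_scores)
```

Unlike sensitivity and specificity, the AUC does not depend on the choice of threshold, which is why it is usually reported alongside them.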


These findings provide initial support for the utility of this automated speech analysis algorithm as a screening tool for cognitive impairment that does not explicitly use memory tests. Further research is needed to validate the algorithm in other languages and cultures.
