DOI: 10.1002/alz.082548 ISSN: 1552-5260

Automated analysis of letter fluency data produced by Korean and American patients with AD and MCI

Jin‐Seo Kim, Seohee Kim, David J. Irwin, Naomi Nevler, Min Seok Baek, Sunghye Cho
  • Psychiatry and Mental health
  • Cellular and Molecular Neuroscience
  • Geriatrics and Gerontology
  • Neurology (clinical)
  • Developmental Neuroscience
  • Health Policy
  • Epidemiology



The letter fluency task measures executive functioning and working memory in clinical settings. Previous studies have shown that both the total number of correct responses and the lexical characteristics of the produced words help explain patients’ performance. Yet many previous studies analyzed fluency data in a single language using manual assessments, which limits their crosslinguistic validity and feasibility. We developed automated pipelines for digitized letter fluency data in Korean and English to compare patients with Alzheimer’s disease (AD) or mild cognitive impairment (MCI) and healthy controls (HC) across the two languages.


We analyzed letter fluency data (English: ‘F’; Korean: ‘G’) produced by 47 Koreans (mean age = 74.1 ± 8 years, 25 females) and 40 Americans (mean age = 63.8 ± 8.6 years, 16 females) with AD (11 Koreans, 20 Americans) or MCI (16 Koreans, 7 Americans) and HCs (20 Koreans, 13 Americans), using our automated pipeline for each language. The pipelines counted the number of correct responses, rated words for frequency and length, and measured phonetic and semantic distances and pause duration between two consecutive correct responses. Linear regression models compared the groups on scores and language variables: each model took the score or one speech variable as the dependent variable and diagnostic group as the independent variable. Education level, age, and language were included as covariates in all models. The same models were then built separately for each language, without language as a covariate.
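The per-participant measures described above could be computed along the following lines. This is only a minimal sketch, not the authors' pipeline: the `Response` record, the `is_correct` scoring rule (starts with the target letter, not a repetition), and the use of Levenshtein edit distance on spellings as a stand-in for phonetic distance are all assumptions made for illustration. Word frequency and semantic distance are left out because they need external resources (frequency norms, word embeddings) that the abstract does not name.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """Hypothetical record for one transcribed, time-aligned response."""
    word: str
    onset: float   # start time in seconds
    offset: float  # end time in seconds


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; an assumed proxy for phonetic distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def is_correct(word: str, letter: str, seen: set) -> bool:
    """Assumed scoring rule: starts with the target letter and is not a repeat."""
    w = word.lower()
    return w.startswith(letter) and w not in seen


def summarize(responses, letter="f"):
    """Return the score and mean length, pause, and distance for one participant."""
    seen = set()
    correct = []
    for r in responses:
        if is_correct(r.word, letter, seen):
            correct.append(r)
        seen.add(r.word.lower())
    # Pairs of consecutive correct responses, as in the measures described above.
    pairs = list(zip(correct, correct[1:]))
    return {
        "score": len(correct),
        "mean_length": mean(len(r.word) for r in correct) if correct else None,
        "mean_pause": mean(b.onset - a.offset for a, b in pairs) if pairs else None,
        "mean_distance": (
            mean(edit_distance(a.word.lower(), b.word.lower()) for a, b in pairs)
            if pairs else None
        ),
    }
```

Each participant's summary would then supply the dependent variables for the regression models, with diagnostic group as the predictor and education, age, and language as covariates.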


AD patients produced fewer correct responses, with higher word frequency, than HC (score: β = -4, p = 0.001; frequency: β = 0.63, p = 0.001) and MCI patients (score: β = -2.9, p = 0.017; frequency: β = 0.55, p = 0.008). AD patients also produced longer pauses between words than both HC (β = 4.99, p = 0.002) and MCI patients (β = 3.6, p = 0.039), and their words were less phonetically similar to the preceding word than those of HC (phonetic similarity: β = 11.33, p = 0.011). The results remained similar when separate models were built for each language.


We demonstrated the feasibility of automated analyses of letter fluency data in different languages and validated that word frequency, pause duration and phonetic similarity were important speech characteristics in both languages. These pipelines can be adopted in other languages, which will enable large‐scale, automated crosslinguistic analyses of letter fluency data in clinical settings.
