DOI: 10.1093/pnasnexus/pgad485 ISSN: 2752-6542

America’s racial framework of superiority and Americanness embedded in natural language

Messi H J Lee, Jacob M Montgomery, Calvin K Lai


America’s racial framework can be summarized using two distinct dimensions: superiority/inferiority and Americanness/foreignness (Zou & Cheryan, 2017). We investigated this framework in a corpus of spoken and written language using word embeddings. Word embeddings place words in a low-dimensional space where words with similar meanings are proximate, allowing researchers to test whether the positions of group and attribute words in the semantic space reflect stereotypes. We trained a word embedding model on the Corpus of Contemporary American English, a corpus of one billion words spanning thirty years and eight text categories, and compared the positions of racial/ethnic groups with respect to superiority and Americanness. The positions of these groups were consistent with America’s racial framework. We also captured an additional nuance: Asian people were stereotyped as more American than Hispanic people. These results are empirical evidence that America’s racial framework is embedded in American English.
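The comparison of group and attribute word positions can be illustrated with a minimal sketch. It assumes a common approach, the difference in cosine similarity between a group vector and the mean vectors of two opposing attribute word sets. This is not necessarily the measure used in the study. All names (`cosine`, `mean_vector`, `association`) and the three-dimensional toy vectors are hypothetical placeholders, not values from a trained model or from the corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def association(group_vec, pole_a, pole_b):
    """Positive when the group vector lies closer to pole A than to pole B."""
    return cosine(group_vec, mean_vector(pole_a)) - cosine(group_vec, mean_vector(pole_b))

# Hypothetical toy vectors standing in for trained embeddings (assumption).
toy = {
    "group_x": [0.9, 0.1, 0.2],
    "group_y": [0.1, 0.9, 0.2],
    "attr_a1": [1.0, 0.0, 0.1],
    "attr_a2": [0.8, 0.2, 0.0],
    "attr_b1": [0.0, 1.0, 0.1],
    "attr_b2": [0.2, 0.8, 0.0],
}

pole_a = [toy["attr_a1"], toy["attr_a2"]]  # one end of a dimension
pole_b = [toy["attr_b1"], toy["attr_b2"]]  # the opposite end

score_x = association(toy["group_x"], pole_a, pole_b)
score_y = association(toy["group_y"], pole_a, pole_b)

print(f"group_x: {score_x:+.3f}")
print(f"group_y: {score_y:+.3f}")
```

In this toy setup, a group word with a higher score sits closer to pole A of the dimension. Comparing scores across group words on each dimension is one way to express relative positions in the semantic space.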

