Unearthing implicit biases: a scenario-based evaluation of open-source LLMs

DOI: 10.1108/ajim-06-2025-0389 ISSN: 2050-3806

Yuan Yao, Liu Hao, He Rong, Wu Dan, Xueqin Zhao

Purpose

As large language models (LLMs) are increasingly applied in information and data management – serving as tools for information retrieval, digital libraries, and knowledge organization – addressing their embedded biases is crucial for maintaining fairness and transparency. This paper introduces a novel scenario-based test to quantitatively evaluate the level of implicit bias in LLMs, specifically in the context of information management tasks.

Design/methodology/approach

We employed a novel scenario-based test, designed to simulate real-world information tasks, to investigate biases across 10 recently released open-source LLMs, including LLaMA 3.2 and Qwen 2.5. Our evaluation spanned 21 diverse bias domains (e.g. race, gender, religion, health) and was conducted in both English and Chinese contexts. We also performed an extensive comparative study against the Implicit Association Test (IAT) to assess the suitability of our approach for this domain.

Findings

Implicit biases are pervasive across most models, with the potential to affect information processing and retrieval. In English, biases related to sexuality and skin tone are most prominent, while in Chinese, mental illness is the domain with the strongest bias, demonstrating that language has a critical impact on bias. Our study shows that the scenario-based approach is better suited than the Implicit Association Test for capturing these context-dependent biases, making it more relevant for real-world applications in information and data management.

Originality/value

This work presents a novel, quantitative scenario-based test specifically for evaluating LLM implicit bias in information-management contexts. Extensively testing 10 LLMs across 21 domains and two languages provides concrete evidence of pervasive, language-dependent biases and offers a more context-aware assessment than previous methods, highlighting the urgent need for robust evaluation and mitigation strategies in this field.
