DOI: 10.1177/00131644261455414 ISSN: 0013-1644

A Simple Approach for Differential Test Functioning Based on Sum Scores

Yutaro Sakamoto, Ryuichi Kumagai

Differential test functioning (DTF) evaluates whether a test exhibits group differences beyond those attributable to latent trait differences. Its magnitude is often most interpretable on the raw-score scale. However, most existing DTF effect size measures rely on item response theory (IRT) modeling. Building on the renewed practical value of sum scores, this study proposes a simple observed-score-based approach to estimate the DTF magnitude directly in raw-score points. The proposed Index S stratifies examinees by an anchor-based sum score composed of items not flagged for differential item functioning (DIF), summarizes the within-stratum mean differences in total test scores, and aggregates these conditional differences using weights based on the observed distribution of the matching variable. To stabilize the estimation when score strata are sparse, adjacent strata are merged to satisfy a minimum per-group sample size requirement. Index S_std provides a standardized version. We evaluated the indices in two-parameter logistic (2PL) simulation studies that varied the sample size, test length, DIF type, DIF proportion, DIF direction, and focal group latent trait distribution. Utility was assessed in terms of estimation accuracy, including bias and root mean square error (RMSE), and correlation with established IRT-based DTF indices. An empirical application to TIMSS 2023 Grade 8 mathematics (Japan vs. the United States) illustrates how the proposed indices provide an accessible raw-score interpretation of the DTF magnitude for both psychometric and non-psychometric stakeholders.
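
The stratify, merge, and aggregate steps described for Index S can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `index_s`, the `min_n` parameter, the greedy left-to-right merging rule, the focal-minus-reference sign convention, and the use of pooled stratum counts as weights are all assumptions made for illustration; the abstract does not specify these details. The standardized Index S_std is not attempted here because its scaling is not described.

```python
import numpy as np


def index_s(total, anchor, group, min_n=5):
    """Illustrative anchor-stratified DTF index in raw-score points.

    total  : total test score per examinee
    anchor : sum score on anchor (non-DIF) items, the matching variable
    group  : 0 = reference group, 1 = focal group
    min_n  : assumed minimum number of examinees per group in a stratum
    """
    total = np.asarray(total, dtype=float)
    anchor = np.asarray(anchor)
    group = np.asarray(group)

    def group_counts(levels):
        # Number of reference and focal examinees at these anchor scores.
        mask = np.isin(anchor, levels)
        return int(np.sum(mask & (group == 0))), int(np.sum(mask & (group == 1)))

    # Walk the observed anchor scores in ascending order and merge adjacent
    # scores until both groups reach min_n (assumed greedy rule).
    merged, current = [], []
    for score in np.unique(anchor):
        current.append(score)
        if min(group_counts(current)) >= min_n:
            merged.append(current)
            current = []
    if current:
        # Sparse leftover at the top of the scale: fold into the last stratum.
        if merged:
            merged[-1] = merged[-1] + current
        else:
            merged.append(current)

    diffs, weights = [], []
    for levels in merged:
        mask = np.isin(anchor, levels)
        ref = total[mask & (group == 0)]
        foc = total[mask & (group == 1)]
        if len(ref) == 0 or len(foc) == 0:
            continue  # no within-stratum comparison is possible
        diffs.append(foc.mean() - ref.mean())  # assumed sign: focal - reference
        weights.append(int(mask.sum()))        # assumed weight: pooled count

    if not diffs:
        return float("nan")
    return float(np.average(diffs, weights=weights))


# Toy data: the focal group scores 2 raw points higher at every anchor score.
toy_anchor = np.array([0, 0, 0, 0, 1, 1, 1, 1])
toy_group = np.array([0, 0, 1, 1, 0, 0, 1, 1])
toy_total = toy_anchor + 2 * toy_group
toy_result = index_s(toy_total, toy_anchor, toy_group, min_n=2)
```

In this toy data set, every stratum shows a focal-minus-reference difference of 2 raw-score points, so the weighted aggregate is also 2. Merging sparse strata makes the matching coarser, which is the trade-off that a minimum per-group sample size requirement accepts in exchange for stability.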
