Multi-Agent LLMs for Occupational Profiling: Psychometric Validation on 1636 Chinese Occupations


DOI: 10.3390/bs16071064 ISSN: 2076-328X


Yuting Han, Xiaoyang Luo, Feng Ji, Xiang Kong

Occupation-level psychological profiles, such as RIASEC interests and Big Five personality, underpin career counseling, person–job matching, and workforce research, but building them at scale has been expensive and limited to a few national taxonomies. The O*NET Interest Profiler, the largest operationalization of RIASEC, took more than two decades of worker surveys, expert ratings, and iterative empirical calibration to construct, and the 2022 Chinese Occupational Classification has no comparable psychological database. Large language models (LLMs) offer a scalable alternative, but using them as raters raises issues that single-model designs do not resolve: inter-rater reliability, calibration to external benchmarks, and systematic psychometric validation. We propose a multi-agent LLM framework in which three LLMs serve as separate expert raters, in-context anchors align the rating scale, and a separate arbitrator resolves rater disagreements. We applied the framework to all 1636 occupations in the 2022 Chinese Occupational Classification, producing six RIASEC and five Big Five scores per occupation. RIASEC dimensions showed uniformly excellent reliability (intraclass correlation coefficient, ICC(2,1) = 0.87 to 0.98) and high convergent correlations with O*NET (r = 0.84 to 0.96); structural validity received weak support (Tracey’s C = 0.653, ns), though the dimensions differentiated occupational categories as theory predicts, and the profile space recovered the administrative taxonomy (adjusted Rand index, ARI = 0.418). Big Five absolute agreement was uniformly high, although ICC(2,1) values for Conscientiousness and Neuroticism were attenuated by variance compression and model-level calibration offsets rather than rater disagreement. The Big Five scores are, therefore, suited to broad occupational differentiation, particularly on Openness and Extraversion, rather than to fine-grained rank ordering on Conscientiousness or Neuroticism. The framework also yields the first occupation-level RIASEC and Big Five database for the 2022 Chinese Occupational Classification, openly available for applied use.
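
The attenuation of ICC(2,1) under variance compression can be illustrated with a minimal sketch. It uses synthetic ratings, not the study's data, and the function name `icc_2_1`, the scale center, the noise level, and the rater offsets are all assumptions chosen for illustration. The formula is the standard two-way random-effects, absolute-agreement, single-rater ICC; the sketch is not the authors' analysis code.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_targets, k_raters) array, e.g. occupations x LLM raters.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for targets (rows), raters (columns), and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-target mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Synthetic illustration (all values hypothetical).
rng = np.random.default_rng(0)
n_occupations, n_raters = 200, 3
noise = rng.normal(0.0, 0.3, size=(n_occupations, n_raters))  # same rater noise in both cases
offsets = np.array([-0.2, 0.0, 0.2])                          # model-level calibration offsets

true_wide = rng.normal(4.0, 1.5, size=n_occupations)    # occupations differ a lot
true_narrow = rng.normal(4.0, 0.3, size=n_occupations)  # compressed between-occupation variance

wide = true_wide[:, None] + offsets + noise
narrow = true_narrow[:, None] + offsets + noise

icc_wide = icc_2_1(wide)
icc_narrow = icc_2_1(narrow)
print(f"ICC(2,1), wide spread:       {icc_wide:.2f}")
print(f"ICC(2,1), compressed spread: {icc_narrow:.2f}")
```

In both synthetic cases the raters disagree by exactly the same amounts (identical noise and offsets), yet the compressed case yields a lower ICC(2,1) because the coefficient is a ratio of between-target variance to total variance. This is the sense in which a low ICC can coexist with high absolute agreement.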
