DOI: 10.3390/app16136617 ISSN: 2076-3417

LLMs in Automated Assessment: A Role-Based Taxonomy and Framework for Controlled Educational Integration

Anastasia Vangelova, Veska Gancheva

Large language models (LLMs) are reshaping the automated assessment of open-ended student responses. Compared with earlier rule-based, statistical, and feature-engineered approaches, they enable a deeper interpretation of meaning, context, and argumentation. This development can be understood as a fifth generation of automated scoring systems, but it also raises a new question: not only what LLMs can do, but also how they can be deployed in education in a controlled and reliable manner. This paper presents a role-based taxonomy that distinguishes between generative LLMs used as direct virtual graders, encoder transformers used as semantic tools, and intermediate text-to-text models used in more formalized assessment tasks. It also discusses the main limitations of standalone LLM graders, including hallucinations, probabilistic instability, limited interpretability, bias, and weak grounding in domain-specific content. To address these issues, the paper presents a framework, implemented in an integrated assessment system, that combines role prompting, rubric-constrained grading, Retrieval-Augmented Generation (RAG), structured machine-readable outputs, workflow orchestration, and learning management system (LMS) integration. The framework is further extended to multimodal assessment through vision-based evaluation of visual artifacts such as UML state diagrams. The main contribution of the paper is therefore not only a conceptual framework but also its realization in a working integrated system that makes automated assessment more traceable, pedagogically grounded, and institutionally reliable.
