DOI: 10.3390/a19060501 ISSN: 1999-4893

Leveraging Label-Attention Networks and POS Tagging for Generating Chinese Cloze Questions

Yanyang Hou, Shufeng Xiong, Yang Li

Chinese cloze question generation for educational assessments requires identifying gap phrases that accurately reflect key knowledge points, posing significant challenges to automated systems. We observe that the syntactic boundaries revealed by part-of-speech (POS) tags closely align with the semantic boundaries of target gap phrases. Motivated by this observation, we propose a multi-task learning framework in which gap phrase identification serves as the primary task and POS tagging as a complementary auxiliary task. The two tasks share a common BERT-BiLSTM encoder, enabling mutual reinforcement of syntactic and semantic representations through joint training. To further capture the interaction between label semantics and contextual word representations, we introduce a label-attention mechanism that models dependencies between the global word sequence and candidate label embeddings. Additionally, we construct a refined POS tag subset by excluding categories whose boundaries show no alignment with gap phrase boundaries, thereby strengthening the correspondence between the two tasks. Evaluated on a real-world dataset of 20.5K questions spanning five academic disciplines, our method achieves an F1 score of 65.85% and a recall of 67.79%, representing improvements of 2.12% and 4.35%, respectively, over the prior state-of-the-art. These results demonstrate that exploiting the alignment between syntactic and semantic structures through joint learning is effective for generating educationally meaningful fill-in-the-blank questions.
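The label-attention idea mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a plain scaled dot-product attention in which each contextual token vector attends over a small table of label embeddings. The function names, the toy B/I/O label set, and the two-dimensional vectors are all hypothetical and chosen only for illustration; in the described framework the token vectors would come from the shared BERT-BiLSTM encoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_vecs, label_embs):
    """Attend from each token vector over all label embeddings.

    Assumption: scaled dot-product scoring. Returns a pair
    (weights, context) where weights[i][k] is the attention of
    token i on label k, and context[i] is the attention-weighted
    sum of label embeddings for token i.
    """
    dim = len(label_embs[0])
    scale = math.sqrt(dim)
    all_weights, all_context = [], []
    for h in token_vecs:
        # Similarity between this token and every label embedding.
        scores = [sum(a * b for a, b in zip(h, e)) / scale for e in label_embs]
        w = softmax(scores)
        # Label-aware summary vector for this token.
        ctx = [
            sum(w[k] * label_embs[k][d] for k in range(len(label_embs)))
            for d in range(dim)
        ]
        all_weights.append(w)
        all_context.append(ctx)
    return all_weights, all_context


# Hypothetical toy inputs: three labels and three token vectors.
labels = ["B", "I", "O"]
label_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
token_vecs = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]

weights, context = label_attention(token_vecs, label_embs)
for vec, w in zip(token_vecs, weights):
    best = labels[w.index(max(w))]
    print(vec, "-> most attended label:", best)
```

In a joint model of the kind described, the per-token context vectors would typically be combined with the encoder output before the tagging layer, and a second label table could serve the auxiliary POS task; both of those choices are assumptions here rather than details stated in the abstract.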
