Influence-Aware Bayesian-Inspired Token Reweighting for Improved Code Generation
Yuqi Zhu, Ge Li, Hong Mei, Zhi Jin, Jia Li, Qibin Zheng, Jieyuan Zhang

Large language models (LLMs) have achieved remarkable progress in code generation, yet the structural properties of programming languages introduce distinctive challenges. In particular, program correctness is disproportionately influenced by a subset of structurally critical tokens, such as API names, variable identifiers, and control-flow keywords, which we term influential tokens. Errors in predicting these tokens often propagate and accumulate through subsequent decoding steps, leading to substantial degradation in overall correctness. Addressing the heterogeneous difficulty of predicting such tokens is therefore crucial for improving the reliability of code generation. To this end, we introduce Influence-Aware Bayesian Code Generation (I-BAYGEN), a framework that explicitly handles influential tokens. The framework consists of two components. First, it identifies influential tokens using a loss-based detection mechanism and measures each token's degree of influence in three ways. Second, to handle influential tokens, it introduces auxiliary reasoning paths as additional evidence to refine the token distribution during code generation in a Bayesian-inspired manner. To capture structural dependencies, we incorporate influence scores as adaptive weights in the self-rewarding mechanism, encouraging greater optimization emphasis on structurally critical tokens. With this influence-aware reweighting mechanism, the framework treats tokens differently according to their prediction difficulty: influential tokens receive enhanced attention through a reward weighting scheme and deeper reasoning processes. Comprehensive experiments on competition-level programming benchmarks demonstrate that I-BAYGEN achieves up to 47.2% relative improvement in correctness over the state-of-the-art non-weighted approach.
We further show that I-BAYGEN generalizes robustly across multiple programming languages and out-of-distribution scenarios, highlighting its potential for real-world code generation tasks. Moreover, qualitative analysis reveals that the framework produces reasoning paths that are more interpretable and logically coherent than those of the non-weighted method, effectively addressing heterogeneous token difficulty in code generation.
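
To make the idea of loss-based detection and influence-weighted rewards concrete, the following is a minimal Python sketch and not the implementation used in I-BAYGEN. It assumes that a token's influence can be approximated by its negative log-likelihood under the model, that tokens above a fixed loss threshold are flagged as influential, and that weights are simply losses normalized to sum to one. The function names, the threshold, and the per-token rewards are hypothetical; the abstract states that influence is measured in three ways, none of which is specified here.

```python
import math


def token_losses(token_probs):
    """Per-token negative log-likelihood, given the model's probability
    for each reference token (assumed to be in (0, 1])."""
    return [-math.log(p) for p in token_probs]


def influential_mask(losses, threshold=1.0):
    """Flag tokens whose loss exceeds a threshold (hypothetical criterion)."""
    return [loss > threshold for loss in losses]


def influence_weights(losses):
    """Normalize losses into weights that sum to 1 (an assumed scoring rule).
    Falls back to uniform weights when every loss is zero."""
    total = sum(losses)
    if total == 0:
        return [1.0 / len(losses)] * len(losses)
    return [loss / total for loss in losses]


def weighted_reward(per_token_rewards, weights):
    """Aggregate per-token rewards so that high-influence tokens dominate."""
    return sum(r * w for r, w in zip(per_token_rewards, weights))


# Toy example: the middle token (e.g. an API name) is predicted poorly.
probs = [0.9, 0.1, 0.8]
losses = token_losses(probs)
mask = influential_mask(losses, threshold=1.0)
weights = influence_weights(losses)

# Hypothetical per-token rewards: 1 if the token was correct, 0 otherwise.
rewards = [1.0, 0.0, 1.0]
uniform = sum(rewards) / len(rewards)
weighted = weighted_reward(rewards, weights)
```

In this toy case the error on the hard token lowers the weighted reward much more than it lowers a uniform average, which is the qualitative behavior the reweighting scheme is meant to encourage.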