CertiCoder: Towards MISRA-Compliant C Code Generation with LLMs
Min Gou, Zhiyu Yao, Hualong Ma, Ende Zhang, Jian Zhou, Fei He
Large language models (LLMs) are increasingly applied to code generation in IDEs, CI pipelines, and automated workflows. Existing evaluations, however, have largely focused on functionality, with comparatively limited attention to compliance with established safety standards. This gap is particularly critical for C, where programs may compile and pass unit tests yet still violate MISRA C:2012, a widely adopted guideline in safety-critical domains. We present CertiCoder, a post-training framework with rule-aware optimization that transforms tool-verified outcomes into per-rule contrasts and trains models through three stages: rule tuning, cold-start supervised fine-tuning, and rule-aware preference optimization. This design helps models not only distinguish compliant from violating outputs but also associate violations with specific rules. To support reproducible assessment, we construct a Codeforces-derived C benchmark with frozen splits, multi-level decontamination, and metrics that jointly measure MISRA compliance (