On the Road to Personalized Code Intelligence: Portraiting and Assisting Developers Based on Their In-IDE Behaviors
Yuhong Liu, Yunhe Su, Zhipeng Peng, Zhiwen Luo, Lin Shi, Zhi Jin, Li Zhang

With the advent of powerful large language models (LLMs), research in automated software engineering has increasingly focused on leveraging these models to achieve a deeper semantic understanding of code or to engineer sophisticated agent-based processes. The predominant goal of these efforts is to enhance developer productivity through automated assistance. However, this research trajectory has largely overlooked a critical factor: the developers themselves. Programming is a deeply human and individualized activity; developers exhibit significant variation in their coding styles, tool-chain preferences, domain-specific expertise, and problem-solving strategies. Consequently, the current paradigm of one-size-fits-all code intelligence systems struggles to accommodate the unique characteristics and needs of individual developers. To address this gap, we introduce VirtualME, a novel IDE-embedded data infrastructure designed to model the developer by continuously capturing and interpreting their dynamic programming behaviors and preferences. VirtualME contains three components. (1) Log-level Behavior Extraction: it captures and extracts developers' log-level behaviors (edits, navigations, etc.) from the IDE. (2) Task-level Behavior Recognition: it aggregates log-level behaviors into task-level behaviors (“skimming API docs”, “iterative debugging”, etc.) via a multi-agent pipeline. (3) Developer-persona Measurement: it builds a rule engine to distill a four-dimensional developer persona: Core Technical Foundation, Practical Development Efficiency, Personal Development Norms, and Technical Adaptability. On top of VirtualME, we propose a solution for personalized repository-level knowledge Q&A by integrating the developer persona into a Chain-of-Thought (CoT) guided agent.
We evaluated VirtualME by building a multi-repository benchmark with real-world developer trajectories, balancing correctness and personalization. Experimental results show that VirtualME-enhanced answers outperform generic baselines on five dimensions: correctness, cognitive-level fit, technology-stack relevance, behavioral-pattern alignment, and stylistic preference, with an average improvement of 33.80%. Our results demonstrate that abundant, continuous developer-behavior data can unlock Personalized Code Intelligence. By integrating this personalized understanding into the code intelligence loop, our approach paves the way for adaptive and personalized code intelligence.
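
To make the data flow of the three components concrete, the following is a minimal Python sketch and not VirtualME's implementation. Every name in it (`LogEvent`, `TASK_LABELS`, `recognize_tasks`, `measure_persona`, `build_prompt`) is hypothetical, as are the 60-second session gap and the single persona rule. In particular, task recognition is approximated here by a dominant-event heuristic standing in for the multi-agent pipeline described above, and the rule shown is a toy example rather than one taken from the actual rule engine.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LogEvent:
    """One log-level IDE behavior (hypothetical schema)."""
    timestamp: float  # seconds since the start of the recording
    kind: str         # e.g. "edit", "navigate", "debug_step", "open_doc"
    target: str       # file or symbol the event refers to


# Hypothetical mapping from the dominant event kind to a task-level label.
TASK_LABELS = {
    "open_doc": "skimming API docs",
    "debug_step": "iterative debugging",
    "edit": "editing code",
    "navigate": "navigating code",
}

# The four persona dimensions named in the abstract.
PERSONA_DIMENSIONS = (
    "Core Technical Foundation",
    "Practical Development Efficiency",
    "Personal Development Norms",
    "Technical Adaptability",
)


def split_sessions(events: List[LogEvent], max_gap: float = 60.0) -> List[List[LogEvent]]:
    """Group time-ordered events; a pause longer than max_gap starts a new group."""
    sessions: List[List[LogEvent]] = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if sessions and event.timestamp - sessions[-1][-1].timestamp <= max_gap:
            sessions[-1].append(event)
        else:
            sessions.append([event])
    return sessions


def recognize_tasks(events: List[LogEvent], max_gap: float = 60.0) -> List[str]:
    """Label each session by its most frequent event kind (heuristic stand-in)."""
    tasks = []
    for session in split_sessions(events, max_gap):
        dominant_kind, _ = Counter(e.kind for e in session).most_common(1)[0]
        tasks.append(TASK_LABELS.get(dominant_kind, "other"))
    return tasks


def measure_persona(tasks: List[str]) -> Dict[str, str]:
    """Apply one toy rule; dimensions without a rule stay 'unknown'."""
    counts = Counter(tasks)
    persona = {dimension: "unknown" for dimension in PERSONA_DIMENSIONS}
    if counts["skimming API docs"] > counts["iterative debugging"]:
        persona["Technical Adaptability"] = "learns new APIs by reading docs first"
    elif counts["iterative debugging"] > 0:
        persona["Technical Adaptability"] = "learns by experimenting in the debugger"
    return persona


def build_prompt(question: str, persona: Dict[str, str]) -> str:
    """Prepend the persona to a question with a step-by-step instruction."""
    lines = ["Developer persona:"]
    lines += [f"- {dimension}: {value}" for dimension, value in persona.items()]
    lines += [
        "",
        f"Question: {question}",
        "Think step by step, then tailor the answer to the persona above.",
    ]
    return "\n".join(lines)
```

Calling `recognize_tasks` on a list of `LogEvent` objects, passing the result to `measure_persona`, and handing that persona to `build_prompt` yields a prompt string that a CoT-guided agent could consume. A real system would replace the heuristic and the toy rule with the components the paper describes.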