Understanding LLM-Centric Challenges for Deep Learning Frameworks: An Empirical Analysis
Yanzhou Mu, Rong Wang, Juan Zhai, Chunrong Fang, Xiang Chen, Jiacong Wu, An Guo, Jiawei Shen, Bingzhuo Li, Zhenyu Chen
Large language models (LLMs) have achieved remarkable success across diverse applications, but their development and deployment depend heavily on deep learning (DL) frameworks. The massive scale, long execution cycles, and complex workflows of LLMs place stringent demands on framework usability, functionality, and reliability. When these demands are not well supported, developers may suffer from reduced efficiency, severe failures, and substantial resource waste. However, a fundamental question remains underexplored: what challenges do DL frameworks face in supporting LLMs? To answer this question, we analyze issue reports from three major DL frameworks (i.e., MindSpore, PyTorch, and TensorFlow) and eight associated LLM toolkits such as DeepSpeed. Based on a manual review of these reports, we construct a taxonomy that captures LLM-centric framework bugs, user requirements, and user questions. We then refine and enrich this taxonomy through interviews with 11 LLM users and eight DL framework developers. Using the taxonomy and interview findings, our study further reveals key technical challenges and mismatches between LLM user needs and developer priorities. Overall, our contributions are threefold: (1) we develop a comprehensive taxonomy comprising five question themes (11 sub-themes), five requirement themes (17 sub-themes), and ten bug themes (47 sub-themes); (2) we assess the importance and priority of different categories in the taxonomy based on insights from LLM users and framework developers; and (3) we identify five key findings across the LLM development lifecycle and propose five actionable recommendations to improve the quality of DL frameworks in supporting LLMs.
Our results highlight critical limitations in current DL frameworks and offer concrete guidance for advancing their support for future LLM development, deployment, and applications.