DOI: 10.1145/3821424 ISSN: 1049-331X
When Retrieval Augmentation Meets API Documentation: Can LLMs Code with Less-Common Libraries?
Jingyi Chen, Songqiang Chen, Jialun Cao, Jiasi Shen, Shing-Chi Cheung
Retrieval-augmented generation (RAG) has increasingly shown its power in extending large language models’ (LLMs’) capabilities beyond their pre-trained knowledge. Existing work has shown that RAG can help with software development tasks such as code generation and test generation. Yet the effectiveness of using RAG to adapt LLMs to unfamiliar, less-common, or fast-evolving library APIs remains unknown. To bridge this gap, we take an initial step toward studying this unexplored yet practical setting: when developers code with an unfamiliar library, they often refer to its API documentation; likewise, when LLMs are allowed to look up the API documentation of unfamiliar libraries via RAG, to what extent can their performance be improved?
To mimic such a setting, we select four less-common open-source Python libraries with a total of 1017 eligible APIs. We study the factors that affect the effectiveness of using the documentation of less-common libraries as additional knowledge for retrieval and generation. Our study yields interesting findings: (1) RAG improves LLMs’ performance by 83% to 220%. (2) Example code contributes the most to advancing LLMs, compared to the descriptive text and parameter lists in the API documentation. (3) LLMs can sometimes tolerate mild noise (typos in descriptions or incorrect parameters) by referencing their pre-trained knowledge or the document context. Based on these findings, we advocate that developers pay more attention to the quality and diversity of code examples in API documentation. The study sheds light on future low-code software development workflows with LLMs.