How Does Naming Affect Language Models on Code Analysis Tasks?
Authors: Zhilong Wang, Lan Zhang, Chen Cao, Nanqing Luo, Xinzhi Luo, Peng Liu. Journal of Software Engineering and Applications, 2024, No. 11, pp. 803-816 (14 pages).
Abstract: Large language models (LLMs) such as GPT and BERT were originally proposed for natural language processing (NLP) and have shown promising results as general-purpose language models. An increasing number of industry professionals and researchers are adopting LLMs for program analysis tasks. However, one significant difference between programming languages and natural languages is that a programmer may assign arbitrary names to variables, methods, and functions, whereas a natural language writer cannot. Intuitively, the quality of naming in a program affects the performance of LLMs on program analysis tasks. This paper investigates how naming affects LLMs on code analysis tasks. Specifically, we create a set of datasets whose code contains nonsense or misleading names for variables, methods, and functions, respectively. We then use well-trained models (CodeBERT) to perform code analysis tasks on these datasets. The experimental results show that naming has a significant impact on the performance of LLM-based code analysis, indicating that code representation learning with LLMs relies heavily on well-defined names in code. Additionally, we conduct a case study on some special code analysis tasks using GPT, providing further insights.
Keywords: LLMs; CodeBERT; Code Analysis
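The core perturbation the abstract describes, replacing meaningful identifiers with nonsense names, can be sketched as follows. This is an illustrative reconstruction, not the authors' actual tooling: the `NonsenseRenamer` class and the `v0, v1, ...` naming scheme are our assumptions about how such a dataset could be generated for Python code.

```python
import ast

class NonsenseRenamer(ast.NodeTransformer):
    """Rewrite user-defined function, parameter, and variable names to
    meaningless placeholders (v0, v1, ...), leaving builtins untouched.
    A hypothetical sketch of the 'nonsense name' perturbation."""

    def __init__(self):
        self.mapping = {}  # original name -> nonsense name

    def _fresh(self, original):
        if original not in self.mapping:
            self.mapping[original] = f"v{len(self.mapping)}"
        return self.mapping[original]

    def visit_FunctionDef(self, node):
        node.name = self._fresh(node.name)
        self.generic_visit(node)  # recurse into args and body
        return node

    def visit_arg(self, node):
        node.arg = self._fresh(node.arg)
        return node

    def visit_Name(self, node):
        # Rename names being bound, and later uses of names we have
        # already renamed; leave unknown names (e.g. builtins) alone.
        if isinstance(node.ctx, ast.Store) or node.id in self.mapping:
            node.id = self._fresh(node.id)
        return node

src = (
    "def average(numbers):\n"
    "    total = sum(numbers)\n"
    "    return total / len(numbers)\n"
)
renamed = ast.unparse(NonsenseRenamer().visit(ast.parse(src)))
print(renamed)
```

The semantics of the program are preserved (calls to `sum` and `len` survive), while every token that carries human-readable intent, such as `average` or `total`, is destroyed; feeding both versions to a model like CodeBERT isolates how much its predictions depend on naming rather than on program structure.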