
An Isotopic-Replacement-Based Approach for Testing and Improving Code Generation Systems
Abstract Writing programs is the central activity of software development, and improving its efficiency is a long-standing concern of software engineering research. Deep-learning-based code generation is an important way to do so: borrowing from neural machine translation in natural language processing, these systems accept a natural language description as input and automatically generate the target program. Existing approaches, however, are highly sensitive to changes in the input sentence. Even a minor modification to the input can cause substantial and undesirable changes in the output, which makes the generated code harder for developers to understand and undermines confidence in the consistency and reliability of code generation systems in real-world applications.

To tackle this challenge, we propose COTE, a COntext-aware code generation TEsting and Repair approach. COTE combines mutation with metamorphic testing. It applies context-similar mutations to produce mutated sentences, which serve as test inputs for the code generation system under test. If a context-similar mutation changes the code generated for the non-mutated portion of the input by more than a predetermined threshold, COTE reports a bug. Once a bug is reported, COTE uses its black-box/grey-box repair capabilities to repair it automatically.

We evaluated COTE on CodeGPT, a commonly used code generation system. COTE detected bugs in approximately 39% of CodeGPT's inputs and automatically repaired 33% to 42% of the detected bugs.
Authors SUN Ze-Yu; ZHANG Jie; XIONG Ying-Fei; HAO Dan; ZHANG Lu (Zhongguancun Laboratory, Beijing 100871; Key Lab of HCST (PKU), MOE, SCS, Peking University, Beijing 100871; King's College London, London, UK)
Source Chinese Journal of Computers (计算机学报), 2023, No. 10, pp. 2025-2040 (16 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
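Funding Supported by the National Key Research and Development Program of China (Grant No. 2022YFB4501902) and the ZTE-Peking University Joint Laboratory for Foundational Software project.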
Keywords code generation; software testing; program repair; neural network; software engineering
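
The testing step described in the abstract (mutate the input with a context-similar replacement, regenerate, and flag a bug when the non-mutated part of the output changes beyond a threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the substitution table, the token-level distance based on `difflib.SequenceMatcher`, the `<MUT>` masking scheme, the threshold value 0.3, and the `toy_generate` stand-in for a real code generation system are all assumptions made for illustration. The repair step is not sketched because the abstract does not describe how it works.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple


def mutate(tokens: List[str],
           substitutes: Dict[str, List[str]]) -> List[Tuple[str, str, List[str]]]:
    """Return (old_word, new_word, mutated_tokens) for each single-word swap.

    `substitutes` is a hypothetical table of context-similar replacements.
    """
    mutants = []
    for i, tok in enumerate(tokens):
        for new in substitutes.get(tok, []):
            mutants.append((tok, new, tokens[:i] + [new] + tokens[i + 1:]))
    return mutants


def distance(a: List[str], b: List[str]) -> float:
    """Token-level dissimilarity in [0, 1]; 0 means identical sequences."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def find_bugs(sentence: str,
              generate: Callable[[str], str],
              substitutes: Dict[str, List[str]],
              threshold: float = 0.3) -> List[Tuple[str, float]]:
    """Metamorphic check: similar inputs should yield similar code
    once the mutated word itself is masked out."""
    tokens = sentence.split()
    base = generate(sentence).split()
    bugs = []
    for old, new, mutated in mutate(tokens, substitutes):
        mutated_sentence = " ".join(mutated)
        out = generate(mutated_sentence).split()
        # Mask the replaced word so only the non-mutated part is compared.
        masked_base = ["<MUT>" if t == old else t for t in base]
        masked_out = ["<MUT>" if t == new else t for t in out]
        d = distance(masked_base, masked_out)
        if d > threshold:
            bugs.append((mutated_sentence, d))
    return bugs


def toy_generate(description: str) -> str:
    """Hypothetical generator for inputs of the form 'add A to B'.

    It is deliberately unstable when A is 'y', to mimic the sensitivity
    the abstract describes.
    """
    words = description.split()
    first, second = words[1], words[3]
    if first == "y":
        return "while True : print ( y )"
    return f"return {first} + {second}"


if __name__ == "__main__":
    for mutated_sentence, d in find_bugs("add a to b", toy_generate,
                                         {"a": ["x", "y"]}):
        print(f"bug: {mutated_sentence!r} (distance {d:.2f})")
```

In this toy setup, replacing `a` with `x` leaves the masked output unchanged, so no bug is reported, while replacing `a` with `y` produces unrelated code and is flagged. A real setup would replace `toy_generate` with calls to the system under test and would need a principled way to choose substitutions and the threshold.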