Abstract
To address the reduced maintainability of software projects and the difficulty of understanding code semantics caused by sparse code comments, this paper proposes a method for automatically generating code comments based on a dual-encoder framework built on a neural machine translation (NMT) model. The framework first extracts different kinds of code feature information. A sequence encoder and a graph encoder then encode the respective code features, an attention mechanism adjusts the encoder output vectors, and the output vectors of the two encoders are combined. Finally, a decoder decodes the combined vector to obtain the comment sequence. To verify the effect of the dual-encoder model with attention, the paper builds an algorithm framework for automatic code comment generation. Experiments show that the dual-encoder model scores better than the sequence-encoder and tree-encoder models evaluated in the paper when generating code comments. A comparison of BLEU-1, ROUGE-L, and F1 scores verifies the effectiveness of the proposed algorithm.
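The pipeline summarized in the abstract (two encoders, attention over each encoder's outputs, then a combined vector handed to the decoder) can be illustrated with a minimal sketch. The abstract does not say how attention is scored or how the two outputs are combined, so everything here is an assumption for illustration only: dot-product attention, fusion by concatenation, and the hypothetical names `attend`, `fuse`, `seq_states`, `graph_states`, and `decoder_state`. The toy vectors stand in for real encoder outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, encoder_states):
    """Dot-product attention (an assumed scoring function).

    Returns the attention-weighted sum of the encoder states
    (the context vector) together with the attention weights.
    """
    weights = softmax([dot(query, h) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return context, weights


def fuse(seq_context, graph_context):
    """Combine the two context vectors.

    Assumption: simple concatenation; the abstract only says the
    two encoder outputs are processed together.
    """
    return seq_context + graph_context


# Toy stand-ins for the outputs of the sequence and graph encoders.
seq_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
graph_states = [[0.5, 0.5], [2.0, 0.0]]
# Toy stand-in for the current decoder hidden state (the query).
decoder_state = [1.0, 0.0]

seq_ctx, seq_w = attend(decoder_state, seq_states)
graph_ctx, graph_w = attend(decoder_state, graph_states)
fused = fuse(seq_ctx, graph_ctx)  # vector passed on to the decoder step
```

In a trained model the states would come from learned encoders and the fused vector would feed a recurrent or Transformer decoder; the sketch only shows how the data flows between the components.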
Authors
DONG Chuan-ke (董传珂)
ZHAO Feng-yu (赵逢禹)
LIU Ya (刘亚)
School of Optical-Electrical & Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journal
2022, No. 2, pp. 438-442 (5 pages)
Funding
Supported by the "13th Five-Year Plan" Cryptography Development Fund theory project (MMJJ20180202).