
Source Code Migration Model Based on Code-statement Masked Attention Mechanism (cited by: 1)
Abstract: Source code migration techniques convert source code from one programming language to another, which reduces the burden on developers when migrating software projects. Existing studies mainly use neural machine translation (NMT) models to convert source code into target code. However, these studies ignore the structural features of code, which results in poor migration performance. To address this, this study proposes CSMAT (code-statement masked attention Transformer), a source code migration model based on a code-statement masked attention mechanism. The model uses the Transformer's masked attention mechanism in two places. During encoding, it guides the model to understand the syntax and semantics of source code statements and the contextual features between statements. During decoding, it guides the model to focus on and align with the source code statements. Together, these improve source code migration performance. Empirical studies are conducted on CodeTrans, a dataset drawn from real projects, and model performance is evaluated with four metrics. The experimental results validate the effectiveness of CSMAT and the applicability of the code-statement masked attention mechanism to pretrained models.
Authors: XU Ming-Rui; LI Zheng; LIU Yong; WU Yong-Hao (College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China)
Source: Computer Systems & Applications, 2023, No. 9, pp. 77-88 (12 pages)
Funding: National Natural Science Foundation of China (61902015, 61872026).
Keywords: code statement; mask; code migration; machine translation; attention mechanism
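
The abstract describes restricting attention according to code statements but gives no construction details. The following is a minimal sketch of one way a statement-level attention mask could be built, not the CSMAT implementation. The function names `statement_ids` and `statement_mask`, the choice of `;`, `{` and `}` as statement terminators, and the rule "a token may attend only to tokens in its own statement" are all assumptions made for illustration.

```python
from typing import List, Sequence


def statement_ids(tokens: List[str],
                  terminators: Sequence[str] = (";", "{", "}")) -> List[int]:
    """Assign each token the index of the statement it belongs to.

    Assumption: a statement ends at any token listed in `terminators`.
    """
    ids: List[int] = []
    current = 0
    for tok in tokens:
        ids.append(current)
        if tok in terminators:
            current += 1  # the next token starts a new statement
    return ids


def statement_mask(ids: List[int]) -> List[List[bool]]:
    """Build a square mask where mask[i][j] is True when token i may
    attend to token j, i.e. when both tokens share a statement index."""
    return [[a == b for b in ids] for a in ids]


# Hypothetical tokenized input containing two statements.
tokens = ["int", "a", "=", "1", ";", "a", "++", ";"]
ids = statement_ids(tokens)
mask = statement_mask(ids)
```

In a Transformer, a mask of this shape would typically be applied to the attention scores before the softmax, so that disallowed positions receive no weight. A mask that also admits inter-statement context, as the abstract mentions, would need additional True entries that this sketch does not model.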