Chinese Science and Technology Core Journal

Chinese Core Journal

CSCD Source Journal

空间控制技术与应用 (Aerospace Control and Application) ›› 2021, Vol. 47 ›› Issue (2): 42-48. doi: 10.3969/j.issn.1674-1579.2021.02.006

• Papers and Reports •

Automatic Generation of Source Code Comments Based on a Neural Network Fusion Model

  

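  1. Journalism and Communication Experiment Center, Renmin University of China (中国人民大学新闻传播实验中心)
  • Publication date: 2021-04-10  Release date: 2021-04-19
  • Funding:
    National Natural Science Foundation of China (62072017)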

A Neural Network Fusion Model for Source Code Comments Generation

  • Online: 2021-04-10  Published: 2021-04-19

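Abstract (Chinese version): Comments effectively improve the readability of source code and help developers understand what software does, so they play a key role in software maintenance and evolution. Current research on automatic source code comment generation has two limitations: it does not mine lexical information in depth, and it does not fuse lexical and syntactic information well. We therefore propose a method for automatic source code comment generation based on a neural network fusion model. The method uses an encoder-decoder neural network framework to build a deep representation of the lexical information in source code, combines it with syntactic information mined from the syntax tree, and applies a fusion mechanism to form a more comprehensive functional-semantic encoding vector for comment generation. In experiments on a public dataset, the method outperforms the compared models on evaluation metrics including BLEU4 and METEOR, which supports its effectiveness.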

Keywords (Chinese version): source code comments, abstract syntax tree, encoder-decoder, fusion model

Abstract: Comments help developers understand source code and play an important role in software maintenance and evolution. Existing work shows that source code comments are often missing in real-world projects. Current studies on automatic source code comment generation have two limitations. First, they use only simple lexical information; second, they do not combine lexical and syntactic information well. In this work, we propose a neural network fusion model for source code comment generation based on the encoder-decoder framework. Our model embeds lexical information more effectively, represents syntactic information based on the abstract syntax tree, and then builds a fusion encoder that learns from both the lexical and the syntactic information to generate comments. Experiments on a public benchmark indicate that our fusion model outperforms previous models on metrics such as BLEU4 and METEOR.

Key words: source code comments, abstract syntax tree, encoder-decoder, fusion model
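
The abstract names two views of a program (lexical tokens and an abstract syntax tree) and a fusion step that merges their encodings. The sketch below illustrates that idea only and is not the authors' implementation. Several things in it are assumptions: Python is used as the example language, `toy_encode` is a hypothetical hashed bag-of-symbols stand-in for the learned neural encoders, and `gated_fusion` is one common way to combine two vectors, since this page does not specify the paper's actual fusion mechanism.

```python
import ast
import io
import math
import tokenize
import zlib


def lexical_tokens(source: str) -> list[str]:
    """Lexical view: the token strings of the source, without layout tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER}
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type not in skip]


def syntax_node_types(source: str) -> list[str]:
    """Syntactic view: AST node type names in pre-order."""
    def visit(node):
        yield type(node).__name__
        for child in ast.iter_child_nodes(node):
            yield from visit(child)
    return list(visit(ast.parse(source)))


def toy_encode(symbols: list[str], dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a neural encoder: normalized hashed counts."""
    vec = [0.0] * dim
    for symbol in symbols:
        vec[zlib.crc32(symbol.encode("utf-8")) % dim] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def gated_fusion(h_lex, h_syn, w_lex=1.0, w_syn=1.0, bias=0.0):
    """Assumed fusion: an element-wise sigmoid gate mixes the two encodings.

    In a trained model the gate weights would be learned; here they are
    fixed scalars chosen only for illustration.
    """
    fused = []
    for a, b in zip(h_lex, h_syn):
        gate = 1.0 / (1.0 + math.exp(-(w_lex * a + w_syn * b + bias)))
        fused.append(gate * a + (1.0 - gate) * b)
    return fused


source = "def add(a, b):\n    return a + b\n"
tokens = lexical_tokens(source)      # lexical information
nodes = syntax_node_types(source)    # syntactic information
fused = gated_fusion(toy_encode(tokens), toy_encode(nodes))
```

In a full system, `fused` would be replaced by the output of trained encoders and passed to a decoder that emits the comment text. The gate makes each component of the fused vector a weighted average of the lexical and syntactic components, so neither view can be ignored entirely.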

CLC Number:

  • TP311