Abstract
Key classes serve as starting points for understanding and maintaining a software system, and defects in them pose significant security risks to the whole system. Identifying key classes can therefore improve software reliability and stability. Existing approaches typically abstract the system as a class dependency network and compute an importance score for each node from predefined metrics and calculation rules. Because these approaches are not trainable, they do not fully exploit the structural information in the software network. To address this problem, a supervised key class identification method based on graph neural networks (GNNs) is proposed. First, the software system is abstracted as a class-level software network, the network embedding method Node2Vec is used to learn a representation vector for each class node, and a fully connected layer maps each representation vector to a scalar score. Second, an improved GNN model aggregates these node scores, rather than node embeddings, taking into account the direction and weight of the dependencies between class nodes. Finally, the model outputs a final score for each class node, and the nodes are ranked in descending order of score to identify the key classes. The method is evaluated on eight open-source Java software systems against benchmark methods. The experimental results show that, among the top 10 candidate key classes, the proposed method achieves 6.4% higher recall and 3.5% higher precision than the state-of-the-art method.
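The score aggregation and ranking steps summarized in the abstract can be illustrated with a small sketch. It is not the authors' model: the initial scores are supplied by hand (in the paper they would come from Node2Vec embeddings passed through a fully connected layer), and the update rule, the `alpha` parameter, and all function names are assumptions made for illustration only. The sketch shows how scalar scores can be propagated along directed, weighted dependency edges, how classes are then ranked, and how precision and recall over the top k candidates can be computed.

```python
from collections import defaultdict


def aggregate_scores(initial, edges, alpha=0.5, rounds=1):
    """Propagate scalar importance scores along directed, weighted edges.

    initial: dict mapping class name -> initial score (hand-supplied here).
    edges:   list of (source, target, weight); `source` depends on `target`.
    alpha:   hypothetical factor scaling the contribution of dependents.
    """
    # incoming[target] lists the (source, weight) pairs pointing at target.
    incoming = defaultdict(list)
    for source, target, weight in edges:
        incoming[target].append((source, weight))

    scores = dict(initial)
    for _ in range(rounds):
        updated = {}
        for node, own_score in scores.items():
            # Weighted sum of the scores of the classes that depend on `node`.
            received = sum(
                weight * scores.get(source, 0.0)
                for source, weight in incoming[node]
            )
            updated[node] = own_score + alpha * received
        scores = updated
    return scores


def rank_classes(scores):
    """Return class names in descending score order (ties broken by name)."""
    return sorted(scores, key=lambda name: (-scores[name], name))


def precision_recall_at_k(ranked, true_key_classes, k):
    """Precision and recall of the top-k candidates against labeled key classes."""
    hits = len(set(ranked[:k]) & set(true_key_classes))
    return hits / k, hits / len(true_key_classes)


# Toy class dependency network: A and B depend on C, and C depends on D.
initial_scores = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
dependency_edges = [("A", "C", 2.0), ("B", "C", 1.0), ("C", "D", 1.0)]

final_scores = aggregate_scores(initial_scores, dependency_edges)
ranking = rank_classes(final_scores)
print(ranking)
print(precision_recall_at_k(ranking, {"A", "C"}, k=2))
```

In this toy network, class C receives score from two dependents over weighted edges, so it moves to the top of the ranking. A trainable model would learn the mapping from embeddings to scores and the aggregation parameters from labeled key classes instead of fixing them by hand.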
Authors
周纯英
曾诚
何鹏
张龑
ZHOU Chun-Ying; ZENG Cheng; HE Peng; ZHANG Yan (School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, China; School of Cyber Science and Technology, Hubei University, Wuhan 430062, China; Engineering Technology Research Center for Education Informatization of Hubei Province, Wuhan 430062, China)
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core Journal (北大核心)
2023, No. 6, pp. 2509-2525 (17 pages)
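Funding
National Natural Science Foundation of China (62102136)
Key Research and Development Program of Hubei Province (2021BAA184, 2021BAA188, 2022BAA044)
Technology Innovation Special Project of Hubei Province (2020AEA008)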
Keywords
key class identification
software network
graph neural network (GNN)
software measurement