
Identifying the Key Packages Using Weighted PageRank Algorithm (Chinese title: 基于加权PageRank算法的关键包识别方法) Cited by: 8
Abstract Identifying the key entities in a software system helps developers understand the software and helps control and reduce maintenance costs. However, existing work focuses almost entirely on identifying key classes; little has been done at other granularities such as packages, methods, and attributes. Existing work has also failed to reveal the relationship between key classes and external software quality attributes. To address these gaps, this paper proposes IDEEP (IDEntifying kEy Packages using weighted PageRank algorithm), a method for identifying key packages. IDEEP models a software system at package granularity as a weighted, directed software network, proposes a new metric, PR (PackageRank), to quantify the importance of a node from a structural perspective, and introduces a weighted PageRank algorithm to compute PR values. The experiments use six open-source Java software systems. First, the correlation between the PR values of packages and common complex-network centrality metrics (betweenness, closeness, degree, and others) is analyzed. Second, a weighted version of the SIR (Susceptible-Infectious-Recovered) model is used to examine the spreading influence of the key packages identified by PR; the comparison shows that the method outperforms six other methods. Finally, taking two of the systems as examples, the relationship between the PR values of packages and package understandability is analyzed, which further validates the method and shows that the key packages it identifies are more meaningful from a software engineering perspective.
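Source Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2014, No. 11, pp. 2174-2183 (10 pages)
Funding National 973 Key Basic Research Development Program (No. 2014CB340401); National Natural Science Foundation of China (Nos. 61202048, 61273216, 61272111); Zhejiang Provincial Natural Science Foundation (Nos. LQ12F02011, LY13F020010); Open Fund of the State Key Laboratory of Software Engineering (No. SKLSE-2012-09-21)
Keywords key package; PageRank algorithm; software network; program comprehension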
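
The abstract describes ranking packages by running a weighted PageRank over a weighted, directed package dependency network. The record does not give the paper's exact formula, so the following is only a minimal sketch of one common weighted PageRank formulation, not the authors' PR definition. The assumptions are: an edge points from a dependent package to the package it depends on, the edge weight counts the strength of that dependency, the damping factor is 0.85, and the rank of packages with no outgoing edges is redistributed uniformly. The function name `weighted_pagerank` and the package names in the example are hypothetical.

```python
from collections import defaultdict


def weighted_pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Generic weighted PageRank (an assumed formulation, not the paper's).

    edges: iterable of (source, target, weight) tuples with weight > 0,
           read as "source depends on target with this strength".
    Returns a dict mapping each node to its rank; ranks sum to 1.
    """
    out_weight = defaultdict(float)   # total outgoing weight per node
    incoming = defaultdict(list)      # node -> [(predecessor, weight), ...]
    nodes = set()
    for src, dst, w in edges:
        nodes.update((src, dst))
        out_weight[src] += w
        incoming[dst].append((src, w))

    nodes = sorted(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        new_rank = {}
        for v in nodes:
            # Each predecessor passes on rank in proportion to edge weight.
            received = sum(rank[u] * w / out_weight[u] for u, w in incoming[v])
            new_rank[v] = (1.0 - damping) / n + damping * (received + dangling / n)
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical package network: (dependent, dependency, dependency strength).
example_edges = [
    ("app", "ui", 2.0),
    ("app", "core", 5.0),
    ("ui", "core", 3.0),
    ("ui", "util", 1.0),
    ("core", "util", 4.0),
]
ranks = weighted_pagerank(example_edges)
for package in sorted(ranks, key=ranks.get, reverse=True):
    print(package, round(ranks[package], 4))
```

Under these assumptions, packages that many others depend on heavily accumulate the most rank. Reversing the edge direction would instead favor packages that depend on many others, so the direction convention should be checked against the paper before reusing this sketch.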

