Survey on Data Processing in Code Mining
Cited by: 1
Abstract: Source code carries information about its developers' original concepts, design ideas, and programming habits. Applying data mining to these coding traces in order to extract hidden but useful knowledge is a new and promising research field. Because current mining programs cannot directly process source code in its textual form, researchers need to abstract the code into a more effective intermediate representation that serves as the mining target. This intermediate representation not only constrains which mining algorithms can be used but, more importantly, determines what knowledge can be extracted. This paper describes the general process of code mining and focuses on the characteristics of the various intermediate representations. On this basis, it identifies the current problems in code mining and points out directions for future development.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 11, pp. 2121-2128 (8 pages)
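Funding: National Natural Science Foundation of China (60873213, 60703103); Beijing Natural Science Foundation (4082018); National High Technology Research and Development Program of China ("863" Program) (2007AA01Z414)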
Keywords: data mining; knowledge pattern; preprocessing; vulnerability detection
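
To make the abstract's central idea concrete, the following minimal sketch abstracts source code into an intermediate form and then mines it. It is an illustration only and is not taken from the paper: the choice of intermediate form (the set of function names each function calls), the pair-counting miner, the helper names `call_itemsets` and `frequent_pairs`, and the `SAMPLE` input are all assumptions made for this example.

```python
import ast
from collections import Counter
from itertools import combinations

# Hypothetical sample input: three functions, two of which pair lock() with unlock().
SAMPLE = '''
def a():
    lock()
    work()
    unlock()

def b():
    lock()
    unlock()

def c():
    lock()
    log()
'''


def call_itemsets(source):
    """Intermediate form (assumed): map each function name to the set of
    plainly named functions it calls, discarding all other syntax."""
    tree = ast.parse(source)
    itemsets = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = {
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
            itemsets[node.name] = frozenset(calls)
    return itemsets


def frequent_pairs(itemsets, min_support):
    """Count how many functions contain each pair of callees and keep the
    pairs that occur in at least min_support functions."""
    counts = Counter()
    for items in itemsets.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    print(frequent_pairs(call_itemsets(SAMPLE), min_support=2))
```

With this representation the miner can only discover co-occurrence patterns such as "lock and unlock are called together"; ordering or data-flow knowledge is lost at the abstraction step, which is the sense in which the intermediate form determines what can be mined. Function `c` in the sample calls `lock` without `unlock`, the kind of deviation that a rule-violation check could then flag.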