Automated debug for common information model defect using natural language processing algorithm

Cited by: 1
Abstract: The Common Information Model (CIM) is an open industry standard that has been implemented in many products, and a large number of bugs have been found and fixed in those implementations. To reduce the time and effort spent manually locating root causes, this paper proposes a method based on natural language processing for automatically debugging CIM bugs. The method first segments the textual descriptions of resolved bugs into words using a maximum entropy model. It then applies simHash over a purpose-built dictionary to find the fixed bugs that are most similar (near-duplicate). Finally, it applies text-processing techniques to the trace supplied by the customer to locate the problem and its solution. In experiments the method achieved 87.5% accuracy, which indicates that it is effective.
Author: Xiang Wei (项炜)
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journals, 2013, No. 5, pp. 1446-1449 (4 pages)
Funding: Youth Fund Project of the Sichuan Provincial Department of Education (11ZB134)
Keywords: Common Information Model (CIM); natural language processing; maximum entropy model; debugging; text processing
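
The abstract names simHash as the step that finds near-duplicate fixed bugs but gives no implementation details. The following is a minimal sketch of the general simHash technique, not the paper's own code. It assumes bug descriptions have already been segmented into tokens. The 64-bit fingerprint, the use of MD5 as the per-token hash, the term-frequency weights, and all function names are illustrative assumptions.

```python
import hashlib
from collections import Counter


def simhash(tokens, bits=64):
    """Return a simHash fingerprint for a list of tokens.

    Assumption: each token is weighted by its frequency; the paper's
    dictionary-based weighting is not described in the abstract.
    """
    weights = Counter(tokens)
    acc = [0] * bits
    for token, weight in weights.items():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Add the weight where the token's hash bit is 1, subtract where it is 0.
            if (h >> i) & 1:
                acc[i] += weight
            else:
                acc[i] -= weight
    fingerprint = 0
    for i in range(bits):
        if acc[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def most_similar(query_tokens, fixed_bugs):
    """Return the id of the fixed bug whose fingerprint is closest to the query.

    fixed_bugs is a hypothetical mapping of bug id -> token list.
    """
    q = simhash(query_tokens)
    return min(fixed_bugs, key=lambda bug_id: hamming(q, simhash(fixed_bugs[bug_id])))
```

A smaller Hamming distance between two fingerprints suggests more similar descriptions. In practice the fingerprints of fixed bugs would be computed once and stored rather than recomputed for every query.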