Evolution Knowledge Tree for Services Computing Domain in Wikipedia (cited by: 3)

Abstract: Existing knowledge trees have three problems: their knowledge hotspots are not prominent, their knowledge classification is inaccurate, and their structure keeps evolving. To address these, this paper works on the data-intensive "services computing" domain of the Chinese Wikipedia database and proposes an extended Chinese word segmentation algorithm that extracts and classifies multiple kinds of topic knowledge together with their structural information. Drawing on documents from the services computing domain, it proposes DKHM (Document-Themes-Hotspot Model), an improvement on LDA. The data set is sampled with the Gibbs sampling algorithm, and ambiguous classifications of the original entries are eliminated, in order to build an evolution knowledge tree. The experimental results show that DKHM-based clustering is more accurate than general Bayesian clustering, and that the hotspots found by clustering match the real hotspots at a rate above 60%. This supports the conclusion that the evolution knowledge tree has a more reasonable structure than Wikipedia's original knowledge tree and shows hotspot trends more clearly.
Source: Journal of Wuhan University: Natural Science Edition (《武汉大学学报(理学版)》), indexed in CAS, CSCD, and PKU Core; 2015, No. 4, pp. 331-338 (8 pages)
Funding: Supported by the National Key Basic Research Program of China (973 Program), grant 2014CB340404
Keywords: DKHM (document-themes-hotspots model); Gibbs sampling; evolution knowledge tree; Wikipedia hotspot
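
The abstract says that DKHM is an improvement on LDA and that the data set is sampled with Gibbs sampling, but it does not give the model's update rules. As a rough illustration only, the sketch below shows collapsed Gibbs sampling for standard LDA, which is the baseline technique and not the paper's DKHM. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the integer-encoded toy corpus are all assumptions made for this example.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative, not DKHM).

    docs is a list of documents, each a list of integer word ids
    in the range [0, vocab_size).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assign = []                                                 # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Full conditional p(z = t | all other assignments),
                # known only up to a normalising constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled topic.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimate of each document's topic mixture.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, doc_topic, topic_word


if __name__ == "__main__":
    # Toy corpus: word ids 0-1 and 2-3 tend to occur in different documents.
    toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1]]
    theta, _, _ = lda_gibbs(toy_docs, num_topics=2, vocab_size=4, iters=50)
    for row in theta:
        print([round(p, 3) for p in row])
```

Each row of `theta` is one document's estimated topic mixture. A hotspot layer such as the one the abstract describes would need to be built on top of these topic assignments, and how DKHM does that is not stated in this record.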