
Research on Automatic Clustering of XML Documents (XML文档自动聚类研究) Cited by: 16

Research on XML Documents Cluster
Abstract Building on text clustering, this paper studies the automatic clustering of XML documents. It adapts the partitional and hierarchical clustering methods so that they suit XML document clustering, presents three methods for computing the similarity between documents (element comparison, edge-set comparison, and edit distance), and tests and analyzes them on real data.
Author Pan Youneng (潘有能)
Source Journal of the China Society for Scientific and Technical Information (《情报学报》), CSSCI, Peking University Core Journal, 2006, No. 2, pp. 215-220 (6 pages)
Funding This paper is a research outcome of the Zhejiang University "Shuguang" (曙光) Youth Project "Research on XML-Based Web Log Mining" (No. 205000-362221).
Keywords data mining; text clustering; XML
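
The abstract names element comparison and edge-set comparison as two ways of scoring similarity between XML documents, but this record does not reproduce the paper's formulas. The sketch below is therefore only one plausible reading, not the author's method: it assumes each document is reduced to a set of tag names (elements) or a set of parent-child tag pairs (edges), and that similarity is the Jaccard coefficient of those sets. The function names `element_set`, `edge_set`, `jaccard`, `element_similarity`, and `edge_similarity` are hypothetical, and the edit distance method is not covered.

```python
import xml.etree.ElementTree as ET


def element_set(root):
    """Return the set of distinct tag names in the tree (assumed representation)."""
    return {node.tag for node in root.iter()}


def edge_set(root):
    """Return the set of distinct (parent tag, child tag) pairs (assumed representation)."""
    return {(parent.tag, child.tag) for parent in root.iter() for child in parent}


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def element_similarity(xml_a, xml_b):
    """Similarity based only on which element names occur in each document."""
    return jaccard(element_set(ET.fromstring(xml_a)), element_set(ET.fromstring(xml_b)))


def edge_similarity(xml_a, xml_b):
    """Similarity based on which parent-child relationships occur in each document."""
    return jaccard(edge_set(ET.fromstring(xml_a)), edge_set(ET.fromstring(xml_b)))


# Two toy documents with the same element names but different nesting.
doc_a = "<book><title/><author><name/></author></book>"
doc_b = "<book><title/><name/><author/></book>"

if __name__ == "__main__":
    print(element_similarity(doc_a, doc_b))
    print(edge_similarity(doc_a, doc_b))
```

Under these assumptions the two toy documents are indistinguishable by element names alone, while the edge-based score drops because `name` hangs under `author` in one document and under `book` in the other. This illustrates why a structure-aware measure can separate documents that a purely vocabulary-based one cannot.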
