Similarity Algorithm Based on Schema of XML Document (基于模式的XML文档相似度算法)

Cited by: 2

Abstract: This paper proposes a document similarity algorithm based on XML schemas. Similarity between XML schemas is an important basis for clustering XML documents. Elements are the main body of an XML schema, so schema similarity is composed of element similarities. The algorithm considers both the structural and the semantic information of schema elements, which improves the accuracy of the similarity computation. Because similarity is computed between XML schemas, the approach also reduces algorithmic complexity, improves clustering accuracy, and makes it easy to extract a common XML schema for each cluster.
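
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each schema element is described by its tag name and its root-to-element tag path, that name similarity from `difflib.SequenceMatcher` stands in for a real semantic measure, that path similarity stands in for the structural measure, and that the two are mixed by a hypothetical weight `alpha`. The names `Element`, `element_sim`, and `schema_sim`, and the best-match averaging, are illustrative choices.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Element:
    """A schema element: its tag name and the tag path from the root (assumed model)."""
    name: str
    path: tuple  # e.g. ("book", "author")


def semantic_sim(a: Element, b: Element) -> float:
    # Assumption: plain string similarity of tag names stands in for a
    # lexicon-based semantic measure.
    return SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()


def structural_sim(a: Element, b: Element) -> float:
    # Assumption: structure is compared via the root-to-element tag paths,
    # treated as sequences of tags.
    return SequenceMatcher(None, a.path, b.path).ratio()


def element_sim(a: Element, b: Element, alpha: float = 0.5) -> float:
    # Weighted mix of the semantic and structural parts; alpha is hypothetical.
    return alpha * semantic_sim(a, b) + (1 - alpha) * structural_sim(a, b)


def schema_sim(s1: list, s2: list, alpha: float = 0.5) -> float:
    """Schema similarity as the symmetric average of best-match element similarities."""
    if not s1 or not s2:
        return 0.0

    def directed(src, dst):
        # For each element in src, take its best match in dst, then average.
        return sum(max(element_sim(e, f, alpha) for f in dst) for e in src) / len(src)

    return (directed(s1, s2) + directed(s2, s1)) / 2


# Toy schemas, invented purely for illustration.
book = [Element("book", ("book",)),
        Element("title", ("book", "title")),
        Element("author", ("book", "author"))]
article = [Element("article", ("article",)),
           Element("title", ("article", "title")),
           Element("writer", ("article", "writer"))]
order = [Element("order", ("order",)),
         Element("price", ("order", "item", "price"))]

if __name__ == "__main__":
    print("book vs article:", round(schema_sim(book, article), 3))
    print("book vs order:  ", round(schema_sim(book, order), 3))
```

With these toy inputs, the `book` schema scores higher against `article` than against `order`, because the two share a `title` element in a similar position. Comparing one schema per document class, instead of every pair of documents, is what would keep the number of comparisons small during clustering.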

Authors: Sun Xia (孙霞), Cheng Hongbin (程宏斌)

Affiliation: School of Computer Science and Engineering, Changshu Institute of Technology

Source: Computer Engineering (《计算机工程》), 2010, No. 21, pp. 54-56 (3 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).

Keywords: schema; similarity; structure; semantics; Extensible Markup Language (XML)

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
