Extracting Schema from Semistructured Data
Abstract: Schema extraction is an important problem in semistructured data research. Drawing on the concepts of homo-object sets (sets of objects of the same kind) and label paths, this paper proposes a new method for extracting a schema from data represented in OEM (the Object Exchange Model). The algorithm works in two steps: first, it finds all homo-object sets in the OEM data; second, it constructs a schema table from them. The method extracts schemas not only from hierarchically structured data but also from OEM data that contains cycles, which some other algorithms cannot handle.
Source: Computer Engineering and Applications (《计算机工程与应用》), 2006, No. 27, pp. 162-165 (4 pages). Indexed in CSCD and the Peking University Core Journals list.
Keywords: semistructured data; OEM; homo-object; schema table; schema extraction
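
The abstract gives only the outline of the method (find homo-object sets, then build a schema table), so the following is a minimal sketch under stated assumptions, not the paper's algorithm. It assumes a homo-object set can be approximated as the set of objects reached from the root along the same label path, and it builds the table with a DataGuide-style subset construction. The names `oem` and `extract_schema` and the sample data are hypothetical. Because each object set is recorded the first time it is seen, a cycle in the data leads back to an existing row and the loop terminates.

```python
from collections import defaultdict

# Hypothetical OEM data: object id -> list of (label, child id) edges.
# Atomic objects have no outgoing edges. Note the cycle e1 -> c1 -> e1.
oem = {
    "root": [("company", "c1"), ("company", "c2")],
    "c1": [("name", "n1"), ("employee", "e1")],
    "c2": [("name", "n2")],
    "e1": [("name", "n3"), ("worksFor", "c1")],
    "n1": [], "n2": [], "n3": [],
}


def extract_schema(oem, root):
    """Return a schema table: row id -> (object set, {label: target row id}).

    Assumption: objects reached by the same label path form one set.
    """
    start = frozenset([root])
    row_of = {start: 0}          # object set -> row id (memo that stops cycles)
    table = {0: (start, {})}
    pending = [start]
    while pending:
        objs = pending.pop()
        # Group the children of every object in the set by edge label.
        by_label = defaultdict(set)
        for obj in objs:
            for label, child in oem[obj]:
                by_label[label].add(child)
        for label, children in sorted(by_label.items()):
            target = frozenset(children)
            if target not in row_of:     # unseen set: add a new table row
                row_of[target] = len(row_of)
                table[row_of[target]] = (target, {})
                pending.append(target)
            table[row_of[objs]][1][label] = row_of[target]
    return table


table = extract_schema(oem, "root")
for row_id, (objs, edges) in sorted(table.items()):
    print(row_id, sorted(objs), edges)
```

Each printed row lists one object set and the rows its labels lead to. The number of rows is bounded by the number of distinct object sets, so the construction always finishes, although in the worst case that bound is exponential in the number of objects.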