Schema Inference from XML Data: A Review
Abstract: This paper surveys the state of the art in schema inference from XML (Extensible Markup Language) data. First, the theoretical models of different schema languages are presented from the perspective of regular tree grammars. Existing work on XML schema inference is then classified, summarized, and compared along several dimensions, including inference method, target schema language, supported expressiveness, and the type of regular expression corresponding to the content models. Research progress on inferring the basic semantic integrity constraints supported by schema languages is also introduced. Finally, the paper points out shortcomings of current research and discusses directions that need further study. The emphasis is on summarizing, comparing, and analyzing the mainstream methods and recent progress in XML schema inference, in the hope of benefiting subsequent research.
Authors: Zheng Lixiao (郑黎晓), Wang Cheng (王成)
Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2016, No. 2, pp. 461-471 (11 pages)
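Funding: Huaqiao University Research Start-up Fund (No. 12BS215); National Natural Science Foundation of China (No. 51305142, No. 61502184); Natural Science Foundation of Fujian Province (No. 2015J01259)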
Keywords: Extensible Markup Language (XML); schema inference; regular tree grammar; regular expression