
Research on Automatic Extraction of Document Information Based on Hypertext Markup Language (cited by: 4)
Abstract: This paper explores how to combine document decomposition (the study of document structure), document markup (using Extensible Markup Language (XML), Hypertext Markup Language (HTML), and Scalable Vector Graphics (SVG)), and a faceted classification mechanism. Document content extraction is implemented programmatically in Java. The automatic extraction technology of document information (AETDI) developed in this research shows that information providers can deliver document content to information users (including engineers) in a more accessible form.
Authors: SHE Jun (佘俊); YU Shao-feng (余少锋); ZHOU Yu-peng (周宇鹏); LIAO Chong-yang (廖崇阳); LUO Yong (罗勇) (Information & Communication Branch of China Southern Power Grid Peaking & Frequency Modulation Power Generation Co., Ltd., Guangzhou, Guangdong 511400, China; Western Maintenance Test Branch of China Southern Power Grid Peaking & Frequency Modulation Power Generation Co., Ltd., Xingyi, Guizhou 562400, China)
Source: Adhesion (《粘接》), CAS, 2020, Issue 8, pp. 80-84 (5 pages)
Funding: Science and Technology Project of China Southern Power Grid Peaking & Frequency Modulation Power Generation Co., Ltd. (STKJXM20180065).
Keywords: automatic extraction of document information; hypertext markup language; decomposition scheme; document markup; faceted classification