Extraction and Integration of Web Data Based on Domain Model
Original title: 基于领域模型的Web数据抽取与集成
Cited by: 4
Abstract: Web data extraction and integration aims to provide domain-oriented value-added services. Drawing on the characteristics of domain data, the paper proposes a Web data schema and a domain data model. It presents algorithms for locating and extracting data based on the Web data schema and for integrating data based on the domain data model, and validates the model and algorithms against the requirements of an industry domain.
Source: Microelectronics & Computer (《微电子学与计算机》), CSCD, Peking University Core Journal, 2012, No. 9, pp. 152-156 (5 pages)
Funding: Natural Science Foundation of Liaoning Province (20071004)
Keywords: Web data model; Web data schema; domain data model; data extraction and integration