On the Application of Web Extraction System into the Ceramic E-commerce Website

(Original title: 基于陶瓷类电子商务网站的Web信息抽取系统的研究)
Abstract Among the ways of acquiring information from the Internet, Web information extraction differs from search engines in that it yields more precise and finer-grained information. After analyzing the current state of Web information extraction technology in China and abroad, this paper proposes a technical route for extracting ceramic product information from the Web, formulates extraction rules, and develops an extraction system that obtains the relevant ceramic product information.
Author Zhan Muqing (詹沐清)
Source 《电脑知识与技术》 (Computer Knowledge and Technology), 2014, No. 8X, pp. 5799-5802 (4 pages)
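Funding Youth Science Fund Project of the Jiangxi Provincial Department of Education, "Application of Web Information Extraction Technology in Ceramic E-commerce" (Project No. GJJ13623); principal investigator: Zhan Muqing (詹沐清)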
Keywords Web extraction; ceramic product information; Web information extraction