基于返回结果的Deep Web查询接口识别 (Recognition of Deep Web Query Interfaces Based on Returned Results). Cited by: 1

Recognized Query Interface of Deep Web Based on Response Pages

Abstract: The Internet holds a great deal of valuable information, but traditional search engines index only static, hyperlinked pages and cannot reach Deep Web data, which is produced dynamically in response to a form submission. Because this data can be obtained only by submitting a query through a form, discovering and recognizing Deep Web query interfaces has attracted considerable attention. Based on an analysis of the intrinsic relationship between the presentation of a form and its function, this paper proposes an abstract model of forms and uses it to filter out forms that are not Deep Web query interfaces. The remaining candidates are then recognized as Deep Web query interfaces by analyzing the structure of the response pages they return. Experimental results demonstrate the effectiveness of the approach.
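As a rough illustration of the form-filtering step described in the abstract, the following Python sketch collects the forms on a page and discards those that are unlikely to be query interfaces. The abstract does not spell out the form model, so the specific rules here (reject forms with password or file-upload fields, require at least one user-fillable field) and the names `FormCollector`, `is_candidate_query_form`, and `candidate_forms` are assumptions made for this example, not the authors' method.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collect, for each <form>, its method and the types of its controls."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            method = (attrs.get("method") or "get").lower()
            self._current = {"method": method, "controls": []}
        elif self._current is not None:
            if tag == "input":
                # An <input> without a type attribute defaults to "text".
                self._current["controls"].append((attrs.get("type") or "text").lower())
            elif tag in ("select", "textarea"):
                self._current["controls"].append(tag)

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def is_candidate_query_form(form):
    """Hypothetical pre-filter: drop login/upload forms, require a fillable field."""
    controls = form["controls"]
    if "password" in controls or "file" in controls:
        return False
    return any(c in ("text", "search", "select", "textarea") for c in controls)


def candidate_forms(html):
    """Return the forms in `html` that pass the pre-filter."""
    parser = FormCollector()
    parser.feed(html)
    return [f for f in parser.forms if is_candidate_query_form(f)]
```

Forms that pass such a filter would still need to be confirmed by submitting a query and analyzing the returned page, which is the step the paper relies on for the final decision.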

Authors: Zhou Aiwu (周爱武), Li Yumei (李玉梅), Zhou Shanshan (周闪闪), Wang Baotong (王宝铜)

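Affiliations: School of Computer Science and Technology, Anhui University; Key Laboratory of Intelligent Computing and Signal Processing (Ministry of Education), Anhui University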

Source: Computer Technology and Development (《计算机技术与发展》), 2009, No. 7, pp. 117-119, 123 (4 pages)

Funding: Natural Science Foundation of Anhui Province (070412051)

Keywords: Deep Web; query interface; post-query; form

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

References (8)

1. Bergman M K. The Deep Web: Surfacing hidden value [EB/OL]. 2001-09-24. http://www.brightplanet.com.
2. Ma Jun, Song Ling, Han Xiaohui, Yan Po. Deep Web database classification based on the context of Web pages [J]. Journal of Software, 2008, 19(2): 267-274. Cited by: 31
3. Cope J, Craswell N, Hawking D. Automated Discovery of Search Interfaces on the Web [C]//Conferences in Research and Practice in Information Technology. Australia: Australian Computer Society, 2003.
4. Lin Peiguang, Xu Ruzhi, Hong Zhimin, et al. Finding the WDB's Query Interface in Deep Web Automatically [C]//2008 International Conference on Internet Computing in Science and Engineering. Washington, DC, USA: IEEE Computer Society, 2008: 195-200.
5. Yuan Liu, Li Zhanhuai, Chen Shiliang. Ontology-based annotation of Deep Web data [J]. Journal of Software, 2008, 19(2): 237-245. Cited by: 28
6. Su Zhihua, Yang Dongqing, Tang Shiwei, Wang Tengjiao. Information integration based on structure analysis and entity recognition [J]. Journal of Computer Research and Development, 2004, 41(10): 1823-1828. Cited by: 5
7. Raghavan S, Garcia-Molina H. Crawling the Hidden Web [C]//International Conference on Very Large Data Bases (VLDB). Rome: Morgan Kaufmann Publishers, 2001: 129-138.
8. Zheng Dongdong, Cui Zhiming. Deep Web query interface selection [J]. Journal of Computer Applications, 2006, 26(9): 2024-2027. Cited by: 6

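Citing literature (1)

1. Wang Hong, Yu Jianqiao. N-Gram-based attribute extraction for Deep Web interfaces [J]. Computer and Modernization (《计算机与现代化》), 2010(12): 135-138. Cited by: 1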
