Abstract
When users retrieve data from a back-end database through a web query interface, the number of tuples returned per query is limited, so each query exposes only part of the hidden database. Existing search engine technology also struggles to crawl the full contents of a hidden database. To address this, this paper proposes a sort-and-partition algorithm based on numeric attributes, targeting the numeric attribute types of back-end hidden databases. The algorithm retrieves all tuples of the hidden database with relatively few queries. A theoretical analysis of the algorithm's query cost is given, and experiments verify its effectiveness.
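The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of crawling a top-k-limited interface over one numeric attribute; it is not the authors' method. It assumes a hypothetical interface that accepts a closed range, returns at most k tuples sorted ascending by the attribute, and flags overflow, and it assumes fewer than k tuples share any single value. The names `make_interface`, `crawl`, and the `(value, id)` row schema are invented for this example.

```python
from typing import Callable, List, Set, Tuple

Row = Tuple[int, str]  # (numeric attribute value, tuple id); hypothetical schema
Query = Callable[[int, int], Tuple[List[Row], bool]]


def make_interface(rows: List[Row], k: int) -> Query:
    """Simulate a hidden database behind a top-k range-query interface."""
    ordered = sorted(rows)  # ascending by attribute value, then by id

    def query(lo: int, hi: int) -> Tuple[List[Row], bool]:
        matches = [r for r in ordered if lo <= r[0] <= hi]
        # Only the first k matches are visible; the flag signals truncation.
        return matches[:k], len(matches) > k

    return query


def crawl(query: Query, lo: int, hi: int) -> Tuple[Set[Row], int]:
    """Collect every tuple in [lo, hi]; return the tuples and the query count."""
    seen: Set[Row] = set()
    queries = 0
    while True:
        page, overflow = query(lo, hi)
        queries += 1
        seen.update(page)
        if not overflow:
            return seen, queries
        # The page is a sorted prefix, so every tuple below its largest value
        # has been returned; tuples equal to that value may be cut off, so the
        # next range restarts at that value (the set absorbs the repeats).
        last = page[-1][0]
        if last == lo:
            raise ValueError(f"k or more tuples share value {last}; cannot advance")
        lo = last


values = [5, 1, 9, 3, 3, 7, 2, 8, 6, 4]
rows = [(v, f"t{i}") for i, v in enumerate(values)]
crawled, n_queries = crawl(make_interface(rows, k=3), lo=0, hi=10)
print(len(crawled), "tuples in", n_queries, "queries")
```

Under these assumptions each overflowing query retrieves at least one new distinct value, so the loop terminates; the query count depends on k and on how many tuples tie at each page boundary.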
Authors
孙阳
李贵
韩子扬
李征宇
孙平
SUN Yang; LI Gui; HAN Zi-yang; LI Zheng-yu; SUN Ping (Faculty of Information & Control Engineering, Shenyang Jianzhu University, Shenyang 110168, China)
Source
《信息工程期刊(中英文版)》
2016, No. 1, pp. 1-8 (8 pages)
Scientific Journal of Information Engineering