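Abstract: 1 Introduction. The World Wide Web is currently the largest information system in the world. Searching for Web documents on the WWW relies mainly on index information systems on the Internet, such as Yahoo, Infoseek, AltaVista, WebCrawler, Excite, and Lycos. Because the WWW is very large and poorly structured, and because Web servers are autonomous, Web document search can hardly be both comprehensive and precise. The quality of Web document search is measured mainly in two respects: (1) whether all relevant document resources can be found, with none omitted.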
Funding: Project supported by the National Natural Science Foundation of China (No. 60605020) and the National High-Tech R&D Program (863) of China (Nos. 2006AA01Z320 and 2006AA010105)
Abstract: Recently, we designed a new experimental system, MSearch, a cross-media meta-search system built on the database of the WikipediaMM task of ImageCLEF 2008. For a meta-search engine, the core problem is how to merge the results from multiple member search engines into a more effective ranked list. This paper presents a novel fusion model based on supervised learning. The model uses ranking SVM to train the fusion weight for each member search engine. We associate the fusion weight of each member search engine with a feature of a result document returned by the meta-search engine. For a returned result document, we first build a feature vector to represent the document, setting the value of each feature to the document's score returned by the corresponding member search engine. We then construct a training set from the documents returned by the meta-search engine to learn the fusion parameters. Finally, we use a linear fusion model based on the overlap set to merge the result sets. Experimental results show that our approach significantly improves the performance of the cross-media meta-search system (MSearch) and outperforms many existing fusion methods.
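The linear fusion step described in this abstract can be illustrated with a minimal Python sketch. It assumes the per-engine fusion weights have already been learned (the ranking SVM training is not shown), and the engine names, document ids, scores, and weight values are hypothetical. A document missing from an engine's result list simply contributes nothing for that engine, which is a simplification and not necessarily how the paper handles the overlap set.

```python
from typing import Dict, List, Tuple


def fuse_linear(engine_scores: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Merge per-engine scores into one ranked list by a weighted sum.

    engine_scores maps engine name -> {doc_id: score}.
    weights maps engine name -> fusion weight (assumed to be learned elsewhere).
    """
    fused: Dict[str, float] = {}
    for engine, scores in engine_scores.items():
        w = weights.get(engine, 0.0)  # engines without a weight are ignored
        for doc_id, score in scores.items():
            fused[doc_id] = fused.get(doc_id, 0.0) + w * score
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical member engines and scores, for illustration only.
engine_scores = {
    "text_engine": {"d1": 0.9, "d2": 0.4, "d3": 0.1},
    "image_engine": {"d2": 0.8, "d3": 0.7},
}
weights = {"text_engine": 0.6, "image_engine": 0.4}  # assumed, not learned here

ranking = fuse_linear(engine_scores, weights)
for doc_id, score in ranking:
    print(doc_id, round(score, 2))
```

In this toy input, d2 is returned by both engines and ends up ranked above d1, which only the text engine returned with a high score.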
Abstract: With the explosive growth of information on the network, it is increasingly difficult for people to find information. Web search engines alleviate this problem to some degree. However, because different search engines use different mechanisms, scopes, and algorithms, the overlap among their results for the same query is no more than 34%. To obtain relatively comprehensive and accurate results, multiple search engines must be used, which gave rise to meta search engines. This paper surveys meta search engines. First, their history, principles, and components are discussed. Then the related evaluation criteria are analyzed and several typical meta search engines are compared. Finally, on this basis, the development trends of meta search engines are introduced.
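Abstract: Test case prioritization (TCP) is a hot topic in regression testing research. Given a specific prioritization criterion, TCP orders test cases to optimize their execution order so as to maximize a prioritization objective, for example the rate of early fault detection of the test suite. TCP is especially suited to testing scenarios in which the testing budget is insufficient to execute all test cases. This survey first describes the TCP problem and classifies existing TCP techniques from three perspectives in turn: source code, requirements, and models. It then summarizes existing research on a special class of TCP problems, namely test-resource-aware TCP. Next, it summarizes the evaluation metrics and evaluation datasets commonly used in empirical studies, and the influence of fault types on the conclusions of those studies. It then introduces applications of TCP techniques in specific testing domains, including combinatorial testing, testing of event-driven applications, Web service testing, and fault localization. Finally, it discusses directions for future work.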
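The idea of "early fault detection rate" can be made concrete with a small Python sketch. The abstract does not name a specific metric; the sketch below assumes APFD (Average Percentage of Faults Detected), a widely used measure in the TCP literature, defined as APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TF_i the position of the first test that detects fault i. The test names and the fault matrix are hypothetical.

```python
from typing import Dict, List, Set


def apfd(order: List[str], detects: Dict[str, Set[str]]) -> float:
    """Compute APFD for one test execution order.

    order   -- test case ids in execution order
    detects -- test case id -> set of fault ids that the test detects
    Only faults detected by at least one test in `order` are counted.
    """
    first_pos: Dict[str, int] = {}
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, set()):
            # Record only the earliest position that reveals each fault.
            first_pos.setdefault(fault, position)
    n, m = len(order), len(first_pos)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one detected fault")
    return 1.0 - sum(first_pos.values()) / (n * m) + 1.0 / (2 * n)


# Hypothetical fault matrix, for illustration only.
detects = {"t1": {"f1"}, "t2": {"f1", "f2"}, "t3": {"f3"}}

early = apfd(["t2", "t3", "t1"], detects)  # faults are revealed sooner
late = apfd(["t1", "t3", "t2"], detects)   # f2 is revealed only by the last test
print(round(early, 3), round(late, 3))
```

An order that reveals faults sooner yields a higher APFD, which is the objective a prioritization technique tries to maximize when the budget may not allow running every test.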