As data grows in size, search engines face new challenges in extracting more relevant content for users’ searches. As a result, a number of retrieval and ranking algorithms have been employed to ensure that the results are relevant to the user’s requirements. Unfortunately, most existing indexes and ranking algorithms crawl documents and web pages based on a limited set of criteria designed to meet user expectations, making it impossible to deliver exceptionally accurate results. This study therefore investigates and analyses how search engines work, as well as the elements that contribute to higher ranks. This paper addresses the issue of bias by proposing a new ranking algorithm based on the PageRank (PR) algorithm, which is one of the most widely used page ranking algorithms. We propose weighted PageRank (WPR) algorithms to test the relationship between these various measures. The Weighted PageRank (WPR) model was used in three distinct trials to compare the rankings of documents and pages based on one or more user preference criteria. The findings showed that using multiple criteria to rank final pages is better than using only one, and that some criteria had a greater impact on ranking results than others.
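The abstract does not state the WPR formula, so the following is only a minimal sketch of the general idea: several per-page criteria are combined into one weight, and each page passes its rank to its link targets in proportion to those weights. The function names, the criteria (`relevance`, `freshness`), and the criterion weights are hypothetical and are not taken from the paper.

```python
def combine_criteria(criteria, criterion_weights):
    """Collapse several per-page criterion scores into one positive weight (assumed linear mix)."""
    return {
        page: sum(criterion_weights[name] * value for name, value in scores.items())
        for page, scores in criteria.items()
    }


def weighted_pagerank(links, page_weight, damping=0.85, iterations=50):
    """Power iteration in which a page shares its rank among targets by target weight.

    links: dict mapping every page to the list of pages it links to
    page_weight: dict mapping every page to a positive weight
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source in pages:
            targets = links[source]
            if not targets:
                # Dangling page: spread its rank evenly over all pages.
                for page in pages:
                    new_rank[page] += damping * rank[source] / n
                continue
            total = sum(page_weight[t] for t in targets)
            for t in targets:
                new_rank[t] += damping * rank[source] * page_weight[t] / total
        rank = new_rank
    return rank


# Toy graph: page "a" links to "b" and "c"; both link back to "a".
links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
criteria = {
    "a": {"relevance": 0.5, "freshness": 0.5},
    "b": {"relevance": 0.9, "freshness": 0.6},
    "c": {"relevance": 0.3, "freshness": 0.2},
}
criterion_weights = {"relevance": 0.7, "freshness": 0.3}
page_weight = combine_criteria(criteria, criterion_weights)
ranks = weighted_pagerank(links, page_weight)
```

In this toy graph, "b" and "c" receive rank only from "a", so the page with the larger combined weight ends up ranked higher. Changing `criterion_weights` is one way to probe how much a single criterion influences the final order.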
Web services are among the basic network services, and evaluating their availability is of great significance for improving users’ experience. This paper focuses on the problem of availability evaluation of Web services and proposes a method that uses improved grey correlation analysis with entropy difference and weight (EWGCA). The method is based on grey correlation analysis; it uses entropy difference to illustrate changes in availability and sets weights to quantify the availability requirements of different operations or transactions in services. A simulation experiment in high-load scenarios for Web services shows that the method can accurately provide a hierarchical description and overall evaluation of Web service availability when test samples are small or data are uncertain, even in the field of big data.
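The abstract gives no formulas for EWGCA. As a rough illustration of the ingredients it names, the sketch below uses the textbook grey relational coefficient and the common entropy weight method as stand-ins; the paper's "entropy difference" may be defined differently. The function names, the sample data, and the distinguishing coefficient `rho = 0.5` are assumptions.

```python
import math


def entropy_weights(rows):
    """Entropy weight method: indicators whose values vary more get larger weights.

    rows: list of samples, each a list of positive indicator values.
    Assumes at least two samples and at least one non-uniform indicator.
    """
    m = len(rows)
    n = len(rows[0])
    diversities = []
    for k in range(n):
        column = [row[k] for row in rows]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(m)  # normalise entropy to [0, 1]
        diversities.append(1.0 - entropy)
    scale = sum(diversities)
    return [d / scale for d in diversities]


def grey_relational_grades(reference, rows, weights, rho=0.5):
    """Weighted grey relational grade of each row against the reference sequence.

    Assumes at least one row differs from the reference (otherwise the
    denominator below would be zero).
    """
    deltas = [[abs(r - x) for r, x in zip(reference, row)] for row in rows]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(w * c for w, c in zip(weights, coeffs)))
    return grades


# Hypothetical normalised availability indicators for two test runs.
reference = [1.0, 1.0, 1.0]  # ideal values
samples = [[0.9, 0.8, 1.0], [0.5, 0.6, 0.7]]
weights = entropy_weights(samples)
grades = grey_relational_grades(reference, samples, weights)
```

The first sample is closer to the ideal sequence on every indicator, so it receives the higher grade regardless of how the weights are distributed.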
Automatic web page classification has become essential for web directories because of the multitude of web pages on the World Wide Web. In this paper, an improved term weighting technique is proposed for automatic and effective classification of web pages. The web documents are represented as sets of features. The proposed method selects and extracts the most prominent features, reducing the classifier's high-dimensionality problem. Proper selection of features from the large set improves the performance of the classifier. The proposed algorithm is implemented and tested on a benchmark dataset. The results show better performance than most existing term weighting techniques.
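A single Web service can rarely meet the needs of real applications, so how to compose existing services into new ones has become a research hotspot in this field. Current composition methods seldom consider quality of service (QoS). For Web services that offer similar functionality, QoS is the key factor in deciding whether to select a service, and the quality of the composite service must satisfy the user's requirements. Based on the service development approach of SOA and addressing the problems of current service composition, this paper proposes a QoS-based service composition method and gives a strategy for building QoS-based Web service compositions and selecting the best services, integrating the quality of the individual services to obtain the best overall quality for the final composite service. The method satisfies the user's functional requirements for the composite service as well as the user's QoS requirements, thereby optimizing the requested service.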
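The abstract does not say how individual QoS values are integrated. One common convention for a sequential composition, assumed here, is that response time and cost add up while availability multiplies. The sketch below enumerates one candidate per task, discards compositions that violate the user's limits, and keeps the fastest one. The service names, QoS figures, limits, and selection rule are all hypothetical.

```python
from itertools import product


def aggregate_sequence(services):
    """Aggregate QoS for services invoked one after another."""
    availability = 1.0
    for service in services:
        availability *= service["availability"]
    return {
        "time": sum(s["time"] for s in services),
        "cost": sum(s["cost"] for s in services),
        "availability": availability,
    }


def best_composition(candidates_per_task, max_cost, min_availability):
    """Exhaustively pick the fastest composition that meets the user's QoS limits."""
    best = None
    for combo in product(*candidates_per_task):
        qos = aggregate_sequence(combo)
        if qos["cost"] > max_cost or qos["availability"] < min_availability:
            continue
        if best is None or qos["time"] < best[1]["time"]:
            best = (combo, qos)
    return best


# Two tasks, each with two functionally similar candidate services.
candidates_per_task = [
    [
        {"name": "A", "time": 100, "cost": 5, "availability": 0.99},
        {"name": "B", "time": 50, "cost": 9, "availability": 0.95},
    ],
    [
        {"name": "C", "time": 80, "cost": 4, "availability": 0.98},
        {"name": "D", "time": 40, "cost": 8, "availability": 0.90},
    ],
]
chosen, chosen_qos = best_composition(candidates_per_task, max_cost=14, min_availability=0.9)
chosen_names = [service["name"] for service in chosen]
```

Exhaustive enumeration is only practical for a handful of tasks and candidates; it is used here because it makes the aggregation rule easy to inspect.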
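To address the problem of too many results returned by approximate queries over Web databases, an automatic ranking method for approximate query results is proposed. The method uses the KL distance (Kullback-Leibler distance), the PIR (probabilistic information retrieval) model, and the query history to construct a scoring function for ranking tuples. The scoring function evaluates a tuple's ranking score according to how well the attribute values specified by the query satisfy the original query, and how relevant the attribute values not specified by the query are to the user's preferences. Experiments show that the proposed ranking method satisfies users' needs and preferences well and executes efficiently.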
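How the paper combines the KL distance with the PIR model and the query history is not described in the abstract. The sketch below only shows the KL distance itself, computed between two smoothed distributions of a single attribute's values. The attribute values, the add-one smoothing, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(values, support, smoothing=1.0):
    """Smoothed relative frequencies of `values` over a fixed support."""
    counts = Counter(values)
    total = len(values) + smoothing * len(support)
    return {v: (counts[v] + smoothing) / total for v in support}


def kl_divergence(p, q):
    """KL(P || Q) for two distributions over the same support, with Q positive everywhere."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p if p[v] > 0)


# Hypothetical values of one attribute in the whole table and in a query result.
support = ["sedan", "suv", "truck"]
whole_table = ["sedan", "sedan", "suv", "truck", "suv", "sedan"]
query_result = ["suv", "suv", "suv", "truck"]

p = distribution(query_result, support)
q = distribution(whole_table, support)
divergence = kl_divergence(p, q)
```

A larger divergence indicates that the attribute's values in the query result are distributed differently from the table as a whole.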
Funding: This research is supported by the National Natural Science Foundation of China (61370212), the Research Fund for the Doctoral Program of Higher Education of China (20122304130002), the Natural Science Foundation of Heilongjiang Province (ZD 201102) and the Fundamental Research Fund for the Central Universities (HEUCFZ1213, HEUCF100601).