Funding: The National Natural Science Foundation of China (No. 60673087)
Abstract: Large-scale web search struggles to exploit multi-field web information because it appears in many different forms. To address this, a novel approach is proposed that automatically acquires features from web pages using a set of well-defined rules. The features describe the content of web pages from different aspects and can be used to improve ranking performance in web search. The acquired features have a unified form and contain less noise, so they can easily be used in web page relevance ranking. A dedicated specification for judging the relevance between user queries and the acquired features is also proposed. Experimental results show that the features acquired by the proposed approach, together with the feature relevance specification, can significantly improve relevance ranking performance for web search.
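
The abstract does not state the actual rules or the relevance specification, so the following is only an illustrative sketch of the general idea: rule-driven extraction of page fields into one unified form, followed by a query-to-feature relevance score. Everything in it is an assumption made for illustration, including the `RULES` table (title, first-level heading, and anchor text), the `(field, terms)` representation, and the term-overlap score in `relevance`; none of it is the authors' method.

```python
import re
from html.parser import HTMLParser

# Hypothetical rules (not from the paper): HTML tag -> feature field name.
RULES = {"title": "title", "h1": "heading", "a": "anchor"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FeatureExtractor(HTMLParser):
    """Collects text inside the tags named in RULES as (field, terms) pairs."""

    def __init__(self):
        super().__init__()
        self.features = []  # unified form: list of (field, list of terms)
        self._stack = []    # currently open tags that are covered by a rule

    def handle_starttag(self, tag, attrs):
        if tag in RULES:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        terms = tokenize(data)
        # Text outside any rule-covered tag is treated as noise and skipped.
        if self._stack and terms:
            self.features.append((RULES[self._stack[-1]], terms))


def extract_features(html):
    """Apply the rules to one page and return its features."""
    parser = FeatureExtractor()
    parser.feed(html)
    parser.close()
    return parser.features


def relevance(query, feature_terms):
    """Assumed score: fraction of distinct query terms found in the feature."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(feature_terms)) / len(query_terms)


page = (
    "<html><head><title>Web Search Ranking</title></head>"
    "<body><h1>Ranking features</h1><p>Body text.</p>"
    "<a href='/x'>search engine</a></body></html>"
)
features = extract_features(page)
scores = [(field, relevance("web search", terms)) for field, terms in features]
```

In a ranking setting, per-feature scores like these would typically be combined with other signals by the ranking function; how the paper combines them is not described in the abstract.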