Abstract
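With the growth of the Internet, Web-based information processing has drawn increasing attention and is a frontier topic of current research. This paper examines how, building on existing retrieval technology, the link information of Web pages can be used to automatically obtain higher-quality retrieval results, namely key resources. It proposes a new method that uses the structure, content, and link information of Web pages together: first, a page's structure information and content score are combined into its document score; then the page's link score is computed from the document scores of the pages it links to. Experiments show that the method reduces interference from useless links and performs considerably better than using link information alone.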
With the development of the Internet, Web-based information processing is becoming increasingly important and is one of the frontiers of current research. In this paper, we describe a new approach that builds on traditional retrieval technology and automatically derives high-quality retrieval results from the basic ones by using Web information (both structure information and link information): it calculates the document score of a Web page by combining the page's structure information and content score, then computes the link score of the page from the document scores of its out-linked pages. In the experiments, this approach reduced the influence of "noisy" links and achieved fairly good results.
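The two-step scoring described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract gives no formulas, so the linear blend with weight `alpha`, the use of a plain sum over out-links, and all function and parameter names here are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def document_score(content_score: float, structure_score: float,
                   alpha: float = 0.5) -> float:
    """Blend content and structure evidence into one document score.

    ASSUMPTION: a linear combination with weight `alpha`; the abstract
    does not say how the two signals are actually combined.
    """
    return alpha * content_score + (1.0 - alpha) * structure_score


def link_score(page: str, out_links: Dict[str, List[str]],
               doc_scores: Dict[str, float]) -> float:
    """Score a page by the document scores of the pages it links to.

    ASSUMPTION: a plain sum; link targets without a document score
    (for example, pages outside the result set) contribute 0.
    """
    return sum(doc_scores.get(target, 0.0)
               for target in out_links.get(page, []))


def rank_key_resources(content: Dict[str, float],
                       structure: Dict[str, float],
                       out_links: Dict[str, List[str]],
                       alpha: float = 0.5) -> List[str]:
    """Return pages ordered by descending link score."""
    doc_scores = {p: document_score(content[p], structure.get(p, 0.0), alpha)
                  for p in content}
    link_scores = {p: link_score(p, out_links, doc_scores) for p in content}
    return sorted(content, key=lambda p: link_scores[p], reverse=True)


# Hypothetical toy data: three pages from a basic retrieval run.
content = {"a": 0.2, "b": 0.8, "c": 0.6}
structure = {"a": 0.2, "b": 0.8, "c": 0.6}
out_links = {"a": ["b", "c"], "b": ["c"], "c": []}
ranking = rank_key_resources(content, structure, out_links)
```

In this toy data, page `a` has a weak document score of its own but links to the two strongest pages, so it is ranked first. That is the sense in which a link-rich page can act as a key resource, and weighting links by the target's document score is what keeps links to low-scoring pages from contributing much.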
Source
《计算机科学》
CSCD
PKU Core Journal (北大核心)
2004, No. 10, pp. 189-192 (4 pages)
Computer Science
Funding
National Natural Science Foundation of China (69935010, 60103014)
863 Program (2001AA114120, 2002AA142090)