Abstract
Because web pages vary widely in quality, ranking pages by quality based on the web link graph has become a vital component of modern search engines. This paper analyzes why large-scale sparse matrix-vector multiplication is inefficient when the implementation of a web ranking module is optimized. Taking the actual characteristics of the web link graph into account and drawing on the experience of other researchers, it then proposes several optimization strategies. These strategies are compared experimentally, and the one best suited to a practical system is selected by weighing both computational efficiency and storage requirements. The paper also presents a workaround for implementing the PageRank algorithm: removing rank sinks.
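The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: a sparse matrix-vector product as the core of PageRank, and sink removal before iteration. It assumes that "removing rank sinks" means repeatedly dropping pages with no out-links, and it uses a compressed sparse row (CSR) layout as one common storage choice, not necessarily the one the paper selects. All function names and the damping factor of 0.85 are illustrative assumptions.

```python
def remove_sinks(out_links):
    """Repeatedly drop pages with no out-links (assumed meaning of sink removal)."""
    graph = {u: set(vs) for u, vs in out_links.items()}
    # Make sure every link target is also present as a page.
    for targets in list(graph.values()):
        for v in targets:
            graph.setdefault(v, set())
    while True:
        sinks = {u for u, vs in graph.items() if not vs}
        if not sinks:
            return graph
        # Removing a sink can turn its predecessors into new sinks, so loop.
        graph = {u: vs - sinks for u, vs in graph.items() if u not in sinks}


def build_csr(graph):
    """Store the transposed transition matrix in CSR form.

    Row i lists the pages j that link to page i, weighted by 1 / outdegree(j).
    Expects a graph without sinks, so every out-degree is at least 1.
    """
    nodes = sorted(graph)
    index = {u: i for i, u in enumerate(nodes)}
    incoming = [[] for _ in nodes]
    for u, targets in graph.items():
        weight = 1.0 / len(targets)
        for v in targets:
            incoming[index[v]].append((index[u], weight))
    row_ptr, col_idx, values = [0], [], []
    for row in incoming:
        for j, weight in sorted(row):
            col_idx.append(j)
            values.append(weight)
        row_ptr.append(len(col_idx))
    return nodes, row_ptr, col_idx, values


def spmv(row_ptr, col_idx, values, x):
    """Sparse matrix-vector product y = A x for a CSR matrix A."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_idx[k]]
        y[i] = total
    return y


def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on the sink-free graph; returns {page: rank}."""
    graph = remove_sinks(out_links)
    nodes, row_ptr, col_idx, values = build_csr(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        product = spmv(row_ptr, col_idx, values, rank)
        new_rank = [(1.0 - damping) / n + damping * p for p in product]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return dict(zip(nodes, rank))


# Hypothetical toy graph: page "d" has no out-links and is removed as a sink.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a", "d"], "d": []}
ranks = pagerank(toy_links)
```

In this sketch the removed sinks simply receive no score. Because every remaining page has at least one out-link, the ranks keep summing to 1 without any special correction inside the iteration.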
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal (北大核心)
2007, No. 7, pp. 1632-1635 (4 pages)
Keywords
search engine
web page ranking
web link graph
sparse matrix
rank sink