Abstract
To address the shortcomings of the PageRank algorithm, this paper analyzes the structure of web links and, on that basis, proposes an improved PageRank algorithm based on topic link similarity. The core idea is to use the topic relevance between the current page and each page that links to it as the transfer weight, replacing the uniform average that PageRank uses as the weight. The proposed PageRank-I algorithm represents the links between pages as link vectors and uses the cosine similarity of these vectors to describe topic relevance, so no additional text information has to be processed, which reduces the load on the system. Experimental results confirm that PageRank-I resolves the topic drift problem of PageRank without adding extra system load and without increasing time complexity.
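The weighting scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several details here are assumptions rather than the paper's method: the link vector of a page is taken to be its out-link row followed by its in-link column, the similarity weights are normalized over each page's out-links, pages whose similarities are all zero fall back to the uniform split of standard PageRank, and the damping factor is 0.85. The function names `cosine` and `similarity_weighted_pagerank` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_weighted_pagerank(adj, damping=0.85, tol=1e-10, max_iter=200):
    """PageRank where each link i -> j carries a cosine-similarity weight.

    adj[i][j] is 1 if page i links to page j, else 0.
    """
    n = len(adj)
    # Assumption: a page's link vector is its out-link row plus its in-link column.
    vectors = [list(adj[i]) + [adj[j][i] for j in range(n)] for i in range(n)]

    # Build per-page transition probabilities from normalized similarities.
    transition = []
    for i in range(n):
        targets = [j for j in range(n) if adj[i][j]]
        weights = {j: cosine(vectors[i], vectors[j]) for j in targets}
        total = sum(weights.values())
        if total == 0:
            # Assumption: fall back to the uniform split of standard PageRank.
            weights = {j: 1.0 for j in targets}
            total = float(len(targets))
        transition.append({j: weights[j] / total for j in targets})

    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[i] for i in range(n) if not transition[i])
        new_rank = [(1 - damping) / n + damping * dangling / n] * n
        for i in range(n):
            for j, p in transition[i].items():
                new_rank[j] += damping * rank[i] * p
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical 4-page graph, used only to exercise the sketch.
    graph = [
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 1],
        [0, 0, 1, 0],
    ]
    for page, score in enumerate(similarity_weighted_pagerank(graph)):
        print(page, round(score, 4))
```

Because the weights are computed from the adjacency matrix alone, the sketch needs no page text, which matches the abstract's point about avoiding extra text processing. Computing all pairwise similarities along existing links is a one-time preprocessing step before the usual power iteration.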
Author
Yang Yongdan (杨泳丹), Information Centre, Yunnan Power Grid Co., Ltd., Kunming, Yunnan 650217, China
Source
Bulletin of Science and Technology (《科技通报》)
2019, No. 7, pp. 178-181, 185 (5 pages in total)