Research on Term Weight Calculation Based on the PageRank Algorithm
Abstract Most term-weighting methods treat each keyword as an independent unit and ignore the correlations among keywords. Starting from these correlations, this paper proposes that a keyword's weight receives contributions from other keywords. Taking the iterative computation of web page weights in the PageRank algorithm as its theoretical basis, it presents an iterative weight calculation model based on mutual voting among keywords. Each keyword is abstracted as a node in the model, and its initial weight is obtained with the classic TF-IDF method. The improved method is applied to the Reuters21578 Top10 and 20Newsgroup datasets. The experimental results show that the new algorithm differentiates keyword weights fairly clearly, and so distinguishes how important each keyword is within a text.
Author Wang Qingfu
Affiliation Liaoning Administration College
Source Network New Media Technology (《网络新媒体技术》), 2015, No. 3, pp. 37-41 (5 pages)
Keywords term weight, voting model, iterative convergence, weight differentiation, feature distinction
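
The abstract describes the voting model only in outline, so the following is a minimal sketch under stated assumptions rather than the paper's actual formulation. It assumes that two terms are linked when they co-occur within a small sliding window, that each term splits its current weight evenly among its neighbors, and that a damping factor of 0.85 mixes the votes with the normalized TF-IDF starting weights. The function names, the window size, and the TF-IDF values in the usage lines are all hypothetical.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link terms that occur within `window` positions of each other.

    The co-occurrence rule is an assumption; the abstract does not say
    how links between keywords are defined.
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def vote_weights(initial, neighbors, damping=0.85, tol=1e-6, max_iter=100):
    """Iterate PageRank-style votes among terms.

    `initial` maps each term to its starting weight (TF-IDF in the paper).
    Every term that appears in `neighbors` must also appear in `initial`.
    """
    total = sum(initial.values())
    base = {t: w / total for t, w in initial.items()}  # normalized prior
    weights = dict(base)
    for _ in range(max_iter):
        updated = {}
        for term in base:
            # Each neighbor shares its current weight evenly among its links.
            votes = sum(
                weights[n] / len(neighbors[n]) for n in neighbors.get(term, ())
            )
            updated[term] = (1 - damping) * base[term] + damping * votes
        delta = sum(abs(updated[t] - weights[t]) for t in base)
        weights = updated
        if delta < tol:  # stop once the weights have stabilized
            break
    return weights


# Hypothetical usage: a toy token sequence and made-up TF-IDF scores.
tokens = ["pagerank", "term", "weight", "term", "vote", "weight"]
tfidf = {"pagerank": 0.9, "term": 0.4, "weight": 0.5, "vote": 0.7}
graph = build_cooccurrence_graph(tokens, window=2)
weights = vote_weights(tfidf, graph)
for term, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {w:.4f}")
```

Mixing the votes with the TF-IDF prior, instead of using a uniform teleport term as in standard PageRank, is one possible way to keep the starting weights influential throughout the iteration. Whether the paper does the same cannot be determined from the abstract.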