Abstract
Traditional keyword extraction methods are generally based on TF-IDF, which is time-consuming and gives unsatisfactory results. This paper proposes using information gain to compute the weight of each word in a text and, on that basis, combining these weights with an improved PageRank to extract the text's keywords. Experimental results show that this method clearly outperforms traditional methods.
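The abstract does not say how the information gain is computed or what the PageRank improvement is, so the following is only a minimal sketch of one way the two ideas could be combined, not the authors' implementation. It assumes a small labeled corpus for the information-gain step, an undirected co-occurrence graph over the words of the target text, and a PageRank whose teleport distribution is proportional to each word's information gain. All function and parameter names (`information_gain`, `rank_keywords`, `window`, `damping`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (in bits) of a collection of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(term, docs, labels):
    """Information gain of the binary feature 'term occurs in the document'.

    Assumption: docs is a list of sets of words, labels the class of each doc.
    """
    base = entropy(Counter(labels).values())
    present = [lab for doc, lab in zip(docs, labels) if term in doc]
    absent = [lab for doc, lab in zip(docs, labels) if term not in doc]
    conditional = 0.0
    for part in (present, absent):
        if part:
            conditional += len(part) / len(docs) * entropy(Counter(part).values())
    return base - conditional


def rank_keywords(tokens, weights, window=2, damping=0.85, iters=50):
    """Rank words by PageRank on a co-occurrence graph.

    Assumption: the 'improvement' is modeled here as a teleport step biased
    toward words with a higher weight (e.g. their information gain).
    """
    neighbours = defaultdict(set)
    for i, word in enumerate(tokens):
        # Link each word to the words that follow it inside the window.
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    vocab = sorted(set(tokens))
    eps = 1e-9  # keeps the prior well defined when every weight is zero
    total = sum(weights.get(w, 0.0) + eps for w in vocab)
    prior = {w: (weights.get(w, 0.0) + eps) / total for w in vocab}

    score = dict(prior)
    for _ in range(iters):
        updated = {}
        for w in vocab:
            incoming = sum(score[v] / len(neighbours[v]) for v in neighbours[w])
            updated[w] = (1 - damping) * prior[w] + damping * incoming
        score = updated
    return sorted(vocab, key=lambda w: score[w], reverse=True)
```

In use, one would compute `weights = {w: information_gain(w, docs, labels) for w in set(tokens)}` from the labeled corpus and take the first few entries of `rank_keywords(tokens, weights)` as the keywords. The biased teleport step is only one plausible reading of "improved PageRank"; edge weighting would be another.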
Source
《计算机应用与软件》
CSCD
Peking University Core Journals (北大核心)
2012, No. 9, pp. 75-76, 86 (3 pages in total)
Computer Applications and Software
Funding
National Natural Science Foundation of China (Grant Nos. 60673186, 60971088)