Abstract
After analyzing the characteristics and shortcomings of traditional keyword extraction methods, this article proposes a Chinese text keyword extraction method based on multi-feature fusion. The method fuses several features of the words in a Chinese text, namely term frequency, word association, part of speech, and position, and thereby avoids the bias that traditional keyword extraction methods introduce. Experimental results show that, compared with traditional methods, the proposed method clearly improves the average recall of keyword extraction on each of the test sets used.
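To make the idea of feature fusion concrete, the following is a minimal sketch in Python. It is not the authors' formula, which the abstract does not give. It assumes the four features are each normalized to [0, 1] and combined as a weighted sum. The weights, the part-of-speech scores, the co-occurrence definition of association, and the function name `score_keywords` are all hypothetical choices made for illustration. Input is assumed to be already segmented and tagged by an external tool.

```python
from collections import Counter

# Hypothetical weights; the abstract does not state the actual fusion formula.
WEIGHTS = {"freq": 0.4, "assoc": 0.2, "pos": 0.2, "position": 0.2}
# Hypothetical part-of-speech scores (nouns favoured as keyword candidates).
POS_SCORE = {"n": 1.0, "v": 0.5, "a": 0.3}


def score_keywords(sentences, title_words=frozenset(), top_k=3):
    """Rank words by a weighted sum of four normalized features.

    sentences: list of sentences, each a list of (word, pos_tag) pairs.
    title_words: words that appear in the title (used for the position feature).
    """
    words = [w for s in sentences for w, _ in s]
    if not words:
        return []

    # Feature 1: term frequency, normalized by the most frequent word.
    tf = Counter(words)
    max_tf = max(tf.values())

    # Feature 2 (assumed definition): association as the number of distinct
    # words that co-occur with the word in at least one sentence.
    neighbours = {w: set() for w in tf}
    for s in sentences:
        uniq = {w for w, _ in s}
        for w in uniq:
            neighbours[w] |= uniq - {w}
    max_assoc = max(len(n) for n in neighbours.values()) or 1

    # Feature 3: part of speech, taken from the word's first occurrence.
    pos_of = {}
    for s in sentences:
        for w, p in s:
            pos_of.setdefault(w, p)

    # Feature 4 (assumed definition): title words score 1.0,
    # first-sentence words 0.5, everything else 0.0.
    first_sentence = {w for w, _ in sentences[0]}

    scores = {}
    for w in tf:
        if w in title_words:
            position = 1.0
        elif w in first_sentence:
            position = 0.5
        else:
            position = 0.0
        scores[w] = (
            WEIGHTS["freq"] * tf[w] / max_tf
            + WEIGHTS["assoc"] * len(neighbours[w]) / max_assoc
            + WEIGHTS["pos"] * POS_SCORE.get(pos_of[w], 0.0)
            + WEIGHTS["position"] * position
        )

    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]
```

A weighted sum is only one possible way to fuse the features. It is used here because it shows how a word that is moderately frequent but well placed and well connected can outrank a word that is merely frequent, which is the kind of bias the abstract attributes to single-feature methods.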
Source
《情报理论与实践》
CSSCI
PKU Core (Peking University Core Journals)
2013, Issue 10, pp. 105-108 (4 pages)
Information Studies: Theory & Application
Keywords
Chinese document
feature fusion
keyword