Research on the Kazakh Network Hot Keywords Extraction Method (cited by: 1)
Abstract: Research on hot keyword extraction for minority languages is scarce, and existing algorithms fall short in accuracy and efficiency. To address this, a method for extracting hot keywords from Kazakh web text is proposed. Keywords are first extracted from the preprocessed text with a TF-IDF algorithm improved by multi-factor weighting, then combined according to their position and frequency information to obtain a set of candidate hot keywords. Next, the TF-PDF algorithm is combined with the notion of media attention to construct a keyword heat scoring formula, KHD (Keywords Hot Degree), which is used to extract the hot keywords. Experimental results show that the method is feasible and effective, and that both extraction accuracy and efficiency improve significantly.
Source: Computer Applications and Software (《计算机应用与软件》), 2017, No. 1, pp. 45-49, 67 (6 pages)
Funding: National Natural Science Foundation of China (61063025, 61363062)
Keywords: Kazakh; term frequency; document frequency; media attention; hot keywords
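
The abstract names TF-PDF as one ingredient of the KHD score but does not give the KHD formula itself. As a rough illustration only, the sketch below implements the commonly cited form of TF-PDF, in which a term's weight is the sum over channels of its normalized frequency in that channel multiplied by the exponential of the share of that channel's documents containing it. Treating each media source as a channel, the function name `tf_pdf`, and the input layout are assumptions made for this example; it is not the paper's KHD formula, and it does not cover Kazakh preprocessing or the multi-factor TF-IDF step.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Standard TF-PDF weights (illustrative; not the paper's KHD score).

    channels: dict mapping a channel (media source) name to a list of
    documents, where each document is a list of already-tokenized terms.
    Returns a dict mapping each term to its summed weight over channels.
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue  # an empty channel contributes nothing
        term_freq = Counter()  # occurrences of each term in this channel
        doc_freq = Counter()   # number of documents containing each term
        for doc in docs:
            term_freq.update(doc)
            doc_freq.update(set(doc))
        # Euclidean norm of the channel's term-frequency vector
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue  # channel holds only empty documents
        for term, freq in term_freq.items():
            share = doc_freq[term] / len(docs)
            weights[term] += (freq / norm) * math.exp(share)
    return dict(weights)


# Hypothetical toy data: two sources, tokens stand in for Kazakh words.
sample = {
    "source_a": [["x", "y"], ["x"]],
    "source_b": [["x"], ["z"]],
}
scores = tf_pdf(sample)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because the exponential rewards terms that appear in a large share of a channel's documents, a term covered widely by several sources ranks above one that is frequent in a single document, which is the intuition behind pairing TF-PDF with media attention.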