Automatic Extraction Method of Hot Words Based on Agricultural Network Information Classification (cited by: 6)
Abstract Hot word extraction is important for monitoring and analyzing agricultural public opinion. Network information, including agricultural network information, is growing rapidly, and although hot word extraction has been studied before, existing methods remain poorly targeted and cannot meet the personalized needs of user groups in different agricultural industries. To address this, a method for automatically extracting hot words based on agricultural network information classification is proposed. First, the text corpus is classified with a multi-label classification algorithm, and a separate corpus is built for each category. Second, hot word candidates are extracted for each category using a method based on information entropy. Third, the heat of each candidate is calculated using a method based on variation over time. Finally, the candidates are ranked by heat, and the hot words are taken from the ranking. In an experiment on 15,354 texts collected from agricultural websites, hot words for a specified time period were obtained automatically, and the extraction accuracy exceeded 0.9. The results indicate that the method extracts agricultural hot words with high quality and can help different agricultural user groups discover and analyze industry hot spots.
Authors 段青玲 (DUAN Qingling); 张璐 (ZHANG Lu); 刘怡然 (LIU Yiran); 王沙沙 (WANG Shasha) (College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China; Agricultural Information Technology Limited Liability Company of Beijing, Beijing 100081, China)
Source Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2018, No. 7, pp. 160-167 (8 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
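Funding National High Technology Research and Development Program of China (863 Program) (2013AA102306); National Science and Technology Support Program of the 12th Five-Year Plan (2012BAD35B06)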
Keywords agricultural network information; agricultural public opinion monitoring; hot word; multi-label classification; heat calculation
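The abstract names the pipeline stages but gives no formulas, so the following is only a minimal sketch of the last two stages under stated assumptions, not the authors' implementation. It assumes the texts of one category have already been classified and segmented into token lists. As a stand-in for the "method based on information entropy" it uses left and right neighbor entropy, and as a stand-in for the "method based on variation over time" it uses the change in document frequency between two time windows. All function names, parameters, and thresholds here are hypothetical.

```python
import math
from collections import Counter, defaultdict


def neighbor_entropy(neighbors):
    """Shannon entropy (in bits) of a Counter of neighboring tokens."""
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in neighbors.values())


def candidate_words(docs, min_count=2, min_entropy=0.5):
    """Assumed candidate filter: keep tokens that occur at least min_count
    times and whose left and right contexts both have entropy of at least
    min_entropy (varied contexts suggest a free-standing word)."""
    counts = Counter()
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for tokens in docs:
        # Pad so that first and last tokens also have a neighbor on each side.
        padded = ["<s>"] + list(tokens) + ["</s>"]
        for i in range(1, len(padded) - 1):
            word = padded[i]
            counts[word] += 1
            left[word][padded[i - 1]] += 1
            right[word][padded[i + 1]] += 1
    return {
        w for w, c in counts.items()
        if c >= min_count
        and min(neighbor_entropy(left[w]), neighbor_entropy(right[w])) >= min_entropy
    }


def heat(word, current_docs, past_docs):
    """Assumed heat score: growth in the share of documents containing the
    word from the past window to the current window."""
    def doc_freq(docs):
        return sum(1 for tokens in docs if word in tokens) / max(len(docs), 1)
    return doc_freq(current_docs) - doc_freq(past_docs)


def rank_hot_words(current_docs, past_docs, top_k=10, **filter_args):
    """Rank the current window's candidates by heat, highest first."""
    candidates = candidate_words(current_docs, **filter_args)
    scored = sorted(
        ((heat(w, current_docs, past_docs), w) for w in candidates),
        reverse=True,
    )
    return [w for _, w in scored[:top_k]]
```

Running `rank_hot_words` once per category corpus would give each user group its own hot word list, which is the point of classifying before extracting. The thresholds would need tuning on real data.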