Abstract
With the arrival of the big data era, data volume is growing geometrically. Text is the kind of information people encounter most often, and key information, as a highly condensed summary of a text's topic, gives users a quick way to grasp that topic. How to mine key information from text quickly and effectively has therefore become a central research problem. This paper takes the Benxi municipal government work report as its research object and abstracts the text information. The TF-IDF algorithm is used to automatically batch-extract phrases that occur frequently in the text, the occurrences of these frequent phrases are counted, and the key information is then extracted. The extraction results show the overall direction of the government's development of Benxi and indicate that the government is actively responding to national calls and steadily advancing its work.
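The abstract names TF-IDF and phrase-frequency counting but gives no formula or code. The following is a minimal sketch of one common formulation, not the authors' implementation. It assumes each document has already been segmented into tokens (Chinese text would need a word segmenter first), treats a "phrase" as a contiguous n-gram, and uses the unsmoothed weight tf × log(N / df). The function names `ngrams` and `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token phrases as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs, n=2):
    """Count and score every n-gram phrase in every document.

    docs: list of token lists, one per document (already segmented).
    Returns (counts, scores): per-document phrase counts and
    per-document TF-IDF scores, both keyed by phrase.
    """
    # Raw phrase frequencies per document.
    counts = [Counter(ngrams(doc, n)) for doc in docs]
    # Document frequency: in how many documents each phrase appears.
    df = Counter(phrase for c in counts for phrase in c)
    scores = []
    for c in counts:
        total = sum(c.values())
        # Unsmoothed TF-IDF: relative frequency times log(N / df).
        scores.append({
            p: (k / total) * math.log(len(docs) / df[p])
            for p, k in c.items()
        })
    return counts, scores


if __name__ == "__main__":
    docs = [["a", "b", "a", "b"], ["a", "b", "c"]]
    counts, scores = tf_idf(docs, n=2)
    # Phrases sorted by score, highest first, for the first document.
    print(sorted(scores[0].items(), key=lambda kv: -kv[1]))
```

With this unsmoothed weight, a phrase that appears in every document scores zero, so raw counts and TF-IDF scores answer different questions: the counts show what is frequent, while the scores show what is distinctive to one document.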
Authors
于韬
王洪岩
YU Tao, WANG Hong-yan (Liaoning Institute of Science and Technology, Benxi, Liaoning 117004, China)
Source
《科技视界》
2018, No. 16, pp. 117-118 (2 pages)
Science & Technology Vision
Funding
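Intelligent Recommendation System Based on a Literature Knowledge Graph (201811430044)
Liaoning Provincial Department of Education Scientific and Technological Research Youth Project (L2017lkyqn-01)
Liaoning Institute of Science and Technology Youth Fund (Qn201603)
Liaoning Institute of Science and Technology Soft Science Project for Serving Local Innovation and Development (20162rkx-06)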
Keywords
Keyword extraction
Key information extraction
TF-IDF algorithm
Frequent phrases
Word frequency statistics