Abstract
[Purpose/significance] This paper examines, at the theoretical level, data cleaning in the identification of domain research hotspots by keyword co-occurrence, with the aim of helping researchers identify those hotspots accurately. [Method/process] Based on a literature review, it defines data cleaning and its objects, analyzes the causes and effects of dirty data, and then proposes data cleaning steps and a cleaning scheme, whose effect and feasibility are verified through an empirical study. [Result/conclusion] The results show that the proposed data cleaning scheme improves the accuracy of research hotspot identification, which demonstrates its feasibility.
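As a rough illustration of why keyword cleaning matters for co-occurrence counts, the following Python sketch normalizes author keywords before counting pairs. It is not the scheme proposed in the paper: the synonym table, the function names `clean_keywords` and `cooccurrence_counts`, and the sample records are all hypothetical, and the cleaning rules shown (case folding, whitespace collapsing, synonym merging, de-duplication) are only assumed examples of what such a step might include.

```python
from collections import Counter
from itertools import combinations

# Hypothetical synonym table: maps variant forms to one canonical keyword.
SYNONYMS = {
    "data cleansing": "data cleaning",
    "co-word analysis": "keyword co-occurrence",
}


def clean_keywords(keywords):
    """Normalize case and whitespace, merge synonyms, drop blanks and duplicates."""
    cleaned = []
    for kw in keywords:
        kw = " ".join(kw.lower().split())  # fold case, collapse whitespace
        kw = SYNONYMS.get(kw, kw)          # map variants to the canonical form
        if kw and kw not in cleaned:
            cleaned.append(kw)
    return cleaned


def cooccurrence_counts(records):
    """Count how often each unordered keyword pair appears in the same record."""
    counts = Counter()
    for keywords in records:
        # Sorting makes (a, b) and (b, a) land on the same key.
        for pair in combinations(sorted(clean_keywords(keywords)), 2):
            counts[pair] += 1
    return counts


# Made-up sample records, each a list of author keywords for one article.
records = [
    ["Data Cleaning", "keyword co-occurrence"],
    ["data cleansing ", "Keyword  Co-occurrence", "data mining"],
]
counts = cooccurrence_counts(records)
```

Without the cleaning step, the two sample records would contribute to different keyword pairs, because the raw strings differ in case, spacing, and wording; after cleaning, they reinforce the same pair.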
Source
《图书情报工作》
CSSCI
Peking University Core Journal
2017, No. 7, pp. 111-117 (7 pages)
Library and Information Service
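Funding
One of the research outcomes of the National Natural Science Foundation of China General Program project "Construction of a Domain Multidimensional Knowledge Base Driven by Embedded Knowledge Services" (Grant No. 71573102)
and of the Bengbu Medical College Humanities and Social Sciences Fund Key Project "Knowledge Mapping and Analysis of the Pharmaceutical Patent Research Field" (Grant No. BYKY16110skZD)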
Keywords
keyword co-occurrence
research hotspots
research area analysis
data cleaning
data mining