
Analysis of Hot Topics and Evolution of Shared Scientific Research Data in China: From the Perspective of Topic Models (cited by: 6)
Abstract: From the perspective of topic models, this paper uses text mining to analyze the hot topics and the topic evolution of Chinese research literature on shared scientific research data over the past decade. The aim is to make research data sharing during public emergencies more rational, to maximize the value of research data, and to provide a reference for future domestic research on sharing scientific research data. Articles on shared scientific research data published between 2010 and 2019 were selected from CNKI as the research data set. After text preprocessing, an LDA topic model was applied to the abstracts to identify the topics they contain; the strength of each topic in each time period was measured, hot topics were identified, and the topics were analyzed according to how their strength changed. The literature of the past decade covers 32 research topics, 14 of which are hot topics. Three topics ("data publishing", "data sharing capabilities", and "proactive domestic promotion") show an upward trend, while 11 topics show a downward trend. The standardization of domestic research data sharing and the frequency of international exchange have both increased, and university libraries have played a major role in promoting this. However, objective technical obstacles remain in the sharing process, changes in the overall social environment call for deeper sharing of research data, and willingness to share differs greatly between data subjects. How to bridge these differences, overcome the obstacles, and align research data sharing with changes in the social environment should be the focus of future research.
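The abstract does not give the exact formulas used for topic strength, hot-topic selection, or trend detection. The following is a minimal sketch under stated assumptions: the strength of a topic in a year is taken to be the mean document-topic probability over that year's documents, a topic is called "hot" when its average strength exceeds the uniform level 1/K, and its trend is the sign of a least-squares slope over the years. The function names and the toy data are hypothetical, and the document-topic vectors are assumed to come from an already fitted LDA model.

```python
from collections import defaultdict


def topic_strength_by_year(doc_topics, years):
    """Return {year: [mean probability of topic k over that year's documents]}."""
    sums = {}
    counts = defaultdict(int)
    for theta, year in zip(doc_topics, years):
        if year not in sums:
            sums[year] = [0.0] * len(theta)
        for k, p in enumerate(theta):
            sums[year][k] += p
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def classify_topics(strength, eps=1e-9):
    """Label each topic as hot or not, and as rising, falling, or flat."""
    years = list(strength)
    n_topics = len(strength[years[0]])
    threshold = 1.0 / n_topics  # assumption: "hot" means above the uniform level
    report = []
    for k in range(n_topics):
        series = [strength[y][k] for y in years]
        mean_strength = sum(series) / len(series)
        slope = ols_slope(years, series)
        if slope > eps:
            trend = "rising"
        elif slope < -eps:
            trend = "falling"
        else:
            trend = "flat"
        report.append({"topic": k, "mean": mean_strength,
                       "hot": mean_strength > threshold, "trend": trend})
    return report


# Toy document-topic distributions (each row sums to 1) with publication years.
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.4, 0.4, 0.2],
    [0.2, 0.5, 0.3],
]
years = [2010, 2010, 2011, 2012]

strength = topic_strength_by_year(doc_topics, years)
report = classify_topics(strength)
for row in report:
    print(row)
```

In this toy data, topic 0 is above the uniform level but declining, which mirrors the abstract's observation that a hot topic can still show a downward trend. A different threshold or trend test would change which topics are labeled hot or rising.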
Author: Zhang Bin (张斌)
Source: Research on Library Science (《图书馆学研究》), CSSCI, Peking University Core Journal, 2020, No. 14, pp. 11-18 (8 pages)
Keywords: text mining; scientific research data; topic evolution; hot topics; sharing; LDA topic model
