
Research of Data Cleaning Method for Visualization

Original title: 用于文献可视化分析的数据清洗方法研究 (cited by: 5)
Abstract: Visualization analysis is an important method in bibliometric analysis. Data cleaning is critical to it, yet current visualization software generally offers no data cleaning function, and the tools that do are limited: they cannot clean records quickly in bulk, so variants have to be handled one by one. Treating improved visualization analysis as an entry point for better information mining, this paper takes Web of Science (WOS) bibliographic records in the graphene field as its case. It uses the clustering function of OpenRefine to cluster the key fields of the records, produces standardized term files that the visualization software can read, and then carries out the visualization, verifying the advantage of this approach for visualization analysis in bibliographic information mining. The results show that OpenRefine clustering processes key bibliographic information efficiently and in bulk: duplicate nodes in the institutional collaboration network fell by 9%, and in the keyword co-occurrence network the frequency of a keyword rose by as many as 742 occurrences, markedly reducing duplicate nodes. The method improves the accuracy of visualization analysis and the efficiency of information mining.
Authors: FANG Xiao-li (方小利); LIU Xia (刘霞) (Wuhan University Library, Wuhan 430072, China)
Affiliation: Wuhan University Library
Source: Journal of Academic Library and Information Science (《大学图书情报学刊》), 2021, No. 6, pp. 56-60 (5 pages)
Funding: "Wuhan University General Education 3.0" project (Data Literacy and Data Utilization), September 2019 to June 2022 (武大本函 [2018] No. 158).
Keywords: big data; CiteSpace; VOSviewer; OpenRefine; visualization; information mining; data cleaning
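
The clustering step described in the abstract can be illustrated with a minimal Python sketch. It assumes a key-collision "fingerprint" approach similar to the one OpenRefine documents; the abstract does not say which OpenRefine clustering method or settings the authors used. The function names, the sample terms and counts, and the tab-separated "label / replace by" output (modeled on a VOSviewer-style thesaurus file) are illustrative assumptions, not the authors' data or workflow.

```python
import re
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key: ASCII-fold, lowercase, drop punctuation,
    then sort and de-duplicate the whitespace-separated tokens."""
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = re.sub(r"[^\w\s]", "", text.strip().lower())
    return " ".join(sorted(set(text.split())))


def cluster_terms(term_counts: Counter) -> dict:
    """Map each variant to the most frequent spelling sharing its key."""
    groups = defaultdict(list)
    for term in term_counts:
        groups[fingerprint(term)].append(term)

    mapping = {}
    for variants in groups.values():
        if len(variants) < 2:
            continue  # nothing to merge
        # Most frequent variant wins; ties are broken alphabetically.
        canonical = max(variants, key=lambda t: (term_counts[t], t))
        for variant in variants:
            if variant != canonical:
                mapping[variant] = canonical
    return mapping


def merge_counts(term_counts: Counter, mapping: dict) -> Counter:
    """Re-count terms after replacing variants with canonical forms."""
    merged = Counter()
    for term, count in term_counts.items():
        merged[mapping.get(term, term)] += count
    return merged


def thesaurus_lines(mapping: dict) -> list:
    """Render the mapping as tab-separated 'label / replace by' rows
    (assumed layout; check the target tool's manual before use)."""
    rows = [f"{variant}\t{canonical}" for variant, canonical in sorted(mapping.items())]
    return ["label\treplace by"] + rows


# Hypothetical keyword and institution counts, for illustration only.
example_counts = Counter({
    "graphene oxide": 120,
    "Graphene Oxide": 30,
    "oxide, graphene": 2,
    "Wuhan Univ": 8,
    "wuhan univ.": 3,
    "supercapacitor": 40,
})

example_mapping = cluster_terms(example_counts)
print("\n".join(thesaurus_lines(example_mapping)))
print(merge_counts(example_counts, example_mapping))
```

Merging variants this way removes duplicate nodes and raises the count of the surviving term, which is the kind of effect the abstract reports. Fingerprint keys only catch differences in case, punctuation, and word order, so spelling variants and abbreviations would still need a fuzzier method or manual review.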
