Abstract
Research administration offices that compile statistics on SCI-indexed papers rely on manual judgment to identify authors, which is laborious and error-prone. To address this, we analyze the bibliographic characteristics of SCI-indexed papers and design a method that identifies authors automatically from their names and institutional affiliations. We run experiments on the SCI papers published by Ocean University of China from 2012 to 2016 and analyze the results. To handle the homonym problem that arises during identification, where different people share the same name, we improve the method using fuzzy string matching and the co-authorship relations among authors. We then evaluate the improved method by comparing identification results before and after the improvement. The experimental results show that the improved method performs well and achieves higher identification accuracy.
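The two ideas named in the abstract, fuzzy matching of name strings and disambiguation of homonyms through co-authorship, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the roster layout, the `known_coauthors` field, the 0.85 threshold, the family-name-first assumption, and all sample names are hypothetical choices made only for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and keep letters only, so 'HOU Hai-dong' becomes 'houhaidong'."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def name_similarity(a, b):
    """Fuzzy similarity in [0, 1] between two normalized name strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def identify_author(byline_name, coauthors, staff, threshold=0.85):
    """Return the id of the staff member matching a byline name, or None.

    Step 1: fuzzy-match the byline name against the roster.
    Step 2: if several staff share the name, prefer the one whose known
            co-authors overlap most with the paper's other authors.
    """
    candidates = [p for p in staff
                  if name_similarity(byline_name, p["name"]) >= threshold]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]["id"]

    paper_coauthors = {normalize(c) for c in coauthors}

    def overlap(person):
        known = {normalize(c) for c in person["known_coauthors"]}
        return len(paper_coauthors & known)

    ranked = sorted(candidates, key=overlap, reverse=True)
    # A tie at the top means co-authorship cannot separate the homonyms.
    if overlap(ranked[0]) == overlap(ranked[1]):
        return None
    return ranked[0]["id"]


# Hypothetical roster: two staff members share the name "Zhang Wei".
staff = [
    {"id": "A1", "name": "Zhang Wei", "known_coauthors": ["Li Na", "Wang Fang"]},
    {"id": "A2", "name": "Zhang Wei", "known_coauthors": ["Chen Jie"]},
    {"id": "A3", "name": "Liu Yang", "known_coauthors": []},
]

print(identify_author("Zhang, Wei", ["Chen, Jie", "Zhao, Lei"], staff))  # → A2
```

Returning `None` on a tie leaves the ambiguous record for manual review instead of guessing, which seems the safer default when the output feeds official statistics.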
Authors
侯海东
洪腾龙
徐建良
HOU Hai-dong, HONG Teng-long, XU Jian-liang (College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China)
Source
《软件导刊》
2018, No. 8, pp. 57-60 (4 pages)
Software Guide
Keywords
author recognition
name disambiguation
co-authorship network
fuzzy matching