
Method to Remove Ambiguity of Names of Known Authors (Chinese title: 一种针对已知作者的姓名消歧方法). Cited by: 5
Abstract: In foreign-language journal databases, a single abbreviated name very often stands for several different authors, which seriously degrades the precision of author search. This study combines rules with algorithms: it uses rules to label training data for a classification algorithm, so that a supervised algorithm can be applied under unsupervised conditions, enabling precise author retrieval. The approach suits name disambiguation problems in which the author's identity is already known, such as publication verification. Compared with general-purpose disambiguation methods, it combines the advantage of unsupervised algorithms (no manual annotation is required) with the advantages of supervised algorithms (high efficiency and easy mapping to entities). Results in practice show that the method achieves relatively high accuracy.
Author: Fan Wuyou (范午攸), Shanghai Jiao Tong University Library
Source: Library Journal (《图书馆杂志》), indexed in CSSCI and the Peking University Core Journals list, 2018, No. 12, pp. 56-63 (8 pages)
Keywords: author name disambiguation; data annotation; classification algorithm; Naive Bayes
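The idea summarized in the abstract (rules label the training data, then a Naive Bayes classifier handles the records the rules cannot decide) can be illustrated with a minimal sketch. This is not the author's implementation: this record does not state the paper's actual rules or features, so the two rules below (an institution match and a career-start year), the `KNOWN` profile, the record fields, and all sample data are hypothetical and chosen only to show the flow.

```python
import math
from collections import Counter

# Hypothetical profile of the known author (values invented for illustration).
KNOWN = {"institution": "shanghai jiao tong", "career_start": 2005}

# Hypothetical records that all share the same abbreviated author name.
RECORDS = [
    {"affiliation": "Shanghai Jiao Tong University", "year": 2015,
     "coauthors": ["li_m", "zhang_q"], "keywords": ["robotics", "control"]},
    {"affiliation": "Shanghai Jiao Tong Univ, Dept Automation", "year": 2017,
     "coauthors": ["li_m"], "keywords": ["robotics", "planning"]},
    {"affiliation": "Univ X", "year": 1998,
     "coauthors": ["smith_j"], "keywords": ["oncology", "genetics"]},
    {"affiliation": "Univ Y", "year": 2001,
     "coauthors": ["smith_j", "brown_k"], "keywords": ["oncology"]},
    {"affiliation": "", "year": 2016,
     "coauthors": ["zhang_q"], "keywords": ["control", "robotics"]},
    {"affiliation": "", "year": 2012,
     "coauthors": ["brown_k"], "keywords": ["genetics"]},
]

def rule_label(rec):
    """Assumed rules: 1 = the known author, 0 = someone else, None = undecided."""
    if KNOWN["institution"] in rec["affiliation"].lower():
        return 1
    if rec["year"] < KNOWN["career_start"]:
        return 0
    return None

def features(rec):
    """Bag of tokens built from co-authors and keywords."""
    return rec["coauthors"] + rec["keywords"]

def train_nb(samples):
    """Collect class and token counts for a multinomial Naive Bayes model."""
    class_counts = Counter()
    token_counts = {0: Counter(), 1: Counter()}
    for tokens, label in samples:
        class_counts[label] += 1
        token_counts[label].update(tokens)
    vocab = set(token_counts[0]) | set(token_counts[1])
    return class_counts, token_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the higher Laplace-smoothed log score."""
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        # Smoothed prior so a class with no rule-labeled samples does not break log().
        score = math.log((class_counts[label] + 1) / (total + 2))
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def disambiguate(records):
    """Label by rule where possible; classify the remaining records with NB."""
    labels = [rule_label(r) for r in records]
    train = [(features(r), y) for r, y in zip(records, labels) if y is not None]
    model = train_nb(train)
    return [y if y is not None else predict_nb(model, features(r))
            for r, y in zip(records, labels)]

if __name__ == "__main__":
    print(disambiguate(RECORDS))
```

In this sketch no record is labeled by hand: the rules supply the training labels, and the classifier only decides the records where no rule fires, which is the combination of unsupervised labeling and supervised classification that the abstract describes.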
