
Research on Chinese Book Author Name Disambiguation Based on Fusion Features (original title: 基于融合特征的中文图书作者人名消歧方法研究)

Cited by: 2
Abstract: Among Chinese book authors, it is common for one person to have several names and for several people to share the same name, and the attributes describing each author vary in completeness. As a result, the accuracy of fusion-feature disambiguation algorithms drops. This paper divides author attributes into three categories: entity features, contextual relationship features, and social relationship features. Using a vector space model, it adjusts the attribute and matrix weight coefficients through mutex amplification of attributes and vacancy reduction of the feature matrix, and then computes the similarity between authors. Disambiguation is carried out by agglomerative hierarchical clustering, and a Chinese book author information model is constructed. The disambiguation results are evaluated with the B_Cubed metric. The accuracy is 89.42%; the F-value is given as 87.45% in the Chinese abstract and as 90.47% in the English abstract.
Author: Li Mengya (李孟亚)
Source: Computer Knowledge and Technology (《电脑知识与技术》), 2018, Issue 4Z, pp. 182-184 (3 pages)
Keywords: Chinese book author; name disambiguation; mutex amplification; vacancy reduction
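
The abstract reports its results with the B_Cubed metric but does not say how the scores are computed. The sketch below follows the standard B-Cubed definition (per-record precision and recall, averaged over all records, then combined into an F-value). It is not the paper's evaluation code: the function name `b_cubed`, the record ids, and the toy labels are hypothetical, and reading the abstract's "accuracy" as B-Cubed precision is an assumption.

```python
from collections import defaultdict


def b_cubed(predicted, gold):
    """Return (precision, recall, f_value) under the standard B-Cubed definition.

    predicted: dict mapping record id -> cluster label from the disambiguator
    gold:      dict mapping record id -> true author identity
    Both arguments are hypothetical inputs chosen for this illustration.
    """
    if not predicted or set(predicted) != set(gold):
        raise ValueError("predicted and gold must cover the same non-empty set of records")

    # Group record ids by predicted cluster and by true author.
    pred_members = defaultdict(set)
    gold_members = defaultdict(set)
    for rec, label in predicted.items():
        pred_members[label].add(rec)
    for rec, label in gold.items():
        gold_members[label].add(rec)

    precision_sum = 0.0
    recall_sum = 0.0
    for rec in predicted:
        cluster = pred_members[predicted[rec]]  # records grouped with rec
        truth = gold_members[gold[rec]]         # records by the same real author
        overlap = len(cluster & truth)          # always >= 1, since rec is in both
        precision_sum += overlap / len(cluster)
        recall_sum += overlap / len(truth)

    n = len(predicted)
    precision = precision_sum / n
    recall = recall_sum / n
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value


# Toy data: five book records under one shared name, written by two real authors.
gold = {"r1": "A", "r2": "A", "r3": "A", "r4": "B", "r5": "B"}
predicted = {"r1": "c1", "r2": "c1", "r3": "c2", "r4": "c2", "r5": "c2"}
p, r, f = b_cubed(predicted, gold)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

In this toy case record r3 is placed in the wrong cluster, which lowers both precision and recall to 11/15. A perfect clustering scores 1.0 on all three measures regardless of how the cluster labels are named.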

