
Personal Name Disambiguation-based Research on Self-citation Statistics (基于人名消歧的自引统计研究)

Cited by: 1
Abstract: To address inaccurate self-citation statistics caused by duplicate author names in Chinese retrieval systems, this paper designs a rule-based personal name disambiguation algorithm. The algorithm combines rules on author institution, author name, discipline category, and source journal to disambiguate personal names and thereby support self-citation statistics. Experiments show that, compared with a KMeans-based clustering algorithm, the rule-based algorithm is relatively effective: its overall evaluation metric, the F value, peaks at 0.87, so it can be used by the self-citation statistics module.
Source: Information Research (《情报探索》), 2015, No. 5, pp. 57-59, 67 (4 pages)
Keywords: self-citation statistics; personal name disambiguation; clustering; rules
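
The abstract names four rule types (author name, author institution, discipline category, source journal) but does not give the rules themselves, their order, or any thresholds. The following is only a minimal sketch of how such a rule cascade might feed a self-citation check. The `Record` fields, the `same_person` cascade, the exact-match comparisons, and the sample data are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Record:
    """One (paper, author) entry; the field names are illustrative assumptions."""
    paper_id: str
    author: str
    institution: str
    discipline: str
    journal: str


def same_person(a: Record, b: Record) -> bool:
    """Hypothetical rule cascade; the paper's real rules are not in the abstract."""
    if a.author != b.author:
        return False  # author-name rule: different names never merge
    if a.institution == b.institution:
        return True  # institution rule: same name and same institution
    # Assumed fallback: same discipline category and same source journal.
    return a.discipline == b.discipline and a.journal == b.journal


def cluster(records: list[Record]) -> list[int]:
    """Group records into identities with union-find; returns one label per record."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if same_person(records[i], records[j]):
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(records))]


def is_self_citation(citing: str, cited: str,
                     records: list[Record], labels: list[int]) -> bool:
    """True if at least one disambiguated identity authored both papers."""
    citing_ids = {labels[i] for i, r in enumerate(records) if r.paper_id == citing}
    cited_ids = {labels[i] for i, r in enumerate(records) if r.paper_id == cited}
    return bool(citing_ids & cited_ids)


# Made-up sample data: three papers by authors who share one name.
records = [
    Record("p1", "Wang Wei", "Univ A", "Information Science", "J1"),
    Record("p2", "Wang Wei", "Univ A", "Information Science", "J2"),
    Record("p3", "Wang Wei", "Univ B", "Chemistry", "J3"),
]
labels = cluster(records)
print(is_self_citation("p2", "p1", records, labels))
print(is_self_citation("p3", "p1", records, labels))
```

In this sketch a citation counts as a self-citation only when the two papers share a disambiguated identity, not merely a name string, which is the distinction the abstract attributes to name disambiguation.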