A Digital Library Name Disambiguation Framework (一种数字图书馆的名称消歧框架)

Cited by: 1
Abstract: In academic paper databases, the author name attribute is the most commonly used identifier for academic resource entities. Names, however, are often ambiguous and not unique, which causes a variety of name-based retrieval problems. Name disambiguation is a difficult data management task that aims to correctly distinguish different academic resource entities sharing the same name. It is especially hard for digital libraries with very large collections of academic papers, because the information available for identifying an author name is limited. Most previous work on this problem uses hierarchical clustering based on information in citation records, such as co-authors or paper titles; because the usability of these data sources is low, such methods fall short in accuracy and stability. This article proposes a multi-level dynamic clustering framework for name disambiguation, which applies not only to digital libraries but also to other applications. The framework uses three data clustering layers and a dynamic cluster-checking mechanism to improve the overall effectiveness of the data sources and to minimize clustering errors. Experimental results show that the proposed framework is feasible.
Author: WANG Yu-mei (王玉梅), Xi'an International University, Xi'an 710077, China
Source: Techniques of Automation and Applications (《自动化技术与应用》), 2019, No. 2, pp. 32-36, 52 (6 pages)
Keywords: name disambiguation; multi-level clustering; dynamic clustering; clustering algorithm; digital library