Abstract
[Objective] To improve the efficiency of domain concept filtering by combining two data sources, a thesaurus and text, with three concept filtering methods. [Methods] This paper proposes a three-layer progressive filtering method for domain concepts. Candidate concepts are extracted from both a thesaurus and text, and their concept properties and domain properties are then computed in three progressive layers, widening from the individual term to the domain as a whole: concept correlation, concept context, and domain specificity. [Results] Experiments show that the three-layer progressive filtering method based on concept correlation, context, and domain specificity raises precision to 74.71% and recall to 71.25%. [Limitations] The experimental data come only from the surveying and mapping domain; the method has not yet been validated on data from other domains. [Conclusions] The method improves the precision and recall of domain concept filtering, and its overall performance is higher than that of the other methods in the comparison. It could be used to filter domain concepts in different disciplines more efficiently.
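The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a three-layer progressive filter and its precision/recall evaluation could be wired together. The function names (`cascade_filter`, `precision_recall`), the toy scores, and the thresholds are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Scorer = Callable[[str], float]


def cascade_filter(candidates: Iterable[str],
                   layers: List[Tuple[Scorer, float]]) -> List[str]:
    """Apply the layers in order; a candidate survives only if it
    meets the threshold of every layer it reaches."""
    kept = list(candidates)
    for score, threshold in layers:
        kept = [c for c in kept if score(c) >= threshold]
    return kept


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Standard precision and recall of a predicted set against a gold set."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


def table_scorer(table: Dict[str, float]) -> Scorer:
    """Hypothetical scorer: look the score up in a table, default 0.0."""
    return lambda term: table.get(term, 0.0)


# Toy data (invented for illustration, not experimental data).
candidates = ["geodesy", "map projection", "the", "banana", "contour line"]
correlation = {"geodesy": 0.9, "map projection": 0.8, "the": 0.1,
               "banana": 0.7, "contour line": 0.6}
context = {"geodesy": 0.8, "map projection": 0.7,
           "banana": 0.6, "contour line": 0.7}
domain = {"geodesy": 0.9, "map projection": 0.9,
          "banana": 0.1, "contour line": 0.8}

layers = [
    (table_scorer(correlation), 0.5),  # layer 1: concept correlation
    (table_scorer(context), 0.5),      # layer 2: concept context
    (table_scorer(domain), 0.5),       # layer 3: domain specificity
]

selected = cascade_filter(candidates, layers)
gold = {"geodesy", "map projection", "contour line", "datum"}
precision, recall = precision_recall(set(selected), gold)
```

In this toy run "the" is dropped at the first layer and "banana" only at the third, which illustrates why a cascade can reject terms that look like valid concepts but do not belong to the target domain.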
Source
New Technology of Library and Information Service (《现代图书情报技术》)
CSSCI
2015, No. 4, pp. 26-33 (8 pages)
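Funding
This work is one of the research outcomes of the National Social Science Fund of China Major Project "Research on Semantics-Based Deep Aggregation and Visual Display of Library Collection Resources" (No. 11&ZD152)
and the China Postdoctoral Science Foundation project "Research on the Application of Big Data in Risk Governance of Dairy Product Quality and Safety Information" (No. 2014M552089).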
Keywords
Triple-layer concept filtering
Concept correlation
Concept context
Concept domain specificity (territoriality)
Thesaurus