Abstract
Since natural language was introduced into computer retrieval systems, end users have favored it because it can be used regardless of their profession, knowledge background, or retrieval experience. The title of a Chinese document is a concentrated expression of its content and reflects its central idea, so the natural language retrieval method described in this paper restricts subject indexing to the title level. The basic idea is to apply automatic indexing separately to the document titles in the retrieval system's database and to the natural language query, assigning indexing terms to each. The resulting keywords are then subjected to concept control, that is, word-sense conversion, to form the final indexing terms. Next, the vector space model is used to run an "OR" retrieval over the database's index data, producing a set B of matching documents. Each document title in B is automatically indexed again, and the similarity between its indexing terms and the indexing terms of the query is computed. The documents in B are ranked by similarity, so that those that best meet the retrieval requirement are presented to the user first. The method is a simple and practical approach to natural language retrieval.
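The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the whitespace tokenizer (real Chinese titles would need word segmentation), the stopword list, the `CONCEPT_MAP` synonym table, the use of cosine similarity, and every function name here are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical concept-control table: variant term -> preferred indexing term.
CONCEPT_MAP = {"pc": "computer", "computers": "computer",
               "search": "retrieval", "searching": "retrieval"}
# Hypothetical stopword list for the toy English titles used below.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "with"}

def index_terms(text):
    """Automatic indexing: tokenize, drop stopwords, apply concept control."""
    tokens = text.lower().split()
    return Counter(CONCEPT_MAP.get(t, t) for t in tokens if t not in STOPWORDS)

def build_inverted_index(titles):
    """Map each indexing term to the ids of the titles that contain it."""
    index = defaultdict(set)
    for doc_id, title in enumerate(titles):
        for term in index_terms(title):
            index[term].add(doc_id)
    return index

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def search(query, titles, index):
    """OR-retrieve candidate set B, then rank B by similarity to the query."""
    query_terms = index_terms(query)
    candidates = set()                       # set B: any title sharing a term
    for term in query_terms:
        candidates |= index.get(term, set())
    scored = [(cosine(query_terms, index_terms(titles[d])), d)
              for d in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, score) for score, doc_id in scored]

if __name__ == "__main__":
    titles = ["Natural language retrieval of Chinese literature",
              "Computer searching methods",
              "History of libraries"]
    index = build_inverted_index(titles)
    for doc_id, score in search("natural language search on a pc", titles, index):
        print(f"{score:.3f}  {titles[doc_id]}")
```

In this toy run the query terms "search" and "pc" are mapped to "retrieval" and "computer" before matching, which is the role the abstract assigns to concept control. The third title shares no indexing term with the query, so it never enters set B.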
Source
《图书馆杂志》 (Library Journal)
CSSCI
Peking University Core Journals (北大核心)
2016, No. 6, pp. 66-72 (7 pages)