Abstract
This paper uses a corpus, definition dictionaries, and users' query logs as contexts for identifying relevance terms, and designs and implements a system that extracts them automatically. Experimental results show that, although all three contexts are applied to the same base vocabulary, the relevance terms extracted from different contexts overlap very little and complement one another strongly, so integrating the results is necessary. In this system, the final relevance term table is built by direct integration.
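The overlap measurement and the integration step can be illustrated with a minimal sketch. The abstract does not define the overlap ratio or the details of direct integration, so the code below assumes a Jaccard-style ratio and reads direct integration as a per-headword union that records which context each term came from. The function names, context labels, and toy data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def overlap_ratio(a, b):
    """Jaccard overlap of two term collections (an assumed definition)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(results, head):
    """Overlap ratio for every pair of contexts, for one base word."""
    return {
        (c1, c2): overlap_ratio(results[c1].get(head, []),
                                results[c2].get(head, []))
        for c1, c2 in combinations(results, 2)
    }


def integrate(results):
    """Assumed 'direct integration': union of all terms per base word,
    keeping the set of contexts that produced each term."""
    table = {}
    for context, terms_by_head in results.items():
        for head, terms in terms_by_head.items():
            for term in terms:
                table.setdefault(head, {}).setdefault(term, set()).add(context)
    return table


# Toy data, invented purely for illustration.
results = {
    "corpus": {"library": ["book", "reader", "catalog"]},
    "dictionary": {"library": ["book", "collection"]},
    "query_log": {"library": ["opening hours", "catalog"]},
}

overlaps = pairwise_overlap(results, "library")
table = integrate(results)
```

With data like this, each context contributes terms the others miss, which is the situation the abstract describes as motivating integration.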
Source
《现代图书情报技术》
CSSCI
PKU Core Journal (北大核心)
2006, No. 9, pp. 23-28, 80 (7 pages)
New Technology of Library and Information Service
Keywords
Relevance term
Multi-context
Corpus
Definitions dictionary
Query log