Abstract
This paper describes the design and implementation of an English-Chinese cross-language information retrieval (CLIR) system. The work has two main goals: finding more effective methods for translating English queries into Chinese, and improving the performance of Chinese retrieval. For query translation, it builds a translation algorithm on an English-Chinese bilingual dictionary. For Chinese monolingual retrieval, it analyzes how different index units affect retrieval performance and implements a search engine on the Lucene full-text indexing toolkit. For evaluation, it proposes a method for quickly constructing evaluation data (sets of relevant documents) for each query topic.
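The abstract does not spell out the translation algorithm or the index units that were compared, so the following is only a minimal sketch of the two general ideas it names: dictionary-based query translation and the choice of Chinese index unit. The toy lexicon, the function names, and the choice of overlapping character n-grams as the index unit are all assumptions made for illustration, not the authors' method, and the code does not use Lucene.

```python
from typing import Dict, List

# Hypothetical toy lexicon; the paper's actual bilingual dictionary is not
# described in the abstract.
TOY_DICT: Dict[str, List[str]] = {
    "information": ["信息"],
    "retrieval": ["检索", "取回"],
    "machine": ["机器"],
    "translation": ["翻译"],
}


def translate_query(query: str, lexicon: Dict[str, List[str]]) -> List[List[str]]:
    """Translate an English query term by term via dictionary lookup.

    Every candidate translation is kept (no disambiguation is attempted),
    and out-of-vocabulary terms are passed through unchanged in lowercase.
    """
    groups: List[List[str]] = []
    for term in query.lower().split():
        groups.append(lexicon.get(term, [term]))
    return groups


def char_ngrams(text: str, n: int) -> List[str]:
    """Split Chinese text into overlapping character n-grams.

    n=1 gives single characters and n=2 gives bigrams, two commonly used
    index units for Chinese (assumed here, not taken from the paper).
    """
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    for candidates in translate_query("information retrieval", TOY_DICT):
        for word in candidates:
            print(word, char_ngrams(word, 1), char_ngrams(word, 2))
```

In a sketch like this, each inner list holds alternative translations of one query term, so a retrieval engine could treat the alternatives as synonyms when matching against an index built from the same unit type.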
Source
《计算机工程》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2005, Issue 13, pp. 62-64 (3 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (Grant No. 60203010)
Keywords
Information retrieval (IR)
Cross-language information retrieval (CLIR)
Natural language processing (NLP)
Machine translation (MT)