基于Lucene的英汉跨语言信息检索 (cited by: 12)

English-Chinese Cross-language Information Retrieval Using Lucene System
Abstract: This paper describes the design and implementation of an English-Chinese cross-language information retrieval (CLIR) system. The work aims to find more effective English-Chinese query translation methods and to improve the performance of Chinese retrieval. For query translation, it builds a translation algorithm on an English-Chinese bilingual dictionary. For Chinese monolingual retrieval, it analyzes how different index units affect retrieval performance and builds a search engine on the Lucene full-text indexing toolkit. For system evaluation, it proposes a method for quickly constructing evaluation data, namely the sets of relevant documents, from query topics.
Source: Computer Engineering (《计算机工程》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 13, pp. 62-64 (3 pages)
Funding: National Natural Science Foundation of China (Grant No. 60203010)
Keywords: information retrieval (IR); cross-language information retrieval (CLIR); natural language processing (NLP); machine translation (MT)

References (5)

  • 1. Foo S, Li Hui. Chinese Word Segmentation and Its Effect on Information Retrieval. Information Processing & Management, 2002.
  • 2. Wu Z M, Tseng G. Chinese Text Segmentation for Text Retrieval: Achievements and Problems. Journal of the American Society for Information Science, 1993, 44(9): 532-542.
  • 3. Gao Jianfeng. An Empirical Study of CLIR at MSRCN. Shanghai: International Workshop ILT&CIP-2001 on Innovative Language Technology and Chinese Information Processing, 2001.
  • 4. Jakarta Lucene Home Page. http://jakarta.apache.org/lucene/.
  • 5. Baeza-Yates R, Ribeiro-Neto B. Modern Information Retrieval. Addison-Wesley, 1999.
