
CLIR model based on a combination of ontology and statistical method (cited by: 5)
Abstract: To improve the performance of cross-lingual information retrieval (CLIR), a hybrid language model is proposed that combines the strengths of ontology and statistical methods. For the structure of the language model, an ontology description framework is proposed and a formal linguistic ontology knowledge representation is constructed. A linguistic ontology knowledge base for the source language is built by learning from a typical corpus, integrating semantic, pragmatic, and syntactic information. In cross-lingual retrieval, the ontology representation is used to obtain an initial document set, and all candidate documents are then re-ranked against the source-language linguistic ontology knowledge base to improve the precision of the top-N ranking. The model was evaluated on the Chinese-English cross-lingual information retrieval data set of the NTCIR-3 Workshop, and the experimental results indicate that the proposed method achieves satisfactory performance.
Source: Journal of Harbin Institute of Technology (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2008, No. 1, pp. 77-80 (4 pages)
Funding: National Natural Science Foundation of China (60736044); National High Technology Research and Development Program of China (2006AA01Z150, 2004AA11701008)
Keywords: cross-lingual information retrieval; ontology; statistical method; language model; knowledge acquisition
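
The abstract describes a two-stage procedure: obtain an initial document set, then re-rank all candidates to improve top-N precision. The sketch below only illustrates that generic retrieve-then-re-rank pattern and how precision at N is computed. It is not the paper's model: the document IDs, the `kb_scores` table standing in for a knowledge-base score, and the relevance judgments are all hypothetical toy values.

```python
from typing import Callable, List, Set


def rerank(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Re-order an initial candidate list by a second-stage score, highest first.

    sorted() is stable, so tied documents keep their first-stage order.
    """
    return sorted(candidates, key=score, reverse=True)


def precision_at_n(ranking: List[str], relevant: Set[str], n: int) -> float:
    """Fraction of the top-n ranked documents that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for doc in ranking[:n] if doc in relevant)
    return hits / n


# Hypothetical first-stage result list (assumption, not from the paper).
initial = ["d1", "d2", "d3", "d4"]

# Hypothetical second-stage scores standing in for a knowledge-base score.
kb_scores = {"d1": 0.2, "d2": 0.9, "d3": 0.1, "d4": 0.7}

# Hypothetical relevance judgments.
relevant = {"d2", "d4"}

reranked = rerank(initial, lambda doc: kb_scores.get(doc, 0.0))

print(precision_at_n(initial, relevant, 2))   # 0.5
print(precision_at_n(reranked, relevant, 2))  # 1.0
```

In this toy case the re-ranking moves both relevant documents into the top two positions, which is the kind of top-N improvement the abstract refers to; how the actual scores are derived from the ontology knowledge base is not specified here.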

