
Design and Implementation of Chinese and Mongolian Cross-language Retrieval System (汉蒙跨语言检索系统设计与实现) Cited by: 5
Abstract [Purpose/significance] Building on the current state of Chinese-Mongolian cross-language retrieval, this paper designs and implements a system that retrieves Cyrillic Mongolian documents from Chinese and traditional Mongolian keywords. [Method/process] The system consists of machine translation and document retrieval. For machine translation, it implements dictionary-based translation from Chinese to Cyrillic Mongolian and a rule-based and statistical conversion from traditional Mongolian to Cyrillic Mongolian. For document retrieval, it uses the Lucene full-text indexing toolkit to index a large collection of Cyrillic Mongolian documents and ranks documents by their similarity to the query under the vector space model, returning the documents that best match the query. [Result/conclusion] The system responds fairly quickly and retrieves with fairly high accuracy, reaching a usable level. On the one hand, the work promotes exchange between China and Mongolia in science and technology, culture, and education; on the other, it gives some impetus to research on Cyrillic Mongolian in China.
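Source Information Studies: Theory & Application (《情报理论与实践》), CSSCI, Peking University Core Journal, 2017, No. 4, pp. 128-132, 144 (6 pages in total)
Funding An outcome of the National Natural Science Foundation of China project "Research on an Integration Mechanism for Mongolian Digital Resources Based on Domain Ontology", grant no. 71163029
Keywords cross-language information retrieval; information retrieval system; retrieval method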
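
The pipeline summarized in the abstract (dictionary-based query translation followed by vector space ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use Lucene: the dictionary entries, the sample documents, and the function names `translate_query`, `tfidf_vectors`, `cosine`, and `search` are all hypothetical, and the smoothed TF-IDF weighting is one common choice rather than Lucene's exact scoring formula.

```python
import math
from collections import Counter

# Hypothetical toy dictionary: Chinese keyword -> Cyrillic Mongolian terms.
ZH_TO_MN = {
    "教育": ["боловсрол"],          # education
    "文化": ["соёл"],               # culture
    "科学": ["шинжлэх", "ухаан"],   # science
}

# Hypothetical whitespace-tokenized Cyrillic Mongolian documents.
DOCS = {
    "d1": "боловсрол соёл боловсрол",
    "d2": "соёл урлаг",
    "d3": "шинжлэх ухаан технологи",
}


def translate_query(keywords, dictionary):
    """Map each Chinese keyword to its dictionary terms; skip unknown words."""
    terms = []
    for kw in keywords:
        terms.extend(dictionary.get(kw, []))
    return terms


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for every document."""
    tokenized = {doc_id: text.split() for doc_id, text in docs.items()}
    n = len(tokenized)
    df = Counter(t for toks in tokenized.values() for t in set(toks))
    # Smoothed inverse document frequency (an assumed weighting).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = {
        doc_id: {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for doc_id, toks in tokenized.items()
    }
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(keywords, dictionary, docs):
    """Translate the query, then rank matching documents by cosine similarity."""
    terms = translate_query(keywords, dictionary)
    vecs, idf = tfidf_vectors(docs)
    query = {t: tf * idf[t] for t, tf in Counter(terms).items() if t in idf}
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in vecs.items()]
    hits = [pair for pair in scored if pair[1] > 0.0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in search(["文化"], ZH_TO_MN, DOCS):
        print(doc_id, round(score, 3))
```

In this toy setup, a query for "文化" ranks the shorter document `d2` above `d1`, because cosine similarity normalizes by document length. A real system would also need tokenization and morphological handling for Cyrillic Mongolian, which this sketch leaves out.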
