
Identification and Alignment of Chinese-Uyghur Phrase Collocations

CHINESE-UYGHUR PHRASES COLLOCATION AND ALIGNMENT
Abstract: This paper proposes a simple and practical method for extracting Chinese-Uyghur phrase collocations. The method requires no preprocessing such as Chinese word segmentation or POS tagging; it relies on co-occurrence information between Chinese characters and Uyghur words in the corpus. Words that occur only a few times in the corpus can still receive large co-occurrence values and so introduce noise; a t-test is used to eliminate this noise. Compared with extraction methods based on word segmentation and POS tagging, the algorithm is simple and time efficient. Experimental results show that the method achieves good phrase collocation extraction even with a relatively small corpus.
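Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2011, No. 6, pp. 43-46 (4 pages)
Funding: National Natural Science Foundation of China (60963017, 60963018); National Social Science Foundation of China (10BTQ045); Xinjiang Autonomous Region Higher Education Scientific Research Program (XJEDU2009I05)
Keywords: Bilingual corpora; Phrase collocation; Alignment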
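
The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea, assuming sentence-aligned pairs and the standard textbook t-score for co-occurrence (observed joint probability minus the probability expected under independence, divided by an estimate of the standard error). The function name `t_scores`, the whitespace tokenization of the Uyghur side, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter
from itertools import product


def t_scores(sentence_pairs):
    """Score every (Chinese character, Uyghur word) pair that co-occurs.

    sentence_pairs: list of (chinese_sentence, uyghur_sentence) strings.
    Chinese is split into single characters (no word segmentation);
    Uyghur is split on whitespace. Both choices are assumptions.
    """
    n = len(sentence_pairs)
    zh_count, ug_count, pair_count = Counter(), Counter(), Counter()

    for zh_sentence, ug_sentence in sentence_pairs:
        # Count each item at most once per sentence pair.
        zh_chars = {ch for ch in zh_sentence if not ch.isspace()}
        ug_words = set(ug_sentence.split())
        zh_count.update(zh_chars)
        ug_count.update(ug_words)
        pair_count.update(product(zh_chars, ug_words))

    scores = {}
    for (ch, word), joint in pair_count.items():
        observed = joint / n
        expected = (zh_count[ch] / n) * (ug_count[word] / n)
        # Variance approximated by the observed probability, as in the
        # usual t-score for rare events (an assumption of this sketch).
        scores[(ch, word)] = (observed - expected) / math.sqrt(observed / n)
    return scores


if __name__ == "__main__":
    # Toy sentence pairs, for illustration only.
    toy = [
        ("书", "kitab"),
        ("书", "kitab"),
        ("水", "su"),
        ("水", "su"),
        ("书 水", "kitab su"),
    ]
    for pair, t in sorted(t_scores(toy).items(), key=lambda kv: -kv[1]):
        print(pair, round(t, 3))
```

Because the standard error in the denominator shrinks as the number of sentence pairs grows, a pair seen only once or twice gets a low t-score even when its raw association value is high, which is the kind of sparse-data noise the abstract says the t-test is meant to remove.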
