
A Computer-Aided Chinese Reading System Based on Analysis Unit of Characters (基于字单元分析的中文辅助阅读系统)
Cited by: 1
Abstract: Computer-aided Chinese learning is an important research topic that is attracting growing interest in the natural language processing community. This paper proposes a computer-aided reading system that takes the character as its unit of analysis and provides learners of Chinese with real-time translation and learning assistance. The system first applies character-based Chinese morphological analysis to segment the text of Chinese web pages into words, and then uses a method based on the structural information of constituent characters to find new words. For new words not registered in general-purpose dictionaries (for example, technical terms, proper nouns, and fixed phrases), a method based on semantic prediction and feedback learning mines idiomatic translations from the Web. For common words, the system displays translations in real time from a Chinese-English (or Chinese-Japanese) dictionary, and users can also retrieve concrete usage examples of a word from the Web through a word usage retrieval module. The key technologies are: morphological analysis based on character information, new word finding based on the structural information of constituent characters, and translation acquisition for new words based on semantic prediction and feedback learning. The character as the unit of analysis is the common thread running through all of these modules. Experiments show that the system performs well in all of these respects.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2008, No. 2, pp. 92-98 (7 pages)
Keywords: computer application; Chinese information processing; morphological analysis; new word finding; term translation; Web mining; computer-aided Chinese learning
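
The abstract does not say how the character-based analysis represents word boundaries. One common formulation of character-based segmentation, shown here only as an illustrative sketch and not as the authors' method, assigns each character a position tag (B = begins a word, M = middle, E = ends a word, S = single-character word) and then groups characters into words. The function name `tags_to_words` and the hand-written tags below are assumptions for illustration; in a real system the tags would come from a trained model.

```python
from typing import List


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Group characters into words using per-character B/M/E/S position tags.

    Hypothetical helper: the tag scheme is a common convention in
    character-based segmentation, not something stated in the abstract.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        # A B or S tag starts a new word, so flush any unfinished word first.
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += ch
        # An E or S tag closes the current word.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    # Flush a trailing word left open by a malformed tag sequence.
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    # Hand-assigned tags for illustration only.
    print(tags_to_words("辅助阅读系统", ["B", "E", "B", "E", "B", "E"]))
```

Because the decision is made per character, a word that is absent from the dictionary can still be segmented as long as its characters receive consistent tags, which is the usual motivation for character-level units in this kind of pipeline.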

