Abstract
The orthographic complexity of Chinese, Japanese and Korean (CJK) poses a special challenge to developers of computational linguistic tools, especially in the area of intelligent information retrieval. These difficulties are exacerbated by the lack of a standardized orthography in these languages, and in particular by the highly irregular orthography of Japanese. This paper focuses on the typology of CJK orthographic variation, provides a brief analysis of the linguistic issues, and discusses why lexical databases should play a central role in the disambiguation process.
Source
Journal of Guangdong University of Foreign Studies (《广东外语外贸大学学报》)
2005, Issue B11, pp. 112-115 (4 pages)
Keywords
orthographic variants, information retrieval, disambiguation, Japanese, Chinese, Korean, CJK