Abstract
Establishing a standard for Cantonese dialect characters oriented to Chinese information processing is a prerequisite for building Cantonese corpora, and it will play an important role in advancing Cantonese information processing. A good standard would serve multi-purpose Cantonese corpora, text-to-speech conversion, machine translation, Cantonese teaching, and Cantonese communication; it would also extend the scope of Chinese information processing and set an example for the information-based use of dialect resources. The standard covers five aspects: (1) defining what counts as a Cantonese dialect character; (2) standardizing variant characters (fixing the form); (3) standardizing polyphonic characters (fixing the pronunciation); (4) standardizing simplified and traditional forms (fixing the script); and (5) standardizing the encoding of Cantonese characters (fixing the code). The main outcomes can take the form of a Table of Standardized Cantonese Characters for Chinese Information Processing (《面向中文信息处理的粤方言字规范表》), a Complete Set of Cantonese-Specific Characters for Information Processing (《信息处理用粤方言专用字符全集》), and practical Cantonese input-method software.
Source
《语言教学与研究》
CSSCI
PKU Core (Peking University core journal list)
2014, No. 4, pp. 107-112 (6 pages)
Language Teaching and Linguistic Studies
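Funding
Interim result of two Ministry of Education Humanities and Social Sciences Research Planning Fund projects: "Standards and Norms for Cantonese Characters and Words for Information Processing" (11YJA740070) and "The Evolution of Cantonese, Min, and Hakka Dialect Characters in Guangdong since the Late Ming Dynasty" (12YJA740117).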
Keywords
Cantonese characters
Chinese information processing
Cantonese information processing
standardization