2 articles found
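1. Research on Lexical Corpus Annotation for Chinese Electronic Medical Records (Cited by: 9)
Authors: 蒋志鹏, 赵芳芳, 关毅, 杨锦锋 (+1 more author). 《高技术通讯》 (indexed in CAS, CSCD, Peking University Core), 2014, No. 6, pp. 609-615 (7 pages)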
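Annotated corpora of Chinese electronic medical records (CEMR) are scarce, and no prior work has addressed word segmentation and part-of-speech tagging for them. Starting from the construction of a CEMR corpus, this study proposes a complete scheme that runs from data preprocessing to corpus annotation. The scheme achieves high annotation consistency and provides guidance for larger-scale, higher-quality annotation of medical record corpora. Experiments quantify the lexical statistical differences between Chinese electronic medical records and both open-domain corpora and English electronic medical record corpora. The study also systematically analyzes the error distribution of general-purpose tagging models on Chinese electronic medical records, laying a foundation for research on natural language processing (NLP) techniques suited to CEMR analysis.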
Keywords: Chinese electronic medical record (CEMR); part-of-speech tagging; annotation consistency; corpus differences; error analysis
2. A CORPUS-BASED ANALYSIS OF THE VERB "DO" USED BY CHINESE LEARNERS OF ENGLISH (Cited by: 2)
Author: 戚焱. 《Chinese Journal of Applied Linguistics》, 2006, No. 6, pp. 37-41, 126 (6 pages)
The present study reports on the use of the high-frequency verb do in written English, based on the Chinese Learner English Corpus (CLEC). The native English corpus used for comparison is the Louvain Corpus of Native English Essays (LOCNESS). The study adopts a corpus-based Contrastive Interlanguage Analysis (CIA) approach, comparing the writing of Band 4 and Band 8 English majors in CLEC with that of native college students in LOCNESS. Results indicate that, in terms of the overall frequency of do, there is no significant difference between Band 8 English majors and native speakers, whereas Band 4 English majors use it less than native speakers do. With regard to the different uses of do, marked differences are found between Chinese learners and native speakers, especially when do is used as an auxiliary verb or a delexical verb. Specifically, Chinese learners show a strong tendency to underuse do as an auxiliary verb in interrogative, negative, or inverted sentences. They also tend to overuse do as a delexical verb, allowing it to collocate with a wider range of nouns. The underlying reasons might include mother-tongue interference, intralingual transfer, and overgeneralization. The study concludes by discussing its pedagogical implications and suggesting corpus-based exercises as a way of raising learners' awareness of the use of high-frequency words.
Keywords: corpus; high-frequency word; frequency difference