
A Large-scale Corpus-based Study of English Vocabulary Repeat Rate (cited by: 1)
Abstract This study randomly divided the large-scale written components of the British National Corpus (BNC) and the American National Corpus (ANC) into an experimental set of 60 samples and a test set of 41 samples, totaling 83,864 text pairs. The English vocabulary repeat rate was analyzed dynamically by means of computer programs. A mathematical model for estimating the vocabulary repeat rate was established and then tested on the 60 samples in the experimental set. The results show that the vocabulary repeat rate curves are fairly regular in distribution, with few extreme values, and that the curves are nonlinear; the prediction formula has a small margin of error and can be used to estimate the theoretical vocabulary repeat rate of authentic English texts of different lengths.
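Source Foreign Languages and Their Teaching (《外语与外语教学》), CSSCI, Peking University Core Journal, 2016, Issue 4, pp. 87-95, 105 (10 pages in total)
Funding An interim result of the 2012 Ministry of Education Humanities and Social Sciences project "A Study of Dynamic Inter-textual English Vocabulary Repeat Rate" (Project No. 12YJA740116)
Keywords vocabulary; vocabulary repeat rate; Brunet's model; 95% confidence interval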

References (28)

  • 1. Biber, D. 1990. Methodological issues regarding corpus-based analyses of linguistic variation [J]. Literary and Linguistic Computing, (5): 257-269.
  • 2. Bogaards, P. 2001. Lexical units and the learning of foreign language vocabulary [J]. Studies in Second Language Acquisition, (23): 321-343.
  • 3. Brunet, E. 1978. Le Vocabulaire de Jean Giraudoux. Structure et Evolution [M]. Genève: Slatkine.
  • 4. Crothers, E. & P. Suppes. 1967. Experiments in Second Language Learning [M]. New York: Academic Press.
  • 5. Devore, J. 2000. Probability and Statistics [M]. Pacific Grove: Brooks/Cole.
  • 6. Evert, S. 2004. A simple LNRE model for random character sequences [J]. Journées Internationales d'Analyse Statistique des Données Textuelles, (1): 1-20.
  • 7. Fan, F. 2006a. A corpus-based empirical study on inter-textual vocabulary growth [J]. Journal of Quantitative Linguistics, (1): 111-127.
  • 8. Fan, F. 2006b. Models for dynamic inter-textual type-token relationship [J]. Glottometrics, (12): 1-10.
  • 9. Fan, F. 2008. A corpus-based empirical study on random textual vocabulary coverage [J]. Corpus Linguistics and Linguistic Theory, (1): 1-17.
  • 10. Fan, F. 2010. An asymptotic model for the English hapax/vocabulary ratio [J]. Computational Linguistics, (4): 631-637.

