
The Evolution of Character Using Frequency in Chinese Literature Since the Tang Dynasty (original title: 唐代以来汉语文学作品中的字频演变)

Cited by: 4
Abstract: Studying the character frequency distribution of Chinese literary works across historical periods is valuable because it sheds light on how the Chinese language has evolved, yet earlier language-statistics work has not covered it. This paper groups literary works written since the Tang Dynasty by period into corpora, covering five collections of classical and modern Chinese literature, and counts the character frequencies in each. The results show that character usage has changed continuously since the Tang Dynasty, and that the closer two periods are in time, the more similar their character frequency distributions are. The distribution for every period is well fitted by a power law with an exponential cutoff (an exponentially truncated power law); over time the power-law component weakens while the exponential component strengthens.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2011, No. 3, pp. 93-97 (5 pages)
Funding: Supported by the Beijing Normal University Young Teachers Research Fund
Keywords: Chinese literature; character frequency distribution; exponentially truncated power law
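
The abstract does not give the fitted formula, so the following is only a minimal sketch of the kind of analysis it describes, assuming the common form f(r) = C · r^(-alpha) · exp(-lam · r), where r is the frequency rank of a character. The function names and the log-linear least-squares fit are illustrative assumptions, not the authors' procedure.

```python
import math
from collections import Counter


def rank_frequencies(text):
    """Return relative character frequencies, most common first (whitespace ignored)."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    return [n / total for _, n in counts.most_common()]


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x


def fit_truncated_power_law(freqs):
    """Fit log f = log C - alpha*log r - lam*r by least squares.

    Assumed model (not stated in the source): f(r) = C * r**-alpha * exp(-lam * r).
    Returns (C, alpha, lam).
    """
    rows = [(1.0, -math.log(r), -float(r)) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    # Normal equations (A^T A) x = A^T y for x = (log C, alpha, lam).
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(rows, ys)) for i in range(3)]
    log_c, alpha, lam = solve3(ata, aty)
    return math.exp(log_c), alpha, lam
```

Under this assumed form, a shrinking `alpha` together with a growing `lam` from older to newer corpora would correspond to the trend the abstract reports. Fitting in log space weights rare characters heavily, so a real analysis might prefer a nonlinear or maximum-likelihood fit.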
