
Validation of Zipf's Law Based on Maximum Likelihood Estimation
Abstract: This paper fits the Zipf distribution curve by maximum likelihood estimation. The method works on the frequency-spectrum form of Zipf's law, chosen for mathematical reasons, and uses the Kolmogorov-Smirnov (KS) test to obtain the best-fitting straight line in log-log coordinates. Compared with the traditional least-squares method, maximum likelihood estimation gives more accurate fitting results. To validate the method, experiments were run on three sets of Chinese and English corpora. They show that English words conform to Zipf's law fairly well, whereas Chinese words do not conform to it well.
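Source: Information Studies: Theory & Application (《情报理论与实践》), CSSCI, Peking University Core Journal, 2012, No. 11, pp. 6-11 (6 pages)
Funding: One of the outcomes of the "863" Program project "Development of a Search Engine Focused on Scientific and Technical Literature Services" (No. 2011AA01A206) and the 2011 Nanjing University Graduate Research and Innovation Fund project "Chinese-English Bilingual Text Clustering Technology and Its Applications" (No. 2011CW12)
Keywords: Zipf's law; maximum likelihood estimation; word frequency distribution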
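
The abstract does not spell out the estimator, so the following is only a minimal sketch of the general technique it names, not the authors' procedure. It assumes the frequency spectrum follows a discrete power law p(f) ∝ f^(-α) for f ≥ xmin, maximizes the corresponding log-likelihood numerically, measures goodness of fit with the KS distance, and includes a log-log least-squares slope as the baseline. All function names, the truncated-sum zeta approximation, and the synthetic data are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def hurwitz_zeta(alpha, start, terms=1000):
    """Approximate sum_{k>=start} k**-alpha (alpha > 1):
    direct sum of the first terms plus an Euler-Maclaurin tail correction."""
    head = sum(k ** -alpha for k in range(start, start + terms))
    m = start + terms
    return head + m ** (1 - alpha) / (alpha - 1) + 0.5 * m ** -alpha


def fit_alpha_mle(freqs, xmin=1):
    """Maximize the discrete power-law log-likelihood over alpha.
    The log-likelihood is concave in alpha, so ternary search suffices."""
    tail = [x for x in freqs if x >= xmin]
    n, sum_log = len(tail), sum(math.log(x) for x in tail)

    def loglik(alpha):
        return -n * math.log(hurwitz_zeta(alpha, xmin)) - alpha * sum_log

    lo, hi = 1.05, 6.0  # assumed search interval for the exponent
    for _ in range(60):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if loglik(a) < loglik(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


def ks_distance(freqs, alpha, xmin=1):
    """Largest gap between the empirical CDF and the fitted model CDF."""
    tail = [x for x in freqs if x >= xmin]
    n, counts = len(tail), Counter(tail)
    z = hurwitz_zeta(alpha, xmin)
    seen, d = 0, 0.0
    for x in sorted(counts):
        if x > xmin:
            # Just below x the empirical CDF is still at its previous level.
            d = max(d, abs(seen / n - (1.0 - hurwitz_zeta(alpha, x) / z)))
        seen += counts[x]
        d = max(d, abs(seen / n - (1.0 - hurwitz_zeta(alpha, x + 1) / z)))
    return d


def fit_alpha_least_squares(freqs):
    """Baseline: negated OLS slope of log(count) against log(frequency)."""
    spectrum = Counter(freqs)
    xs = [math.log(f) for f in spectrum]
    ys = [math.log(c) for c in spectrum.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return -sxy / sxx


def synthetic_sample(alpha, n, seed=0, max_freq=10000):
    """Draw n word frequencies from a truncated discrete power law (toy data)."""
    rng = random.Random(seed)
    support = range(1, max_freq + 1)
    weights = [k ** -alpha for k in support]
    return rng.choices(support, weights=weights, k=n)


if __name__ == "__main__":
    sample = synthetic_sample(alpha=2.0, n=5000)
    alpha_mle = fit_alpha_mle(sample)
    print("MLE exponent:", round(alpha_mle, 3))
    print("KS distance:", round(ks_distance(sample, alpha_mle), 4))
    print("Least-squares exponent:", round(fit_alpha_least_squares(sample), 3))
```

On real corpora the input would be the list of per-word occurrence counts. One common way to pick xmin, which this sketch leaves fixed, is to repeat the fit for several candidate values and keep the one with the smallest KS distance.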

