
A TF-IDF-Based Method for Extracting Content Features from Ancient Chinese Texts (cited by: 2)

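Abstract: Using natural language processing techniques, this paper takes the seven Inner Chapters of the Zhuangzi (《庄子》) as an example and computes term frequency (TF) and inverse document frequency (IDF), thereby automatically obtaining the character frequency distribution of the text and the content features of each chapter. The method is an attempt to use computer technology to assist the study of ancient texts, and the authors report fairly good results.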
Source: Electronic Technology & Software Engineering (《电子技术与软件工程》), 2019, No. 17, pp. 130-131 (2 pages)
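
The abstract does not give the exact formulas or tooling, so the following is only a minimal sketch of character-level TF-IDF under stated assumptions: each chapter is treated as one document, each Chinese character as one token, and IDF is the unsmoothed `log(N / df)`. The function names `char_tfidf` and `top_chars` and the three one-line sample passages are illustrative choices, not taken from the paper.

```python
import math
from collections import Counter


def char_tfidf(docs):
    """Return {title: {char: tf-idf score}} for a dict of title -> text.

    Assumptions (not from the paper): tokens are single CJK characters,
    TF is count / chapter length, IDF is log(N / df) without smoothing.
    """
    # Keep only characters in the basic CJK block; punctuation is dropped.
    tokens = {
        title: [c for c in text if "\u4e00" <= c <= "\u9fff"]
        for title, text in docs.items()
    }
    n_docs = len(tokens)
    # Document frequency: number of chapters in which each character occurs.
    df = Counter(c for chars in tokens.values() for c in set(chars))
    scores = {}
    for title, chars in tokens.items():
        counts = Counter(chars)
        total = len(chars)
        scores[title] = {
            c: (n / total) * math.log(n_docs / df[c]) for c, n in counts.items()
        }
    return scores


def top_chars(scores, title, k=2):
    """Return the k highest-scoring characters of one chapter."""
    ranked = sorted(scores[title].items(), key=lambda item: -item[1])
    return [c for c, _ in ranked[:k]]


# Tiny illustrative input: one well-known line per chapter, not the full text.
sample = {
    "逍遥游": "北冥有鱼，其名为鲲。",
    "齐物论": "昔者庄周梦为胡蝶。",
    "养生主": "吾生也有涯，而知也无涯。",
}
sample_scores = char_tfidf(sample)
print(top_chars(sample_scores, "养生主"))
```

With unsmoothed IDF, a character that occurs in every chapter scores exactly zero, which is what pushes common function characters out of each chapter's feature list. A smoothed variant would keep small positive weights for them instead.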
