Abstract
This study used a Python program to run an exhaustive traversal search, without manual intervention, over the text of A Dream of Red Mansions (《红楼梦》), building in a single pass a dictionary of all character strings 1 to 6 characters long. Two editions were analyzed: the Cheng Yi edition and a composite Zhi/Cheng edition (the 80 chapters of the Zhi Ping edition combined with the last 40 chapters of the Cheng Yi edition). For each edition, the 120 chapters were divided into three parts: the first, middle, and last 40 chapters. One-way analysis of variance and Tukey's test were used to identify the strings of 1 to 6 characters whose frequency of occurrence differs significantly among the three parts, and a line chart of occurrences per chapter against chapter number was plotted for each such string. The resulting database gives an intuitive and comprehensive view of the differences among the first, middle, and last 40 chapters, and a selection of the strings in it was then analyzed. The results show that, for both the Cheng Yi and the composite Zhi/Cheng editions, the differences between the first 80 and last 40 chapters are more significant than those between the first 40 and middle 40 chapters. The first 40 and middle 40 chapters are by the same author; the first 80 and last 40 chapters are not by the same author.
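The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: the function names (`count_ngrams`, `per_chapter_counts`, `split_three`, `one_way_f`) and the toy chapters are hypothetical, and only the one-way ANOVA F statistic is computed, using the standard library. Obtaining p-values and running Tukey's test would need a statistics package (for example `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`), which is assumed here rather than taken from the paper.

```python
from collections import Counter


def count_ngrams(text, max_len=6):
    """Count every contiguous substring of 1..max_len characters in text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def per_chapter_counts(chapters, term):
    """Number of (possibly overlapping) occurrences of term in each chapter."""
    n = len(term)
    return [
        sum(1 for i in range(len(ch) - n + 1) if ch[i:i + n] == term)
        for ch in chapters
    ]


def split_three(values, size=40):
    """Split per-chapter values into first, middle, and last blocks of `size`."""
    return [values[0:size], values[size:2 * size], values[2 * size:3 * size]]


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of non-empty numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        # No within-group variation: F is undefined, report inf or 0.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy data (hypothetical): six short "chapters", grouped two at a time.
toy_chapters = ["了了吗", "了吗", "了", "了了", "吗吗吗", "吗吗了"]
toy_counts = per_chapter_counts(toy_chapters, "吗")
toy_f = one_way_f(split_three(toy_counts, size=2))
```

With the real text, `chapters` would hold the 120 chapter strings, `split_three` would be called with its default `size=40`, and the test would be repeated for every key of the dictionary built by `count_ngrams`. A large F value only flags a string as a candidate; the pairwise comparison between parts is what Tukey's test adds.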
Source
《数据》
2023, No. 2, pp. 54-59 (6 pages)
DATA
Funding
Undergraduate Research Program of The Hong Kong University of Science and Technology, Hong Kong, China.