Abstract
This paper examines six works of disputed authorship, including "Deep, Deep the Courtyard" to the tune of Butterfly in Love with Flowers (《蝶恋花(庭院深深深几许)》). The other works of each candidate author serve as the training corpus. After word segmentation and feature extraction, a Naive Bayes classifier learns each author's stylistic features and is then used to attribute the disputed works. The results give the probability of each candidate author for every disputed work. Except for "Lantern Festival" to the tune of Shengzhazi (《生查子·元夕》), they agree closely with the conclusions of textual scholarship. The paper also collects works by three pairs of Tang poets who are conventionally named together: "Yuan-Bai" (Yuan Zhen and Bai Juyi), "Pi-Lu" (Pi Rixiu and Lu Guimeng), and "Little Li-Du" (Li Shangyin and Du Mu). Applying the Naive Bayes classifier to these pairs also gives good results, which further supports the effectiveness of the method for authorship detection.
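The pipeline summarized in the abstract (candidate authors' other works as training corpus, feature extraction, Naive Bayes, per-author probabilities) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's implementation: the abstract does not name the segmentation tool, the feature set, or the smoothing scheme, so the sketch substitutes character unigrams and bigrams for word segmentation and uses Laplace smoothing. The class name `NaiveBayesAuthor`, the helper `char_ngrams`, and the training strings are hypothetical; the strings are invented toy text, not lines from any real poem.

```python
import math
from collections import Counter


def char_ngrams(text, n_max=2):
    """Return character 1..n_max-grams (assumed stand-in for word segmentation)."""
    chars = [c for c in text if not c.isspace()]
    feats = []
    for n in range(1, n_max + 1):
        feats.extend("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))
    return feats


class NaiveBayesAuthor:
    """Multinomial Naive Bayes over n-gram counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha          # smoothing constant (assumed value)
        self.counts = {}            # author -> Counter of feature counts
        self.doc_counts = Counter() # author -> number of training works
        self.vocab = set()

    def fit(self, docs):
        """docs: iterable of (author, text) pairs with known authorship."""
        for author, text in docs:
            feats = char_ngrams(text)
            self.counts.setdefault(author, Counter()).update(feats)
            self.doc_counts[author] += 1
            self.vocab.update(feats)
        return self

    def log_scores(self, text):
        """Unnormalized log posterior for each candidate author."""
        total_docs = sum(self.doc_counts.values())
        v = len(self.vocab)
        # Features never seen in training are skipped.
        feats = [f for f in char_ngrams(text) if f in self.vocab]
        scores = {}
        for author, counter in self.counts.items():
            total = sum(counter.values())
            score = math.log(self.doc_counts[author] / total_docs)
            for f in feats:
                score += math.log((counter[f] + self.alpha) / (total + self.alpha * v))
            scores[author] = score
        return scores

    def posteriors(self, text):
        """Normalize log scores into probabilities that sum to 1."""
        scores = self.log_scores(text)
        m = max(scores.values())
        exp = {a: math.exp(s - m) for a, s in scores.items()}
        z = sum(exp.values())
        return {a: e / z for a, e in exp.items()}


# Toy placeholder corpus (invented strings, hypothetical author labels).
train = [
    ("author_a", "春风花月春风"),
    ("author_a", "花月春风"),
    ("author_b", "秋霜寒雁秋霜"),
    ("author_b", "寒雁秋霜"),
]
model = NaiveBayesAuthor().fit(train)
print(model.posteriors("春风花"))
```

The `posteriors` method returns one probability per candidate author, which corresponds to the kind of per-work likelihood the abstract reports. With a real corpus, the n-gram helper would be replaced by whatever segmentation and feature extraction the study actually used.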
Authors
Huang Wei (黄玮)
Ran Qibin (冉启斌)
Source
Literature and Culture Studies (《文学与文化》)
2023, No. 3, pp. 95-104 (10 pages)
Keywords
Dispute on Author
Stylistic Features of Works
Naive Bayes Classifier
Ancient Chinese Poetry