Abstract
This paper proposes a feature-fusion method for constructing a fine-grained sentiment lexicon for the education domain, targeting sentiment analysis of educational feedback texts. First, an education-domain corpus is built that covers emotional features from both formal and informal text. Second, a fusion-feature method for building the domain emotion lexicon is proposed: starting from an emotion categorization, it identifies the linguistic probability features and the statistical probability features of words. The proposed repetitive semantic orientation pointwise mutual information (R-SOPMI) algorithm improves on SO-PMI for emotion classification and enables co-occurrence-based multi-category emotion classification. Finally, a fine-grained education-domain sentiment lexicon is obtained, expanded to 39,138 emotion words. Experimental results show that, except for the "anger" category, the F1 score of every emotion category in the constructed lexicon exceeds 78.09%, indicating good overall performance. Compared with a general-purpose lexicon, macro-averaged precision, recall, and F1 improve by 21.95%, 2.50%, and 13.01%, respectively, which indicates that the fusion-feature method effectively extracts domain features and supports the construction of a fine-grained domain lexicon.
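The abstract does not spell out how R-SOPMI modifies SO-PMI, so the following is only a minimal sketch of the general idea it builds on: scoring a candidate word against seed words of several emotion categories by document-level co-occurrence PMI and picking the highest-scoring category. The function names, the toy corpus, the seed lists, and the argmax decision rule are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, per document, how often each word and each unordered word pair occurs."""
    word_df = Counter()
    pair_df = Counter()
    for doc in docs:
        words = set(doc)
        word_df.update(words)
        pair_df.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_df, pair_df


def pmi(w1, w2, word_df, pair_df, n_docs):
    """Pointwise mutual information from document frequencies.

    Assumption: pairs that never co-occur get 0.0 instead of -inf,
    a common simplification for small corpora.
    """
    joint = pair_df[frozenset((w1, w2))] / n_docs
    if joint == 0:
        return 0.0
    p1 = word_df[w1] / n_docs
    p2 = word_df[w2] / n_docs
    return math.log2(joint / (p1 * p2))


def classify_emotion(word, seeds, word_df, pair_df, n_docs):
    """Sum PMI with each category's seed words and return the best category.

    Returns (None, scores) when the word co-occurs with no seed word.
    This argmax rule is an illustrative choice, not the paper's R-SOPMI.
    """
    scores = {
        label: sum(pmi(word, s, word_df, pair_df, n_docs)
                   for s in seed_words if s != word)
        for label, seed_words in seeds.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] <= 0:
        return None, scores
    return best, scores


# Hypothetical tokenized feedback texts and seed words.
docs = [
    ["lecture", "brilliant", "happy"],
    ["course", "brilliant", "happy"],
    ["homework", "tedious", "angry"],
    ["exam", "tedious", "angry"],
    ["teacher", "brilliant", "like"],
]
seeds = {"joy": ["happy", "like"], "anger": ["angry"]}

word_df, pair_df = cooccurrence_counts(docs)
label, scores = classify_emotion("brilliant", seeds, word_df, pair_df, len(docs))
print(label, scores)
```

Classic two-class SO-PMI subtracts the PMI with negative seeds from the PMI with positive seeds; the sketch generalizes that to several categories by comparing per-category sums. A real lexicon builder would also need smoothing and a much larger corpus.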
Authors
陈俊
席宁丽
李佳敏
万晓容
CHEN Jun; XI Ningli; LI Jiamin; WAN Xiaorong (School of Education, Guizhou Normal University, Guiyang 550025, Guizhou, China)
Source
《应用科学学报》
CAS
CSCD
PKU Core
2023, No. 5, pp. 870-880 (11 pages)
Journal of Applied Sciences
Funding
Supported by the Guizhou Provincial Universities Humanities and Social Sciences Research Project (No. 2023GZGXRW146).
Keywords
sentiment lexicon
emotion classification
word vectors (Word2vec)
fusion features