Abstract
To address the limitations of domain sentiment lexicons, this paper proposes an adaptive learning method for Chinese domain sentiment lexicons. A small number of seed words are selected from a basic Chinese sentiment lexicon, and candidate sentiment words are extracted from a domain corpus using two extraction methods, one based on the CBOW model and one based on syntactic rules. An improved SO_PMI algorithm then determines the sentiment polarity of each candidate word, yielding positive and negative domain sentiment lexicons. Experimental results show that the method can adaptively generate a domain sentiment lexicon with high sentiment-word recognition accuracy, and that the model performs well in Chinese sentiment analysis applications.
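The polarity step can be illustrated with a minimal sketch. The abstract does not say how SO_PMI was improved, so the code below shows only the standard SO-PMI score: the summed PMI of a candidate word with positive seed words, minus its summed PMI with negative seed words. The function names (`so_pmi`, `polarity`), the document-level co-occurrence window, the additive smoothing term `eps`, and the toy corpus are all assumptions made for illustration and are not taken from the paper.

```python
import math


def so_pmi(candidate, pos_seeds, neg_seeds, docs, eps=1.0):
    """Standard SO-PMI score of `candidate` over tokenized documents.

    Co-occurrence is counted at document level, and `eps` is an additive
    smoothing constant; both are illustrative assumptions, not the
    paper's improved variant.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    def df(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    def pmi(w1, w2):
        # Smoothed pointwise mutual information, in bits.
        joint = df(w1, w2) + eps
        return math.log2(joint * n / ((df(w1) + eps) * (df(w2) + eps)))

    pos = sum(pmi(candidate, p) for p in pos_seeds)
    neg = sum(pmi(candidate, q) for q in neg_seeds)
    return pos - neg


def polarity(score):
    """Map an SO-PMI score to a polarity label (zero threshold assumed)."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical, already-segmented toy corpus of product reviews.
toy_docs = [
    ["screen", "clear", "good"],
    ["screen", "clear", "excellent"],
    ["battery", "laggy", "bad"],
    ["system", "laggy", "poor"],
    ["clear", "good"],
    ["laggy", "bad"],
]
toy_pos = ["good", "excellent"]
toy_neg = ["bad", "poor"]

for word in ("clear", "laggy"):
    score = so_pmi(word, toy_pos, toy_neg, toy_docs)
    print(word, polarity(score))
```

In this toy corpus "clear" co-occurs only with positive seeds and "laggy" only with negative seeds, so their scores have opposite signs. A real domain corpus would need Chinese word segmentation first, and the threshold for the neutral band would probably have to be tuned rather than fixed at zero.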
Authors
叶霞
曹军博
许飞翔
郭鸿燕
尹列东
YE Xia; CAO Jun-bo; XU Fei-xiang; GUO Hong-yan; YIN Lie-dong (Academy of Combat Support, Rocket Force University of Engineering, Xi’an 710025, China; Beijing Institute of Computer Technology and Applications, Second Academy of China Aerospace Science and Industry Corporation, Beijing 100039, China)
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2020, No. 8, pp. 2231-2237 (7 pages)
Computer Engineering and Design
Funding
National Natural Science Foundation of China (61702525).