Abstract
This paper takes the emotional dependency tuple (EDT) as the basic structure of Chinese sentiment expression and divides the task of judging the topic sentiment orientation of news text into three progressive subtasks: topic identification, sentiment orientation analysis, and subjective/objective classification. Before topic identification, the TF-IDF method is improved and then combined with a cross-entropy-based method to extract topic feature words; because a news headline is itself indicative of the topic, the title words are also added to the topic feature set. The similarity between each sentence and the topic feature vector is then computed under the vector space model; on this basis, sentence position, sentence length, and the sentence's similarity to the title are taken into account to compute a topic relevance score, which is used to extract topic sentences. Finally, an EDT-based discriminant model is built to compute the sentiment of the topic sentences, and subjective/objective classification rules are applied to filter out the key sentences that carry the news item's orientation. In the COAE 2014 evaluation, the method came close to the best results on every metric, indicating that the EDT-based classification method has high classification performance.
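The topic-sentence extraction step can be illustrated with a minimal sketch. The abstract does not give the actual scoring formula, so everything specific here is an assumption: the function names `cosine_similarity` and `topic_relevance`, the linear combination, the default weights, the position and length heuristics, and the use of raw term counts for the sentence vector are all hypothetical stand-ins, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def topic_relevance(tokens, index, n_sentences, topic_vec, title_vec,
                    weights=(0.5, 0.2, 0.1, 0.2), ideal_len=20):
    """Score one sentence for topic relevance (hypothetical formula).

    tokens      -- segmented words of the sentence
    index       -- 0-based position of the sentence in the article
    n_sentences -- total number of sentences in the article
    topic_vec   -- topic feature words mapped to weights (e.g. TF-IDF based)
    title_vec   -- title words mapped to weights
    weights     -- assumed weights for (topic sim, position, length, title sim)
    ideal_len   -- assumed sentence length that receives the full length score
    """
    sent_vec = Counter(tokens)  # raw term counts as a simple sentence vector
    sim_topic = cosine_similarity(sent_vec, topic_vec)
    sim_title = cosine_similarity(sent_vec, title_vec)
    # Assumed heuristic: earlier sentences are more likely to state the topic.
    position = 1.0 - index / n_sentences
    # Assumed heuristic: penalize sentences much shorter or longer than ideal_len.
    if tokens:
        length = min(len(tokens), ideal_len) / max(len(tokens), ideal_len)
    else:
        length = 0.0
    w_topic, w_pos, w_len, w_title = weights
    return (w_topic * sim_topic + w_pos * position
            + w_len * length + w_title * sim_title)
```

Sentences would then be ranked by this score and the top ones kept as topic sentences. With weights that sum to 1, the score stays in [0, 1], which makes a fixed cut-off threshold easier to choose.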
Source
Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》)
CAS
CSCD
Peking University Core Journals (北大核心)
2014, No. 12, pp. 1-6, 11 (7 pages in total)
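Funding
Natural Science Foundation of Hunan Province (11JJ6047, 13JJ4076)
Outstanding Youth Project of the Hunan Provincial Education Department (13B101)
Key Discipline and Innovation Team Construction Fund of the University of South China
Science and Technology Plan Project of the Hengyang Science and Technology Bureau (2013KG66, 2013KG67)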
Keywords
sentiment analysis
emotional dependency tuple
topic sentiment
tendency key sentence