Abstract
Feature spaces in Chinese sentiment recognition are highly sparse and redundant. To address this, a sentiment recognition method based on a dynamic feature selection mechanism is proposed from the perspective of ensemble learning. The method first uses kernel smoothing to construct two distributions: the distribution of feature subset dimensionality and the importance distribution of the feature space. Guided by these two distributions, the whole feature space is adaptively divided into multiple subspaces of different granularity, and a base classifier is trained on each subspace. Finally, the base classifiers are fused by majority voting to form an ensemble recognition model. Experiments are conducted on review posts collected from a campus BBS. The results show that the method improves considerably on the benchmark methods (random feature subspace, random feature selection based on the importance distribution of the feature space, and linear support vector machine) in measures such as recall and precision, effectively improving the accuracy and robustness of sentiment recognition.
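The pipeline summarized in the abstract (importance-weighted subspace selection, one base classifier per subspace, majority-vote fusion) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the kernel-smoothed distributions are computed or which base classifier is used, so the sketch assumes the importance weights and subspace sizes are supplied directly and that all weights are positive, and it substitutes a nearest-centroid classifier as a stand-in. All function names are hypothetical.

```python
import random
from collections import Counter

def sample_subspace(importance, size, rng):
    """Draw `size` distinct feature indices, favouring high-importance ones.
    Assumption: `importance` holds positive weights given by the caller."""
    candidates = list(range(len(importance)))
    weights = list(importance)
    chosen = []
    for _ in range(size):
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return sorted(chosen)

def train_centroids(X, y, features):
    """Stand-in base classifier: per-class mean vector on the chosen features."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict_centroids(centroids, features, row):
    """Return the label of the nearest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((row[f] - c[j]) ** 2 for j, f in enumerate(features))
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))

def majority_vote(votes):
    """Most common label; ties go to the smallest label."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)

def fit_ensemble(X, y, importance, sizes, seed=0):
    """Train one base classifier per sampled subspace (one per entry in `sizes`)."""
    rng = random.Random(seed)
    ensemble = []
    for size in sizes:
        feats = sample_subspace(importance, size, rng)
        ensemble.append((feats, train_centroids(X, y, feats)))
    return ensemble

def predict_ensemble(ensemble, row):
    """Fuse the base classifiers' predictions by majority voting."""
    return majority_vote([predict_centroids(c, f, row) for f, c in ensemble])
```

In this sketch, `sizes` plays the role of draws from the subset-dimensionality distribution and `importance` plays the role of the feature importance distribution; for example, `fit_ensemble(X, y, importance, sizes=[1, 2, 3])` builds three base classifiers on subspaces of different granularity.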
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journal (北大核心)
2014, No. 2, pp. 358-364 (7 pages)
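Funding
Supported by the National Science and Technology Support Program of the Twelfth Five-Year Plan (2011BAK08B03, 2011BAK08B05)
Supported by the Program for New Century Excellent Talents of the Ministry of Education (NCET-11-0654)
Supported by the National "Core Electronic Devices, High-end Generic Chips and Basic Software" (核高基) Major Project Fund (2010ZX01045-001-005)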
Keywords
sentiment recognition
dynamic feature selection
feature subspace
ensemble learning
kernel smoothing