Research on Chinese Sentiment Recognition Using Dynamic Feature Selection Method (采用动态特征选择的中文情感识别研究) Cited by: 4
Abstract To address the high sparsity and redundancy of the feature space in Chinese sentiment recognition, this paper proposes a sentiment recognition method based on a dynamic feature selection mechanism, approached from the perspective of ensemble learning. The method first uses kernel smoothing to construct two distributions: the distribution of feature-subset dimensionality and the importance distribution of the feature space. Guided by these two distributions, it adaptively divides the feature space into multiple subspaces of different granularity and trains a base classifier on each subspace. Finally, it fuses the base classifiers by majority voting to form an ensemble recognition model. Experiments on review posts collected from a campus BBS show that the method outperforms the benchmark methods (random feature subspace, random feature selection based on the importance distribution of the feature space, and linear support vector machine) in recall, precision, and other measures, effectively improving the accuracy and robustness of sentiment recognition.
Source Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2014, No. 2, pp. 358-364 (7 pages)
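Funding National Science and Technology Support Program of the Twelfth Five-Year Plan (2011BAK08B03, 2011BAK08B05); Program for New Century Excellent Talents in University, Ministry of Education (NCET-11-0654); National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software ("核高基") (2010ZX01045-001-005)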
Keywords sentiment recognition; dynamic feature selection; feature subspace; ensemble learning; kernel smoothing
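
The pipeline summarized in the abstract (kernel-smoothed distributions, adaptive subspace sampling, per-subspace base classifiers, majority voting) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the kernel, the bandwidth, the importance scores, or the base learner, so everything below is an assumption made for illustration. In particular, the Gaussian kernel over feature index, the nearest-centroid base classifier, the toy word-count data, and all names (`kernel_smooth`, `sample_subspace`, `CentroidClassifier`, `train_ensemble`, `predict`) are hypothetical.

```python
import math
import random
from collections import Counter


def kernel_smooth(values, bandwidth=1.0):
    """Gaussian-kernel smoothing over positions, normalized to sum to 1.

    Assumption: the kernel and bandwidth are illustrative choices only.
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        w = [math.exp(-0.5 * ((i - j) / bandwidth) ** 2) for j in range(n)]
        smoothed.append(sum(wj * vj for wj, vj in zip(w, values)) / sum(w))
    total = sum(smoothed)
    return [s / total for s in smoothed]


def sample_subspace(importance_dist, dim_candidates, dim_dist, rng):
    """Draw a subspace size, then draw that many distinct features by importance."""
    d = rng.choices(dim_candidates, weights=dim_dist, k=1)[0]
    remaining = list(range(len(importance_dist)))
    weights = list(importance_dist)
    chosen = []
    for _ in range(min(d, len(remaining))):
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return sorted(chosen)


class CentroidClassifier:
    """Nearest-centroid base learner restricted to one feature subspace (a stand-in)."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append([row[f] for f in self.features])
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, row):
        point = [row[f] for f in self.features]
        return min(
            self.centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, self.centroids[c])),
        )


def train_ensemble(X, y, importance, dim_candidates, dim_prior, n_base=15, seed=0):
    """Train one base classifier per sampled subspace."""
    rng = random.Random(seed)
    importance_dist = kernel_smooth(importance)
    dim_dist = kernel_smooth(dim_prior)
    return [
        CentroidClassifier(
            sample_subspace(importance_dist, dim_candidates, dim_dist, rng)
        ).fit(X, y)
        for _ in range(n_base)
    ]


def predict(ensemble, row):
    """Fuse base predictions by majority vote."""
    votes = Counter(clf.predict_one(row) for clf in ensemble)
    return votes.most_common(1)[0][0]


# Toy word-count vectors (hypothetical): features 0-1 are positive words, 2-3 negative.
X_train = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
y_train = ["pos", "pos", "neg", "neg"]
importance = [0.9, 0.4, 0.8, 0.3]  # hypothetical per-feature importance scores
dim_candidates = [1, 2, 3]         # hypothetical candidate subspace sizes
dim_prior = [1.0, 2.0, 1.0]        # hypothetical raw preference over those sizes

ensemble = train_ensemble(X_train, y_train, importance, dim_candidates, dim_prior)
print(predict(ensemble, [3, 1, 0, 0]), predict(ensemble, [0, 0, 1, 3]))
```

Because each base classifier sees a differently sized and differently weighted slice of the feature space, the vote averages over many partial views rather than relying on one sparse, redundant representation, which is the motivation the abstract gives for the ensemble design.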