Abstract
Chinese short texts carry little information and have sparse features. To address this, the paper studies sentiment classification of microblog short texts. To better extract sentiment features, the original text is expanded with material that stands in a semantically progressive relation to it, mined from contextual content such as comments and reposts, and feature words with latent sentiment polarity are extracted. Word2vec is used to compute word similarity in order to expand the set of candidate feature words, and a deep belief network (DBN) is then introduced to perform deep adaptive learning over the candidate feature words. Experiments on the COAE (Chinese Opinion Analysis Evaluation) 2015 task evaluation dataset show that the method effectively alleviates the feature sparsity of short texts and mines sentiment features fairly accurately, improving sentiment classification performance; the reported precision is 64.1%.
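The similarity-based feature expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors stand in for trained Word2vec embeddings, and the names `EMBEDDINGS`, `cosine`, `expand_features` and the `threshold` value are all hypothetical choices made for illustration.

```python
import math

# Toy embeddings standing in for trained Word2vec vectors (made-up values).
EMBEDDINGS = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.8, 0.2, 0.1],
    "joyful": [0.85, 0.15, 0.05],
    "angry": [-0.7, 0.6, 0.1],
    "table": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_features(seed_words, embeddings, threshold=0.95):
    """Add every vocabulary word whose similarity to a seed word
    reaches the threshold (an assumed cutoff, not from the paper)."""
    expanded = set(seed_words)
    for seed in seed_words:
        if seed not in embeddings:
            continue  # seeds without a vector are kept but cannot be expanded
        for word, vec in embeddings.items():
            if word not in expanded and cosine(embeddings[seed], vec) >= threshold:
                expanded.add(word)
    return sorted(expanded)


print(expand_features(["happy"], EMBEDDINGS))
```

In a real pipeline the vectors would come from a Word2vec model trained on a microblog corpus, and the expanded candidate words would then be encoded as input features for the DBN.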
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal (北大核心)
2017, Issue 10, pp. 283-288 (6 pages)
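Funding
Sub-project of the National Key Technology R&D Program of China (2013BAH21B02-01)
Beijing Natural Science Foundation (4153058)
Open Fund of the Shanghai Key Laboratory of Intelligent Information Processing (IIPL-2014-004)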
Keywords
Opinion mining
Short text
Feature extension
Deep belief network