Abstract
Existing machine learning approaches to sentiment analysis require large amounts of training data and complex linguistic structures, yet they often fail to capture non-local, document-level contextual sentiment. This paper proposes a model structure that takes both local and global contextual information into account when only a limited data set is available. Lexical and discourse knowledge are first encoded as expressive constraints; these constraints are then integrated into the learning of a conditional random field (CRF) model via posterior regularization; finally, the sentiment polarity of each sentence is obtained. Across multiple sets of experiments, the proposed method shows a clear improvement over the CRF model on sentence-level sentiment classification.
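The core idea of posterior regularization can be illustrated with a small sketch. This is not the paper's implementation: it assumes a single categorical posterior over three hypothetical sentence labels (rather than a full CRF over a label sequence), one hypothetical lexical constraint of the form E_q[phi] <= b, and a hypothetical helper name `project_posterior`. Under those assumptions, the KL projection of the model posterior p onto the constraint set has the form q(y) ∝ p(y)·exp(-λ·phi(y)) with λ >= 0, and λ can be found by bisection.

```python
import math

def project_posterior(p, phi, b, max_lambda=50.0, iters=60):
    """KL-project a categorical distribution p onto {q : E_q[phi] <= b}.

    Illustrative only: a real CRF would apply this to sequence posteriors.
    The projected distribution is q(y) proportional to p(y) * exp(-lam * phi(y)).
    """
    def q_for(lam):
        weights = [pi * math.exp(-lam * fi) for pi, fi in zip(p, phi)]
        z = sum(weights)
        return [w / z for w in weights]

    def expect(q):
        return sum(qi * fi for qi, fi in zip(q, phi))

    # Constraint already satisfied: the projection leaves p unchanged.
    if expect(p) <= b:
        return list(p)

    # Bisection on lam; E_q[phi] decreases monotonically as lam grows.
    lo, hi = 0.0, max_lambda
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expect(q_for(mid)) > b:
            lo = mid
        else:
            hi = mid
    return q_for(hi)

# Hypothetical example: labels are [positive, negative, neutral].
# Assumed lexical constraint: the sentence contains a strongly positive
# lexicon word, so the probability of "negative" should be at most 0.3.
p = [0.2, 0.7, 0.1]    # made-up model posterior
phi = [0.0, 1.0, 0.0]  # feature fires on the "negative" label
q = project_posterior(p, phi, b=0.3)
```

The projection moves probability mass away from the constrained label while preserving the relative odds of the unconstrained labels, which is what lets prior knowledge steer the model without labeled data for every case.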
Source
《信息技术》
2016, Issue 4, pp. 135-138, 143 (5 pages in total)
Information Technology
Keywords
sentiment classification
conditional random field (CRF)
posterior regularization