Abstract
Viewpoints are generated from short-text user feedback (voice of the consumer) by combining a maximum entropy model with a sentiment classification dictionary. The maximum entropy model identifies the domain of each piece of feedback, the sentiment model identifies whether the evaluation is positive or negative, and the domain and the evaluation are then combined to give the final classification. Notably, the objects being classified have many features and many categories. Feedback that contains several viewpoints can be split at separators; its viewpoints are then identified with a hierarchically nested classification method that works through the separator-delimited text levels from the inside out, which prevents erroneous output. The results show that, for viewpoint classification of Chinese short texts, classifier fusion is an efficient combined classification algorithm.
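The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The weights, lexicon, threshold, separators, and function names are all hypothetical, and English tokens stand in for segmented Chinese words. The domain model is a hand-weighted log-linear (maximum entropy) scorer rather than a trained one. The inside-out step is one possible reading of the method: classify the innermost comma-delimited clauses first, and fall back to the enclosing sentence only when no clause yields a viewpoint.

```python
import math
import re

# Hypothetical hand-set feature weights for a log-linear (maximum entropy)
# domain model. A real system would learn these from labeled feedback.
DOMAIN_WEIGHTS = {
    "battery": {"battery": 2.0, "charge": 1.5},
    "screen": {"screen": 2.0, "display": 1.5},
}

# Hypothetical sentiment dictionary: word -> polarity score.
SENTIMENT_LEXICON = {"good": 1, "great": 1, "bad": -1, "poor": -1}


def tokenize(text):
    # Stand-in for Chinese word segmentation: lowercase alphabetic tokens.
    return re.findall(r"[a-z]+", text.lower())


def domain_probs(tokens):
    # Maximum entropy inference: softmax over summed feature weights.
    scores = {d: sum(w.get(t, 0.0) for t in tokens)
              for d, w in DOMAIN_WEIGHTS.items()}
    z = sum(math.exp(s) for s in scores.values())
    return {d: math.exp(s) / z for d, s in scores.items()}


def classify_domain(tokens, threshold=0.6):
    # Return the most probable domain, or None if the model is not confident.
    probs = domain_probs(tokens)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None


def classify_polarity(tokens):
    # Dictionary lookup: the sign of the summed polarity scores.
    score = sum(SENTIMENT_LEXICON.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def classify_segment(text):
    # A viewpoint is the combination of a domain and an evaluation.
    tokens = tokenize(text)
    domain = classify_domain(tokens)
    polarity = classify_polarity(tokens)
    if domain and polarity:
        return (domain, polarity)
    return None


def extract_viewpoints(text):
    # Inside-out: try comma-level clauses first, then the whole sentence.
    viewpoints = []
    for sentence in re.split(r"[.;!?]", text):
        if not sentence.strip():
            continue
        clauses = [c for c in sentence.split(",") if c.strip()]
        found = [v for v in map(classify_segment, clauses) if v]
        if found:
            viewpoints.extend(found)
        else:
            outer = classify_segment(sentence)
            if outer:
                viewpoints.append(outer)
    return viewpoints


print(extract_viewpoints("The battery is great, but the screen is poor."))
```

Returning nothing for a segment with no confident domain or no sentiment word is one way to realize the abstract's goal of preventing erroneous output: a viewpoint is emitted only when both classifiers agree that one exists.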
Source
《现代工业经济和信息化》
2017, No. 3, pp. 95-97, 99 (4 pages in total)
Modern Industrial Economy and Informationization
Keywords
hierarchical text classification
maximum entropy model
sentiment dictionary