Abstract
In text sentiment classification, sentiment-annotated corpora are unevenly distributed across languages. To address this imbalance, the paper proposes an AdaBoost-based strategy for cross-lingual sentiment resource migration. First, the target-language training set is translated into the source language. The AdaBoost algorithm is then applied to the combined training set, and the training set is updated through a sliding window to train the optimal weak classifier. Finally, a classifier suited to target-language sentiment recognition is obtained. Experiments show that translating from the target language into the source language is feasible. The AdaBoost-based classification strategy achieves higher precision and recall than the baseline, which demonstrates the effectiveness of the method.
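The boosting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is plain AdaBoost with decision stumps, run on a combined training set built from source-language examples and target-language examples assumed to be already translated. The two-dimensional features (counts of positive and negative words), the toy data, and all function names are hypothetical. The sliding-window update of the training set is omitted because the abstract does not specify how it works.

```python
import math

def train_stump(X, y, w):
    """Return the (error, feature, threshold, polarity) stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                preds = [pol if x[j] >= thr else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, x):
    _, j, thr, pol = stump
    return pol if x[j] >= thr else -pol

def adaboost(X, y, rounds=10):
    """Standard AdaBoost: reweight examples each round and collect weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the classifier weight stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: [positive-word count, negative-word count], label +1 / -1.
source_X = [[3, 0], [2, 1], [0, 3], [1, 2]]
source_y = [1, 1, -1, -1]
translated_target_X = [[4, 1], [0, 2]]  # assumed to be already machine-translated
translated_target_y = [1, -1]

combined_X = source_X + translated_target_X
combined_y = source_y + translated_target_y
model = adaboost(combined_X, combined_y, rounds=5)
train_preds = [predict(model, x) for x in combined_X]
```

In this sketch the translated examples are simply concatenated with the source-language examples before boosting. A fuller version would need the paper's own rule for how the sliding window selects or replaces training examples between rounds.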
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
2015, No. 11, pp. 77-79, 87 (4 pages)
Keywords
Machine translation
Cross-lingual
Migration of sentiment resource
AdaBoost algorithm