
An Active Learning Algorithm for Constructing a Corpus with Balanced Emotion Labels (cited by: 1)
Abstract: To improve the balance of the emotion distribution in a constructed emotion corpus, an active-learning-based algorithm is proposed that keeps the emotion labels in the newly constructed training set balanced. The algorithm integrates the informativeness, representativeness, diversity, and complementarity criteria: it screens samples layer by layer using each text's predicted emotion probabilities and feature statistics, and extracts candidate samples using the label-balancing measure in the complementarity criterion. This discourages the model from selecting samples with high-frequency emotion labels and promotes the selection of samples with low-frequency emotion labels, so that the emotion labels become balanced. Multi-label emotion classification experiments show that the algorithm effectively constructs a text training set with balanced emotion labels, and that the constructed training set progressively improves text emotion classification performance.
Authors: Shi Xuefeng (时雪峰); Kang Xin (康鑫); Liao Ping (廖萍); Ren Fuji (任福继) (School of Mechanical Engineering, Nantong University, Nantong 226019, Jiangsu, China; Faculty of Engineering, Tokushima University, Tokushima 770-8506, Tokushima-ken, Japan)
Source: Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2021, No. 7, pp. 265-270, 349 (7 pages)
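Funding: Priority Academic Program Development of Jiangsu Higher Education Institutions (Su Zheng Ban Fa [2018] No. 192); Jiangsu Provincial Key Research and Development Program (BE2018093); Japan Society for the Promotion of Science grants (19K20345, 19H04215).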
Keywords: multi-label emotion classification; active learning; label balancing
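
The label-balancing idea summarized in the abstract (suppress samples with high-frequency labels, promote samples with low-frequency labels) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the scoring formulas, so the entropy-based informativeness term, the rarity weight, the 0.5 threshold, and the function names `binary_entropy` and `select_balanced_batch` are all assumptions made for illustration. The representativeness and diversity criteria are omitted, and predicted labels stand in for the annotator's labels when counts are updated.

```python
import math


def binary_entropy(p):
    """Uncertainty of one label's predicted probability (0 when p is 0 or 1)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_balanced_batch(probs, label_counts, batch_size, threshold=0.5):
    """Greedily pick pool samples whose predicted labels are rare in the training set.

    probs: one list of per-label predicted probabilities per unlabeled sample.
    label_counts: how often each label already occurs in the training set.
    Returns the chosen pool indices and the updated (estimated) label counts.
    """
    counts = list(label_counts)
    remaining = set(range(len(probs)))
    chosen = []
    while remaining and len(chosen) < batch_size:
        total = sum(counts) + len(counts)  # add-one smoothing over all labels
        best, best_score = None, -1.0
        for i in sorted(remaining):
            # Assumed informativeness term: summed per-label entropy.
            info = sum(binary_entropy(p) for p in probs[i])
            predicted = [j for j, p in enumerate(probs[i]) if p >= threshold]
            if predicted:
                # Assumed balance term: close to 1 for rare labels, small for frequent ones.
                rarity = sum(1.0 - (counts[j] + 1) / total for j in predicted) / len(predicted)
            else:
                rarity = 0.0
            score = info * rarity
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
        remaining.remove(best)
        # Assumption: predicted labels approximate the labels an annotator would assign.
        for j, p in enumerate(probs[best]):
            if p >= threshold:
                counts[j] += 1
    return chosen, counts


# Hypothetical pool: label 0 is frequent in the training set, labels 1 and 2 are rare.
pool_probs = [
    [0.6, 0.1, 0.1],
    [0.1, 0.6, 0.1],
    [0.1, 0.1, 0.6],
    [0.6, 0.1, 0.1],
]
chosen, new_counts = select_balanced_batch(pool_probs, [50, 5, 5], batch_size=2)
print(chosen, new_counts)
```

In this toy pool all four samples are about equally uncertain, so the rarity weight decides the selection, and the two samples predicted to carry the rare labels are picked ahead of those predicted to carry the frequent one.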