
A Comprehensive Knowledge Distillation Method with Few Labeled Samples

Abstract: [Objective] This paper uses knowledge distillation to improve the performance of a small-parameter model guided by a high-performance large-parameter model when labeled samples are insufficient, addressing the scarcity of labeled samples and the high training cost of large, high-performance models in natural language processing. [Methods] First, we used noise purification to obtain valuable data from an unlabeled corpus, assigned pseudo labels to these data, and thereby increased the number of labeled samples. We then added a knowledge review mechanism and a teaching assistant model to the traditional distillation model to realize comprehensive knowledge transfer from the large-parameter model to the small-parameter model. [Results] On text classification and sentiment analysis tasks over the IMDB, AG_NEWS, and Yahoo!Answers datasets, with only 5% of the original data used as labeled data, the proposed model's accuracy was only 1.45%, 2.75%, and 7.28% lower, respectively, than a traditional distillation model trained on the full data. [Limitations] We only examined the proposed method on text classification and sentiment analysis tasks in natural language processing; task coverage should be expanded in future work. [Conclusions] The proposed method achieves a good distillation effect with few labeled samples and significantly improves the performance of the small-parameter model.
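The abstract names three ingredients: pseudo-labeling of unlabeled data via noise purification, a teaching assistant model, and a knowledge review mechanism layered on traditional distillation. The paper's exact formulations are not given here, so the sketch below only illustrates the two steps that are standard in this line of work: confidence-based pseudo-labeling (a stand-in for the noise-purification step) and temperature-scaled soft-label distillation in the style of Hinton et al. (2015), applied along a teacher-assistant-student chain. All function and parameter names are hypothetical; this is an orientation sketch in PyTorch, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def select_pseudo_labeled(model, unlabeled_x, threshold=0.9):
    # Score unlabeled examples with the current model and keep only those
    # predicted with high confidence, taking the prediction as a pseudo label.
    # (A hypothetical stand-in for the paper's noise-purification step.)
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=-1)
        confidence, pseudo_y = probs.max(dim=-1)
    keep = confidence >= threshold
    return unlabeled_x[keep], pseudo_y[keep]

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Temperature-scaled KL divergence between the softened teacher and student
    # distributions, mixed with cross-entropy on (possibly pseudo) hard labels.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps soft-loss gradients comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In a teaching-assistant setup of the kind the abstract describes, the mid-sized assistant would first be distilled from the large teacher with such a loss, and the small student then distilled from the assistant; the knowledge review mechanism additionally lets the student learn from intermediate-layer representations of the larger models, which this sketch omits.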
Authors: Liu Tong; Ren Xinru; Yin Jinhui; Ni Weijian (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), indexed by EI, CSCD, and the Peking University Core list; 2024, Issue 1, pp. 104-113 (10 pages).
Funding: Supported by the Shandong Provincial Natural Science Foundation (No. ZR2022MF319), the Shandong University of Science and Technology Top Young Teaching Talent Training Project (No. BJ20211110), and the Shandong University of Science and Technology Professional Degree Postgraduate Teaching Case Library Construction Project.
Keywords: Knowledge Distillation; Semi-Supervised Learning; Few Labeled Samples; Text Classification