Abstract
Most state-of-the-art text detection methods for natural scenes are based on fully convolutional semantic segmentation networks, which can effectively detect text of arbitrary shape from pixel-level classification results. Their main drawbacks, namely large model size, slow inference, and high memory footprint, hinder deployment in practical applications. This paper proposes self-distillation via entropy transfer (SDET), which takes the information entropy of the segmentation map (SM) output by the deep layers of the text detection network as the knowledge to be transferred, and feeds it back to the shallow layers through an auxiliary network. Unlike traditional knowledge distillation (KD), which relies on a teacher network, SDET adds the auxiliary network only during training, realizing self-distillation (SD) without a teacher network at a small extra training cost. Experiments on multiple standard natural scene text detection datasets show that SDET significantly improves the recall and F1 score of baseline text detection networks and outperforms other distillation methods.
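The abstract does not include code; the following is a minimal PyTorch sketch of the entropy-transfer idea it describes, assuming a binary text/background segmentation head with sigmoid outputs. The names (pixel_entropy, AuxiliaryHead, entropy_transfer_loss) and the MSE alignment loss are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of entropy-based self-distillation for a binary
# text-segmentation network. Not the authors' released code; the loss
# form and all module/variable names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def pixel_entropy(prob: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Per-pixel Shannon entropy of a binary segmentation map.

    prob: (N, 1, H, W) foreground probabilities in (0, 1).
    Returns an (N, 1, H, W) entropy map.
    """
    p = prob.clamp(eps, 1.0 - eps)
    return -(p * p.log() + (1.0 - p) * (1.0 - p).log())


class AuxiliaryHead(nn.Module):
    """Lightweight head mapping shallow features to a segmentation map.

    Used only during training and discarded at inference, so the
    deployed detector pays no extra cost.
    """

    def __init__(self, in_channels: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.proj(feat))


def entropy_transfer_loss(shallow_prob: torch.Tensor,
                          deep_prob: torch.Tensor) -> torch.Tensor:
    """Align the entropy map of the shallow (auxiliary) prediction with
    that of the deep prediction, which serves as the distilled knowledge."""
    deep_ent = pixel_entropy(deep_prob).detach()  # knowledge signal, no grad
    shallow_up = F.interpolate(shallow_prob, size=deep_prob.shape[-2:],
                               mode="bilinear", align_corners=False)
    return F.mse_loss(pixel_entropy(shallow_up), deep_ent)
```

In a training loop, this term would be added to the detector's ordinary segmentation loss with a weighting factor; dropping the auxiliary head at inference is consistent with the abstract's claim of no deployment overhead.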
Authors
CHEN Jian-Wei, YANG Fan, LAI Yong-Xuan (School of Aerospace Engineering, Xiamen University, Xiamen 361005; Shenzhen Research Institute, Xiamen University, Shenzhen 518057; School of Informatics, Xiamen University, Xiamen 361005)
Source
Acta Automatica Sinica (《自动化学报》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2024, No. 11, pp. 2128-2139 (12 pages)
Funding
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0112600)
General Program of the National Natural Science Foundation of China (62173282, 61872154)
Natural Science Foundation of Guangdong Province (2021A1515011578)
Shenzhen Fundamental Research Program General Project (JCYJ20190809161603551)
Keywords
natural scene
text detection
knowledge distillation (KD)
self-distillation (SD)
information entropy