Abstract
Human protein image classification aims to identify the localization labels of proteins among cellular organelles, such as nucleoplasm and nuclear membrane. To address the large size of protein classification data sets, the imbalance among multi-label categories, and the small differences between classes, this paper proposes a human protein image classification method that combines CSPPNet with ensemble learning. The method constructs a CSPPNet model that combines coarse-grained and fine-grained identification: the feature maps generated by the first few convolutional layers are passed through a spatial pyramid pooling layer and combined with the feature maps generated by the later convolutional layers. Both the global and local features of an image are thus used to automatically detect differences between images, which improves the precision of fine-grained image classification, and ensemble learning is then used to further improve accuracy. Experimental results show that the model improves both accuracy and F1 score compared with classic convolutional neural networks (CNN).
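As a rough illustration of the spatial pyramid pooling step mentioned in the abstract, the sketch below max-pools a 2-D feature map over several grid sizes and concatenates the result with a later-layer feature vector. It is not the authors' CSPPNet: the function names `spp_max_pool` and `combine_features`, the pyramid levels (1, 2, 4), and the choice of max pooling are all assumptions made for illustration.

```python
import math


def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2-D feature map into a fixed-length vector.

    feature_map: list of rows (H x W) of numbers, with H and W >= max(levels).
    levels: hypothetical pyramid grid sizes; the paper's values are not given here.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin edges use floor/ceil so the n x n bins cover the whole map.
            top = (i * height) // n
            bottom = math.ceil((i + 1) * height / n)
            for j in range(n):
                left = (j * width) // n
                right = math.ceil((j + 1) * width / n)
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(top, bottom)
                    for c in range(left, right)
                ))
    return pooled


def combine_features(early_maps, late_vector, levels=(1, 2, 4)):
    """Concatenate SPP-pooled early-layer maps with a late-layer feature vector."""
    combined = []
    for fmap in early_maps:
        combined.extend(spp_max_pool(fmap, levels))
    combined.extend(late_vector)
    return combined
```

With levels (1, 2, 4), each feature map yields 1 + 4 + 16 = 21 values regardless of its height and width, which is what allows early-layer maps of one size to be joined with later-layer features of another.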
Authors
李培媛
黄迟
LI Peiyuan; HUANG Chi (College of Mathematics, Taiyuan University of Technology, Taiyuan 030024, China; School of Information and Engineering, Southwestern University of Finance and Economics, Chengdu 611130, China)
Source
《计算机工程》 (Computer Engineering)
CAS
CSCD
PKU Core (北大核心)
2020, No. 8, pp. 235-242 (8 pages)
Funding
National Natural Science Foundation of China (61603268).
Keywords
protein
subcellular localization
image classification
spatial pyramid pooling (SPP)
fine-grained identification
ensemble learning