为解决图像分类任务中模型结构固化、产生巨大内存消耗、时间消耗的问题,提出一种增量式深度神经网络(IDNN)。输入样本通过聚类算法激活不同簇并被分别处理:如果新样本激活已有簇,则更新该簇参数;否则为新簇开辟分支,并训练独立特征集。...为解决图像分类任务中模型结构固化、产生巨大内存消耗、时间消耗的问题,提出一种增量式深度神经网络(IDNN)。输入样本通过聚类算法激活不同簇并被分别处理:如果新样本激活已有簇,则更新该簇参数;否则为新簇开辟分支,并训练独立特征集。在Caltech-101、ORL Face、ETH-80数据库的验证结果表明,该系统能自动调整网络结构,适用于轮廓、纹理、视角等不同环境的增量式学习,例如在Caltech-101库分类任务中准确率超出VGGNet 5.08%、Alex Net 3.44%。展开更多
Deep Learning is a powerful technique that is widely applied to Image Recognition and Natural Language Processing tasks amongst many other tasks. In this work, we propose an efficient technique to utilize pre-trained ...Deep Learning is a powerful technique that is widely applied to Image Recognition and Natural Language Processing tasks amongst many other tasks. In this work, we propose an efficient technique to utilize pre-trained Convolutional Neural Network (CNN) architectures to extract powerful features from images for object recognition purposes. We have built on the existing concept of extending the learning from pre-trained CNNs to new databases through activations by proposing to consider multiple deep layers. We have exploited the progressive learning that happens at the various intermediate layers of the CNNs to construct Deep Multi-Layer (DM-L) based Feature Extraction vectors to achieve excellent object recognition performance. Two popular pre-trained CNN architecture models i.e. the VGG_16 and VGG_19 have been used in this work to extract the feature sets from 3 deep fully connected multiple layers namely “fc6”, “fc7” and “fc8” from inside the models for object recognition purposes. Using the Principal Component Analysis (PCA) technique, the Dimensionality of the DM-L feature vectors has been reduced to form powerful feature vectors that have been fed to an external Classifier Ensemble for classification instead of the Softmax based classification layers of the two original pre-trained CNN models. The proposed DM-L technique has been applied to the Benchmark Caltech-101 object recognition database. Conventional wisdom may suggest that feature extractions based on the deepest layer i.e. “fc8” compared to “fc6” will result in the best recognition performance but our results have proved it otherwise for the two considered models. Our experiments have revealed that for the two models under consideration, the “fc6” based feature vectors have achieved the best recognition performance. State-of-the-Art recognition performances of 91.17% and 91.35% have been achieved by utilizing the “fc6” based feature vectors for the VGG_16 and VGG_19 models respectively. The recognition performance has been achieved by considering 30 sample images per class whereas the proposed system is capable of achieving improved performance by considering all sample images per class. Our research shows that for feature extraction based on CNNs, multiple layers should be considered and then the best layer can be selected that maximizes the recognition performance.展开更多
文摘为解决图像分类任务中模型结构固化、产生巨大内存消耗、时间消耗的问题,提出一种增量式深度神经网络(IDNN)。输入样本通过聚类算法激活不同簇并被分别处理:如果新样本激活已有簇,则更新该簇参数;否则为新簇开辟分支,并训练独立特征集。在Caltech-101、ORL Face、ETH-80数据库的验证结果表明,该系统能自动调整网络结构,适用于轮廓、纹理、视角等不同环境的增量式学习,例如在Caltech-101库分类任务中准确率超出VGGNet 5.08%、Alex Net 3.44%。
文摘Deep Learning is a powerful technique that is widely applied to Image Recognition and Natural Language Processing tasks amongst many other tasks. In this work, we propose an efficient technique to utilize pre-trained Convolutional Neural Network (CNN) architectures to extract powerful features from images for object recognition purposes. We have built on the existing concept of extending the learning from pre-trained CNNs to new databases through activations by proposing to consider multiple deep layers. We have exploited the progressive learning that happens at the various intermediate layers of the CNNs to construct Deep Multi-Layer (DM-L) based Feature Extraction vectors to achieve excellent object recognition performance. Two popular pre-trained CNN architecture models i.e. the VGG_16 and VGG_19 have been used in this work to extract the feature sets from 3 deep fully connected multiple layers namely “fc6”, “fc7” and “fc8” from inside the models for object recognition purposes. Using the Principal Component Analysis (PCA) technique, the Dimensionality of the DM-L feature vectors has been reduced to form powerful feature vectors that have been fed to an external Classifier Ensemble for classification instead of the Softmax based classification layers of the two original pre-trained CNN models. The proposed DM-L technique has been applied to the Benchmark Caltech-101 object recognition database. Conventional wisdom may suggest that feature extractions based on the deepest layer i.e. “fc8” compared to “fc6” will result in the best recognition performance but our results have proved it otherwise for the two considered models. Our experiments have revealed that for the two models under consideration, the “fc6” based feature vectors have achieved the best recognition performance. State-of-the-Art recognition performances of 91.17% and 91.35% have been achieved by utilizing the “fc6” based feature vectors for the VGG_16 and VGG_19 models respectively. The recognition performance has been achieved by considering 30 sample images per class whereas the proposed system is capable of achieving improved performance by considering all sample images per class. Our research shows that for feature extraction based on CNNs, multiple layers should be considered and then the best layer can be selected that maximizes the recognition performance.