6 articles found
Fine-grained Ship Image Recognition Based on BCNN with Inception and AM-Softmax
1
Authors: Zhilin Zhang, Ting Zhang, Zhaoying Liu, Peijie Zhang, Shanshan Tu, Yujian Li, Muhammad Waqas. Computers, Materials & Continua (SCIE, EI), 2022, No. 10, pp. 1527-1539 (13 pages)
The fine-grained ship image recognition task aims to identify various classes of ships. However, small inter-class differences and large intra-class differences between ships, together with a lack of training samples, make the task difficult. Therefore, to enhance the accuracy of fine-grained ship image recognition, we design a fine-grained ship image recognition network based on a bilinear convolutional neural network (BCNN) with Inception and additive margin Softmax (AM-Softmax). This network improves the BCNN in two aspects. First, introducing Inception branches into the BCNN network enhances its ability to extract comprehensive features from ships. Second, by adding margin values to the decision boundary, the AM-Softmax function can better extend the inter-class differences and reduce the intra-class differences. In addition, as there are few publicly available datasets for fine-grained ship image recognition, we construct a Ship-43 dataset containing 47,300 ship images belonging to 43 categories. Experimental results on the constructed Ship-43 dataset demonstrate that our method can effectively improve the accuracy of ship image recognition, reaching an accuracy 4.08% higher than that of the BCNN model. Moreover, comparison results on the other three public fine-grained datasets (Cub, Cars, and Aircraft) further validate the effectiveness of the proposed method.
Keywords: fine-grained ship image recognition; Inception; AM-Softmax; BCNN
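For readers unfamiliar with the additive margin idea mentioned in this abstract, the following is a minimal sketch of an AM-Softmax loss for a single sample, written with the Python standard library only. It is not the authors' implementation: the function names, the scale of 30.0, and the margin of 0.35 are illustrative assumptions, and the Inception-augmented BCNN backbone that would produce the feature vector is omitted.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosines."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def am_softmax_loss(feature, class_weights, target, scale=30.0, margin=0.35):
    """Cross-entropy over cosine logits with a margin on the target class.

    scale and margin are assumed example values, not taken from the paper.
    """
    unit_feature = l2_normalize(feature)
    cosines = [
        sum(a * b for a, b in zip(unit_feature, l2_normalize(weights)))
        for weights in class_weights
    ]
    # Subtract the margin only from the cosine of the true class.
    logits = [
        scale * (cosine - margin) if index == target else scale * cosine
        for index, cosine in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0]]
    print(am_softmax_loss([1.0, 1.0], weights, target=0, margin=0.0))
    print(am_softmax_loss([1.0, 1.0], weights, target=0))
```

With the margin set to zero the function reduces to an ordinary softmax loss over scaled cosines. A positive margin raises the loss for the same input, so the feature has to sit closer to its own class direction before the sample counts as well classified.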
Attention Guided Food Recognition via Multi-Stage Local Feature Fusion
2
Authors: Gonghui Deng, Dunzhi Wu, Weizhen Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 1985-2003 (19 pages)
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model’s capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
Keywords: fine-grained image recognition; food image recognition; attention mechanism; local feature fusion
Task-specific Part Discovery for Fine-grained Few-shot Classification
3
Authors: Yongxian Wei, Xiu-Shen Wei. Machine Intelligence Research (EI, CSCD), 2024, No. 5, pp. 954-965 (12 pages)
Localizing discriminative object parts (e.g., bird head) is crucial for fine-grained classification tasks, especially for the more challenging fine-grained few-shot scenario. Previous work always relies on the learned object parts in a unified manner, where they attend to the same object parts (even with common attention weights) for different few-shot episodic tasks. In this paper, we propose that it should adaptively capture the task-specific object parts that require attention for each few-shot task, since the parts that can distinguish different tasks are naturally different. Specifically, for a few-shot task, after obtaining part-level deep features, we learn a task-specific part-based dictionary for both aligning and reweighting part features in an episode. Then, part-level categorical prototypes are generated based on the part features of support data, which are later employed by calculating distances to classify query data for evaluation. To retain the discriminative ability of the part-level representations (i.e., part features and part prototypes), we design an optimal transport solution that also utilizes query data in a transductive way to optimize the aforementioned distance calculation for the final predictions. Extensive experiments on five fine-grained benchmarks show the superiority of our method, especially for the 1-shot setting, gaining 0.12%, 8.56%, and 5.87% improvements over state-of-the-art methods on CUB, Stanford Dogs, and Stanford Cars, respectively.
Keywords: fine-grained image recognition; few-shot learning; transductive learning; visual dictionary; part feature discovery
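The prototype-and-distance step described in this abstract can be illustrated with a deliberately simplified sketch. It assumes whole-image feature vectors, class-mean prototypes, and squared Euclidean distance. The paper's task-specific part dictionary, part-level prototypes, and optimal transport refinement are not reproduced here, and all function names are hypothetical.

```python
def class_prototypes(support_features, support_labels):
    """Average the support features of each class into one prototype."""
    sums, counts = {}, {}
    for feature, label in zip(support_features, support_labels):
        accumulator = sums.setdefault(label, [0.0] * len(feature))
        for index, value in enumerate(feature):
            accumulator[index] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [value / counts[label] for value in accumulator]
        for label, accumulator in sums.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query feature."""
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


if __name__ == "__main__":
    # A toy 2-way 2-shot episode with 2-D features (made-up numbers).
    support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
    labels = ["a", "a", "b", "b"]
    prototypes = class_prototypes(support, labels)
    print(classify([1.0, 1.0], prototypes))
```

In a part-based variant, the same averaging and distance computation would presumably be applied per part feature rather than per image, with the distances combined before the final decision.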
Multi-granularity sequence generation for hierarchical image classification
4
Authors: Xinda Liu, Lili Wang. Computational Visual Media (SCIE, EI, CSCD), 2024, No. 2, pp. 243-260 (18 pages)
Hierarchical multi-granularity image classification is a challenging task that aims to tag each given image with multiple granularity labels simultaneously. Existing methods tend to overlook that different image regions contribute differently to label prediction at different granularities, and also insufficiently consider relationships between the hierarchical multi-granularity labels. We introduce a sequence-to-sequence mechanism to overcome these two problems and propose a multi-granularity sequence generation (MGSG) approach for the hierarchical multi-granularity image classification task. Specifically, we introduce a transformer architecture to encode the image into visual representation sequences. Next, we traverse the taxonomic tree, organize the multi-granularity labels into sequences, vectorize them, and add positional information. The proposed multi-granularity sequence generation method builds a decoder that takes visual representation sequences and semantic label embedding as inputs, and outputs the predicted multi-granularity label sequence. The decoder models dependencies and correlations between multi-granularity labels through a masked multi-head self-attention mechanism, and relates visual information to the semantic label information through a cross-modality attention mechanism. In this way, the proposed method preserves the relationships between labels at different granularity levels and takes into account the influence of different image regions on labels with different granularities. Evaluations on six public benchmarks qualitatively and quantitatively demonstrate the advantages of the proposed method. Our project is available at https://github.com/liuxindazz/mgs.
Keywords: hierarchical multi-granularity classification; vision and text transformer; sequence generation; fine-grained image recognition; cross-modality attenti
Mobile phone recognition method based on bilinear convolutional neural network (cited by: 3)
5
Authors: HAN HongGui, ZHEN Qi, YANG HongYan, DU YongPing, QIAO JunFei. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2021, No. 11, pp. 2477-2484 (8 pages)
Model recognition of second-hand mobile phones has been considered an essential process to improve the efficiency of phone recycling. However, due to the diversity of mobile phone appearances, it is difficult to realize accurate recognition. To solve this problem, a mobile phone recognition method based on a bilinear convolutional neural network (B-CNN) is proposed in this paper. First, a feature extraction model, based on B-CNN, is designed to adaptively extract local features from the images of second-hand mobile phones. Second, a joint loss function, constructed by center distance and softmax, is developed to reduce the interclass feature distance during the training process. Third, a parameter downscaling method, derived from the kernel discriminant analysis algorithm, is introduced to eliminate redundant features in B-CNN. Finally, the experimental results demonstrate that the B-CNN method can achieve higher accuracy than some existing methods.
Keywords: bilinear convolutional neural network; low-rank decomposition; joint loss; fine-grained image recognition
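As background for the bilinear CNN named in this entry, here is a minimal sketch of generic bilinear pooling: the outer product of two feature streams is averaged over spatial locations, then passed through a signed square root and L2 normalization. It is an illustration under assumptions, not the method of this paper: the function name and toy inputs are hypothetical, the two streams would normally come from convolutional backbones, and the joint loss and parameter downscaling steps from the abstract are not shown.

```python
import math


def bilinear_pool(features_a, features_b):
    """Pool two feature streams into one normalized bilinear descriptor.

    features_a and features_b hold one feature vector per spatial location
    and must cover the same locations in the same order.
    """
    n_locations = len(features_a)
    dim_a, dim_b = len(features_a[0]), len(features_b[0])
    pooled = [0.0] * (dim_a * dim_b)
    # Average the outer product of the two streams over all locations.
    for vector_a, vector_b in zip(features_a, features_b):
        for i, a in enumerate(vector_a):
            for j, b in enumerate(vector_b):
                pooled[i * dim_b + j] += a * b / n_locations
    # Signed square root followed by L2 normalization.
    rooted = [math.copysign(math.sqrt(abs(x)), x) for x in pooled]
    norm = math.sqrt(sum(x * x for x in rooted)) or 1.0
    return [x / norm for x in rooted]


if __name__ == "__main__":
    stream_a = [[1.0, 0.0], [1.0, 0.0]]
    stream_b = [[0.0, 2.0], [0.0, 2.0]]
    print(bilinear_pool(stream_a, stream_b))
```

The descriptor length is the product of the two channel counts, which grows quickly for real networks and is one reason dimensionality reduction is commonly paired with bilinear models.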
Ensemble relation network with multi-level measure
6
Authors: Li Xiaoxu, Qu Xue, Cao Jie. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2022, No. 3, pp. 15-24, 33 (11 pages)
Fine-grained few-shot learning is a difficult task in image classification. The reason is that the discriminative features of fine-grained images are often located in local areas of the image, while most of the existing few-shot learning image classification methods only use top-level features and adopt a single measure. As a result, the local features of the sample cannot be learned well. In response to this problem, an ensemble relation network with multi-level measure (ERN-MM) is proposed in this paper. It adds relation modules in the shallow feature space to compare the similarity between the samples in the local features, and finally integrates the similarity scores from the feature spaces to assign the label of the query samples. The proposed ERN-MM method can therefore use local details and global information at different granularities. Experimental results on different fine-grained datasets show that the proposed method achieves good classification performance and also prove its rationality.
Keywords: fine-grained image classification; few-shot learning; local feature learning; metric learning