Abstract
A multi-scale context-guided feature elimination classification method was proposed to address the difficulty of accurately localizing discriminative features and the interference of complex scenes in the classification of ancient tower building images. First, a feature extraction network with MogaNet as the core was constructed, and multi-scale feature fusion was combined to fully exploit the image information. Next, a context information extractor was designed that uses the semantic context of the network to align and filter more discriminative local features, enhancing the network's ability to capture detailed features. Then, a feature elimination strategy was proposed to suppress ambiguous-class features and background noise, and a loss function was designed to constrain ambiguous-class feature elimination and classification prediction. Finally, a Chinese ancient tower architecture image dataset was established to provide data support for research on complex backgrounds and fuzzy boundaries in fine-grained image classification. Experimental results show that the proposed method achieved 96.3% accuracy on the self-constructed ancient tower architecture dataset, and 92.4%, 95.3% and 94.6% accuracy on three fine-grained datasets, CUB-200-2011, Stanford Cars and FGVC-Aircraft, respectively. The proposed method outperforms the other algorithms compared and enables accurate classification of ancient tower building images.
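The abstract mentions multi-scale feature fusion but does not state which fusion operator is used. As a rough illustration of the general idea only, the sketch below brings a coarse feature map to the resolution of a fine one by nearest-neighbour upsampling and averages the two element-wise. The function names `upsample_nearest` and `fuse_multiscale`, the averaging rule, and the toy values are all assumptions made for this example and are not taken from the paper.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map (list of rows) by an integer factor."""
    out = []
    for row in fmap:
        # Repeat every value `factor` times along the width...
        wide = [v for v in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times along the height.
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_multiscale(fmaps):
    """Resize all maps to the size of the first (finest) map and average them.

    Averaging is an assumed fusion rule chosen for simplicity; concatenation
    or learned weighting are equally plausible choices.
    """
    target_h, target_w = len(fmaps[0]), len(fmaps[0][0])
    resized = []
    for fmap in fmaps:
        factor, rem = divmod(target_h, len(fmap))
        if rem != 0 or factor < 1 or len(fmap[0]) * factor != target_w:
            raise ValueError("each map must divide the finest map's size evenly")
        resized.append(upsample_nearest(fmap, factor))
    n = len(resized)
    return [
        [sum(m[i][j] for m in resized) / n for j in range(target_w)]
        for i in range(target_h)
    ]


# Toy example: a 4x4 fine map fused with a 2x2 coarse map.
fine = [[1.0] * 4 for _ in range(4)]
coarse = [[10.0, 20.0], [30.0, 40.0]]
fused = fuse_multiscale([fine, coarse])
```

In this toy case every fused value is the mean of the fine value and the coarse value covering the same location, so the coarse map contributes context while the fine map keeps spatial detail.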
Authors
孟月波
王博
刘光辉
MENG Yuebo; WANG Bo; LIU Guanghui (College of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710300, China; Key Laboratory of Construction Robots for Higher Education in Shaanxi Province)
Source
《浙江大学学报(工学版)》
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2024, No. 12, pp. 2489-2499 (11 pages)
Journal of Zhejiang University: Engineering Science
Funding
National Natural Science Foundation of China (52278125)
Key Research and Development Program of Shaanxi Province (2021SF-429)
Keywords
image classification
contextual information
feature elimination
deep learning
feature fusion