Architectural Style Classification Algorithm Fusing CNN and Transformer
Abstract: Accurate classification of architectural styles is important for the study of architectural culture and the history of human civilization. Models based on convolutional neural networks (CNNs) have achieved good results in architectural style classification thanks to their strong feature extraction ability. However, most current CNN models extract only the local features of buildings, whereas Transformer-based models, through the attention mechanism, can extract global features. To improve the accuracy of architectural style classification, a method fusing CNN and Transformer is proposed, whose core component is the CT-Block structure. The structure is split along the channel dimension into a CNN branch and a Transformer branch; features pass through the two branches separately and are then concatenated. This structure not only fuses the local features extracted by the CNN with the global features extracted by the Transformer, but also alleviates the growth in model size and parameter count that a two-branch structure brings. On the Architectural Style Dataset and the WikiChurches dataset, the algorithm achieves accuracies of 79.83% and 68.41%, respectively, outperforming other models in the field of architectural style classification.
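The abstract's claim that splitting along the channel dimension limits parameter growth can be illustrated with a rough parameter count. The sketch below is not the paper's CT-Block: it assumes, purely for illustration, that the CNN branch is a single 3x3 convolution, that the Transformer branch is one self-attention layer counted by its Q, K, V and output projections, and that the channels are split evenly. None of these details are given in the abstract, and all function names are hypothetical.

```python
def conv3x3_params(c_in, c_out):
    """Weights plus biases of one 3x3 convolution (assumed CNN branch)."""
    return 3 * 3 * c_in * c_out + c_out


def attention_params(dim):
    """Q, K, V and output projections, weights plus biases
    (assumed Transformer branch; no MLP or normalization counted)."""
    return 4 * (dim * dim + dim)


def dual_branch_params(channels, split):
    """Parameter count of a two-branch block.

    split=False: each branch processes all channels (naive dual branch).
    split=True:  channels are halved (an assumed even split), one half
                 per branch, and the outputs are concatenated.
    """
    c = channels // 2 if split else channels
    return conv3x3_params(c, c) + attention_params(c)


full = dual_branch_params(64, split=False)
halved = dual_branch_params(64, split=True)
print(full, halved)  # 53568 13472
```

Under these assumptions, both branch costs scale roughly with the square of the channel count, so giving each branch half the channels cuts the block to about a quarter of the naive two-branch parameter count while the concatenated output keeps the original width.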
Authors: LIU Dong (刘东); ZHANG Rongfu (张荣福); QIN Junxiang (秦俊祥); GONG Junzhe (龚俊哲); CAO Zhibin (曹志彬) (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China)
Source: Optical Instruments (《光学仪器》), 2024, No. 5, pp. 1-8 (8 pages)
Funding: National Key Research and Development Program of China, key special project "Basic Research Conditions and Major Scientific Instruments and Equipment Development" (2022YFF0706003).
Keywords: architectural style classification; convolutional neural network; Transformer model; network fusion; attention mechanism