Abstract
Fault diagnosis of traditional industrial processes has generally relied on industrial process data, i.e., numerical sensor data, and in recent years its diagnostic accuracy has reached a bottleneck. The emergence of video data offers a new direction for industrial process fault diagnosis. This study therefore proposes an industrial process fault diagnosis model based on Two-Stream Swinc Transformer video classification. In this method, to capture both the temporal and spatial features of video, a 3D convolution module is first added to the Swin Transformer Block of the Swin Transformer, yielding the Swinc Transformer deep learning model. Then, to further capture temporal features, the Swinc Transformer is used as the backbone network and a two-stream network is introduced, taking optical-flow images and RGB images as input. Finally, to better fuse optical-flow features with image features, a Cross Attention Mechanism (CAM) is introduced to adaptively weight the optical-flow and RGB image features. The method is validated on the PRONTO benchmark dataset. Experimental results show that the proposed Two-Stream Swinc Transformer achieves better classification performance than other video classification models, and that video data offers an accuracy advantage over ordinary industrial process data for fault diagnosis, with a classification accuracy of 95.26%.
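The paper does not give the CAM implementation here; as a rough illustration of the fusion step described above, one direction of a generic cross-attention (RGB tokens querying optical-flow keys/values) might look like the following NumPy sketch. The function names, the single-head form, and the omission of learned query/key/value projections are all simplifying assumptions, not the authors' code:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(rgb, flow):
    """One direction of cross-attention fusion.

    rgb:  (n_rgb, d)  feature tokens from the RGB stream (queries)
    flow: (n_flow, d) feature tokens from the optical-flow stream (keys/values)
    Returns (n_rgb, d): flow features re-weighted per RGB token.
    """
    d = rgb.shape[-1]
    scores = rgb @ flow.T / np.sqrt(d)   # (n_rgb, n_flow) similarity
    weights = softmax(scores, axis=-1)   # attention over flow tokens
    return weights @ flow                # adaptively weighted flow features
```

In a full two-stream model the symmetric direction (flow queries attending to RGB) would typically be computed as well, and the two outputs combined with the original stream features before the classification head.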
Source
Modeling and Simulation (《建模与仿真》)
2023, No. 2, pp. 777-785 (9 pages)