Funding: the China Postdoctoral Science Foundation under Grant 2021M701838; the Natural Science Foundation of Hainan Province of China under Grants 621MS042 and 622MS067; the Hainan Medical University Teaching Achievement Award Cultivation under Grant HYjcpx202209.
Abstract: Watermarks can provide reliable and secure copyright protection for optical coherence tomography (OCT) fundus images, and effective image segmentation helps promote OCT image watermarking. However, OCT images contain a large amount of low-quality data, which seriously degrades the performance of segmentation methods. Therefore, this paper proposes an effective segmentation method for OCT fundus image watermarking using a rough convolutional neural network (RCNN). First, a rough-set-based feature discretization module is designed to preprocess the input data. Second, a dual attention mechanism over feature channels and spatial regions is added to the CNN so that the model can adaptively select important information for fusion. Finally, a refinement module that strengthens the extraction of multi-scale information is added to improve edge accuracy in segmentation. RCNN is compared with CE-Net and MultiResUNet on 83 gold-standard 3D retinal OCT data samples. The average Dice similarity coefficient (DSC) obtained by RCNN is 6% higher than that of CE-Net, and the average 95% Hausdorff distance (95HD) and average symmetric surface distance (ASD) obtained by RCNN are 32.4% and 33.3% lower than those of MultiResUNet, respectively. We also evaluate the effect of feature discretization, analyze the initial learning rate of RCNN, and conduct ablation experiments with four different models. The experimental results indicate that our method improves the segmentation accuracy of OCT fundus images, providing strong support for its application in medical image watermarking.
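As a rough illustration of the dual channel-and-spatial attention idea described in this abstract, the CBAM-style PyTorch block below reweights feature channels and then spatial locations; the module layout, reduction ratio, and kernel size are illustrative assumptions rather than the paper's RCNN code.

```python
# A minimal sketch of a dual (channel + spatial) attention block, assuming a
# CBAM-style design; sizes and names are illustrative, not the paper's RCNN code.
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, then weight each feature channel.
        self.channel_attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: compress channels to two maps, weight each location.
        self.spatial_attn = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_attn(x)              # reweight channels
        avg_map = x.mean(dim=1, keepdim=True)     # spatial descriptor (mean over channels)
        max_map = x.amax(dim=1, keepdim=True)     # spatial descriptor (max over channels)
        x = x * self.spatial_attn(torch.cat([avg_map, max_map], dim=1))
        return x

# Usage: attn = DualAttention(64); y = attn(torch.randn(1, 64, 128, 128))
```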
Abstract: Detection of floating garbage in inland rivers is crucial for water environmental protection, as it effectively reduces ecological damage and ensures the safety of water resources. To address the inefficiency of traditional cleanup methods and the challenges in detecting small targets, an improved YOLOv5 object detection model was proposed in this study. To enhance the model's sensitivity to small targets and mitigate the impact of redundant information on detection performance, a bi-level routing attention mechanism was introduced and embedded into the backbone network. Additionally, a multi-scale detection head was incorporated into the model, allowing more comprehensive coverage of floating garbage of various sizes through multi-scale feature extraction and detection. The Focal-EIoU loss function was also employed to optimize the model parameters, improving localization accuracy. Experimental results on the publicly available FloW_Img dataset demonstrated that the improved YOLOv5 model outperforms the original YOLOv5 model in terms of precision and recall, achieving a mean average precision (mAP) of 86.12%, with significant improvements and faster convergence.
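The Focal-EIoU loss mentioned above combines the EIoU penalty terms (IoU, center distance, and width/height differences normalized by the enclosing box) with a focal reweighting by IoU. The sketch below follows the published Focal-EIoU formulation for axis-aligned (x1, y1, x2, y2) boxes; it is not the study's training code, and gamma is an assumed hyperparameter.

```python
# A hedged sketch of a Focal-EIoU loss for axis-aligned boxes in (x1, y1, x2, y2)
# format; this follows the published formulation, not the study's exact code.
import torch

def focal_eiou_loss(pred: torch.Tensor, target: torch.Tensor,
                    gamma: float = 0.5, eps: float = 1e-7) -> torch.Tensor:
    # Intersection and union areas
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Smallest enclosing box, used to normalize the distance penalties
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])

    # Center, width, and height distance terms of EIoU
    pcx, pcy = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    tcx, tcy = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    rho_center = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    rho_w = ((pred[:, 2] - pred[:, 0]) - (target[:, 2] - target[:, 0])) ** 2
    rho_h = ((pred[:, 3] - pred[:, 1]) - (target[:, 3] - target[:, 1])) ** 2

    eiou = (1 - iou
            + rho_center / (cw ** 2 + ch ** 2 + eps)
            + rho_w / (cw ** 2 + eps)
            + rho_h / (ch ** 2 + eps))
    # Focal reweighting: scale each box's EIoU term by IoU**gamma,
    # which downweights low-quality (low-IoU) boxes.
    return (iou.detach() ** gamma * eiou).mean()
```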
Funding: This study was supported, in part, by the National Natural Science Foundation of China under Grant 62272236 and, in part, by the Natural Science Foundation of Jiangsu Province under Grants BK20201136 and BK20191401.
Abstract: The image emotion classification task aims to use a model to automatically predict people's emotional response when they see an image. Studies have shown that certain local regions are more likely to inspire an emotional response than the whole image. However, existing methods perform poorly in predicting the details of emotional regions and are prone to overfitting during training due to the small size of the dataset. Therefore, this study proposes an image emotion classification network based on multilayer attentional interaction and adaptive feature aggregation. To perform more accurate emotional region prediction, this study designs a multilayer attentional interaction module. The module calculates spatial attention maps for higher-layer semantic features and fusion features through a multilayer shuffle attention module. Through layer-by-layer up-sampling and gating operations, the higher-layer features guide the lower-layer features to learn, eventually achieving sentiment region prediction at the optimal scale. To complement the important information lost by layer-by-layer fusion, this study not only adds intra-layer fusion to the multilayer attention interaction module but also designs an adaptive feature aggregation module. The module uses global average pooling to compress spatial information and connect channel information from all layers. Then, the module adaptively generates a set of aggregation weights through two fully connected layers to augment the original features of each layer. Eventually, the semantics and details of the different layers are aggregated through gating operations and residual connectivity to complement the lost information. To reduce overfitting on small datasets, the network is pre-trained on the FI dataset, and the weights are further fine-tuned on the small dataset. Experimental results on the FI, Twitter I, and Emotion ROI (Region of Interest) datasets show that the proposed network exceeds existing image emotion classification methods, with accuracies of 90.27%, 84.66%, and 84.96%, respectively.
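To make the adaptive feature aggregation step concrete (global average pooling per layer, concatenated channel descriptors, two fully connected layers producing aggregation weights, and residual augmentation of each layer), a minimal PyTorch sketch might look as follows; the layer channel counts and hidden size are assumptions, not the paper's implementation.

```python
# A minimal sketch of adaptive feature aggregation across layers; sizes are
# illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class AdaptiveFeatureAggregation(nn.Module):
    def __init__(self, channels_per_layer, hidden: int = 256):
        super().__init__()
        self.channels_per_layer = list(channels_per_layer)
        total = sum(self.channels_per_layer)
        # Two fully connected layers generate one weight per concatenated channel.
        self.fc = nn.Sequential(
            nn.Linear(total, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, total),
            nn.Sigmoid(),
        )

    def forward(self, features):
        # features: list of tensors [B, C_i, H_i, W_i], one per layer
        descriptors = [f.mean(dim=(2, 3)) for f in features]   # GAP per layer
        weights = self.fc(torch.cat(descriptors, dim=1))       # aggregation weights
        out, start = [], 0
        for f, c in zip(features, self.channels_per_layer):
            w = weights[:, start:start + c].unsqueeze(-1).unsqueeze(-1)
            out.append(f + f * w)                              # augment original features
            start += c
        return out

# Usage: afa = AdaptiveFeatureAggregation([256, 512, 1024])
#        outs = afa([torch.randn(2, 256, 56, 56),
#                    torch.randn(2, 512, 28, 28),
#                    torch.randn(2, 1024, 14, 14)])
```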
Funding: This work was funded by the National Natural Science Foundation of China (Grant No. U2033206); the Project of the Civil Aircraft Fire Science and Safety Engineering Key Laboratory of Sichuan Province (Grant Nos. MZ2022KF05 and MZ2022JB01); the Project of the Key Laboratory of Civil Aviation Emergency Science & Technology, CAAC (Grant Nos. NJ2022022 and NJ2023025); the Postgraduate Project of the Civil Aviation Flight University of China (Grant No. X2023-1); the Undergraduate Innovation and Entrepreneurship Training Program (Grant No. 202210624024); and the General Programs of the Civil Aviation Flight University of China (Grant No. J2020-072).
Abstract: Early and accurate detection of aircraft cargo compartment fires is of great significance for ensuring flight safety. Current airborne fire detection technology mostly relies on single-parameter smoke detection using infrared light, which often results in a high false alarm rate in complex air transportation environments, and traditional deep learning models struggle to effectively address the long-term dependencies in multivariate fire information. This paper proposes a multi-technology collaborative fire detection method based on an improved Transformer model. Dual-wavelength optical sensors, flue gas analyzers, and other equipment are used to carry out multi-technology collaborative detection and characterize multiple feature dimensions of fire, improving detection accuracy. The improved Transformer model, which integrates a self-attention mechanism and position encoding, is applied to long time-series modeling of fire information from a global perspective, effectively avoiding the gradient vanishing and gradient explosion problems of traditional RNNs (recurrent neural networks) and CNNs (convolutional neural networks). Two different multi-head self-attention mechanisms are used to model and classify the multivariate fire information, respectively, which resolves the confusion between time-series modeling and classification modeling that arises when a single attention mechanism handles multivariate classification tasks. Finally, the outputs of the two models are fused through a gate mechanism. The results show that, compared with traditional single-feature detection technology, the multi-technology collaborative fire detection method captures fire information better. Compared with traditional deep learning models, the multivariate fire prediction model constructed with the improved Transformer detects fires better, achieving an accuracy of 0.995.
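A minimal sketch of the two-branch design described above: one multi-head self-attention branch for time-series modeling, one for classification, with positional encoding added to the input and the two pooled outputs fused through a learned gate. Layer sizes, the maximum sequence length, and the pooling step are assumptions, not the paper's model.

```python
# A hedged sketch of two self-attention branches fused by a gate mechanism;
# hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class GatedDualAttentionClassifier(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4, n_classes: int = 2):
        super().__init__()
        # Learned positional encoding (assumes sequences of at most 512 steps).
        self.pos = nn.Parameter(torch.zeros(1, 512, d_model))
        self.temporal = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classif = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, time, d_model] multivariate fire-sensor features
        x = x + self.pos[:, : x.size(1)]
        t, _ = self.temporal(x, x, x)            # long-range time-series branch
        c, _ = self.classif(x, x, x)             # classification-oriented branch
        t, c = t.mean(dim=1), c.mean(dim=1)      # pool over the time dimension
        g = torch.sigmoid(self.gate(torch.cat([t, c], dim=1)))  # gate in (0, 1)
        fused = g * t + (1 - g) * c              # gated fusion of both branches
        return self.head(fused)

# Usage: model = GatedDualAttentionClassifier()
#        logits = model(torch.randn(8, 120, 64))
```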
Funding: Funded by the Innovative Research Program of the International Research Center of Big Data for Sustainable Development Goals (Grant No. CBAS2022IRP04); the Sichuan Natural Resources Department Project (Grant No. 510201202076888); the Project of the Geological Exploration Management Department of the Ministry of Natural Resources (Grant No. 073320180876/2); the Key Research and Development Program of Guangxi (Guike-AB22035060); the National Natural Science Foundation of China (Grant No. 42171291); and the Chengdu University of Technology Postgraduate Innovative Cultivation Program: Tunnel Geothermal Disaster Susceptibility Evaluation in Sichuan-Tibet Railway Based on Deep Learning (CDUT2022BJCX015).
Abstract: Remote sensing and deep learning are being widely combined in tasks such as urban planning and disaster prevention. However, due to interference caused by density, overlap, and coverage, tiny object detection in remote sensing images has always been a difficult problem. Therefore, we propose a novel TO–YOLOX (Tiny Object–You Only Look Once) model. TO–YOLOX possesses a MiSo (Multiple-in-Single-out) feature fusion structure with a spatial-shift design; the model balances positive and negative samples and enhances the information interaction among local patches of remote sensing images. TO–YOLOX utilizes an adaptive IOU-T (Intersection Over Union-Tiny) loss to enhance the localization accuracy of tiny objects, and it applies the Group-CBAM (group convolutional block attention module) attention mechanism to enhance the perception of tiny objects in remote sensing images. To verify the effectiveness and efficiency of TO–YOLOX, we evaluated it on three aerial-photography tiny object detection datasets, namely VisDrone2021, Tiny Person, and DOTA–HBB, obtaining mean average precision (mAP) values of 45.31% (+10.03%), 28.9% (+9.36%), and 63.02% (+9.62%), respectively. TO–YOLOX exhibits a stronger ability to recognize tiny objects than Faster R-CNN, RetinaNet, YOLOv5, YOLOv6, YOLOv7, and YOLOX, and the proposed model is computationally fast.
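The abstract states that the MiSo feature fusion structure exhibits a spatial shift. A common form of spatial shift, as used in shift-based fusion designs, moves channel groups one pixel in four directions so that each location mixes information from its neighbors without extra parameters. The sketch below is an illustrative assumption of that operation, not the TO–YOLOX implementation.

```python
# A minimal sketch of a four-direction spatial-shift operation; an assumption
# about the spatial-shift structure, not the TO-YOLOX source code.
import torch

def spatial_shift(x: torch.Tensor) -> torch.Tensor:
    # x: [batch, channels, height, width]; assumes channels is divisible by 4,
    # with each quarter of the channels shifted one pixel in one direction.
    b, c, h, w = x.shape
    q = c // 4
    out = x.clone()
    out[:, 0 * q:1 * q, :, 1:] = x[:, 0 * q:1 * q, :, :-1]   # shift right
    out[:, 1 * q:2 * q, :, :-1] = x[:, 1 * q:2 * q, :, 1:]   # shift left
    out[:, 2 * q:3 * q, 1:, :] = x[:, 2 * q:3 * q, :-1, :]   # shift down
    out[:, 3 * q:4 * q, :-1, :] = x[:, 3 * q:4 * q, 1:, :]   # shift up
    return out

# Usage: y = spatial_shift(torch.randn(1, 64, 80, 80))
```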