Funding: Supported by the National Natural Science Foundation of China under grant no. 62272242.
Abstract: Gestures are one of the most natural and intuitive approaches to human-computer interaction. Compared with traditional camera-based or wearable-sensor-based solutions, gesture recognition using millimeter-wave radar has attracted growing attention because it is contact-free, privacy-preserving, and less dependent on the environment. Although there have been many recent studies on hand gesture recognition, existing methods still fall short in recognition accuracy and generalization ability in short-range applications. In this paper, we present a hand gesture recognition method named multiscale feature fusion (MSFF) to accurately identify micro hand gestures. MSFF takes into account not only the overall motion of the palm but also the subtle movements of the fingers. Specifically, we fuse multi-angle Doppler-time maps of the hand gesture with range-angle maps of the gesture trajectory to comprehensively extract hand gesture features, and fuse high-level deep neural networks so that the model pays more attention to subtle finger movements. We evaluate the proposed method using data collected from 10 users, and it achieves an average recognition accuracy of 99.7%. Extensive experiments on a public mmWave gesture dataset demonstrate the superior effectiveness of the proposed system.
Funding: Fundamental Research Funds for the Central Universities of Ministry of Education of China (No. 19D111201).
Abstract: Faced with the massive number of clothing images in online shopping, classifying them quickly and accurately is a challenging image classification task. In this paper, we propose a novel method, named Multi_XMNet, to solve the clothing image classification problem. The proposed method mainly consists of two convolutional neural network (CNN) branches. One branch extracts multiscale features from the whole expressional image using Multi_X, which is designed by improving the Xception network, while the other extracts attention mechanism features from the whole expressional image using the MobileNetV3-small network. The multiscale and attention mechanism features are aggregated before classification. Additionally, in the training stage, global average pooling (GAP), convolutional layers, and softmax classifiers are used instead of the fully connected layer to classify the final features, which speeds up model training and alleviates the overfitting caused by too many parameters. Experimental comparisons are made on the public DeepFashion dataset. The results show that the classification accuracy of this method is 95.38%, which is better than InceptionV3, Xception, and InceptionV3_Xception by 5.58%, 3.32%, and 2.22%, respectively. The proposed Multi_XMNet image classification model can help enterprises and researchers in the field of clothing e-commerce to automatically, efficiently, and accurately classify massive numbers of clothing images.
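The classification head described in the abstract above (global average pooling followed by softmax, in place of a fully connected layer) can be illustrated with a minimal pure-Python sketch. The function names, the toy activations, and the assumption that a final convolution has already produced one channel per class are all hypothetical; this is not the Multi_XMNet implementation.

```python
import math

def global_average_pool(feature_maps):
    """Reduce each H x W channel map to its mean, giving one value per channel."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy input: 3 channels of 2 x 2 activations (hypothetical values).
# Each channel stands in for one class score map, as a final convolution
# with one output channel per class would produce.
maps = [
    [[1.0, 3.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.0, 4.0]],
    [[5.0, 5.0], [5.0, 5.0]],
]
pooled = global_average_pool(maps)   # one score per class, no learned weights
probs = softmax(pooled)              # class probabilities
predicted_class = probs.index(max(probs))
```

Because the pooling step has no trainable parameters, a head of this shape carries far fewer weights than a fully connected layer over the flattened feature maps, which is consistent with the overfitting argument made in the abstract.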
Funding: Fundamental Research Fund in Heilongjiang Provincial Universities (Nos. 135409602, 135409102).
Abstract: Semantic segmentation is a pixel-level classification task, and contextual information has an important impact on segmentation performance. To capture richer contextual information, we adopt ResNet as the backbone network and design an encoder-decoder architecture based on a multidimensional attention (MDA) module and a multiscale upsampling (MSU) module. The MDA module calculates the attention matrices of the three dimensions to capture the dependency of each position, and adaptively captures the image features. The MSU module adopts parallel branches to capture the multiscale features of the images, and multiscale feature aggregation can enhance contextual information. A series of experiments demonstrates the validity of the model on the Cityscapes and CamVid datasets.
Funding: Supported by State Grid Corporation Limited Science and Technology Project Funding (Contract No. SGCQSQ00YJJS2200380).
Abstract: The distributed energy storage cloud-group-end region on the power grid side is unstable. To avoid large-scale fluctuating charging and discharging in the power grid environment and to keep the capacitor components in a continuous and stable charging and discharging state, a hierarchical time-sharing configuration algorithm for the grid-side distributed energy storage cloud-group-end region is proposed, based on a multi-scale and multi-feature convolutional neural network. Firstly, a voltage stability analysis model based on the multi-scale and multi-feature convolutional neural network is constructed, and the network is optimized with the Self-Organizing Maps (SOM) algorithm to analyze the voltage stability of the grid-side distributed energy storage cloud-group-end region under the framework of credibility. Then, according to the optimal scheduling objectives and network size, the distributed robust optimal configuration control model is solved under the framework of coordinated optimal scheduling at multiple time scales. Finally, the time series characteristics of regional power grid load and distributed generation are analyzed. According to the regional hierarchical time-sharing configuration model of the "cloud", "group", and "end" layers, the hierarchical time-sharing configuration algorithm for the grid-side distributed energy storage cloud-group-end region is realized. The experimental results show that, after applying this algorithm, the best grid-side distributed energy storage configuration scheme can be determined, and the stability of the hierarchical time-sharing configuration of the grid-side distributed energy storage cloud-group-end region can be improved.
Funding: Sponsored by the Autonomous Region Key R&D Task Special (2022B01008) and the National Key R&D Program of China (SQ2022AAA010308-5).
Abstract: Network intrusion detection systems (NIDS) based on deep learning have continued to make significant advances. However, the following challenges remain. On the one hand, applying only Temporal Convolutional Networks (TCNs) can lead to models that ignore the impact of network traffic features at different scales on detection performance. On the other hand, some intrusion detection methods consider multi-scale information of traffic data, but considering only forward network traffic information can lead to deficiencies in capturing multi-scale temporal features. To address both issues, we propose a hybrid Convolutional Neural Network that supports a multi-output strategy (BONUS) for industrial internet intrusion detection. First, we create a multiscale Temporal Convolutional Network by stacking TCNs of different scales to capture the multiscale information of network traffic. Meanwhile, we propose a bi-directional structure and dynamically set the weights to fuse the forward and backward contextual information of network traffic at each scale, enhancing the model's ability to capture the multi-scale temporal features of network traffic. In addition, we introduce a gated network for each of the two branches in the proposed method to assist the model in learning the feature representation of each branch. Extensive experiments reveal the effectiveness of the proposed approach on two publicly available traffic intrusion detection datasets, UNSW-NB15 and NSL-KDD, with F1 scores of 85.03% and 99.31%, respectively. These results also validate that enhancing the model's ability to capture multi-scale temporal features of traffic data improves detection performance.
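The idea of fusing forward and backward temporal context at several scales can be sketched in pure Python as follows. This is only an illustration under stated assumptions: the kernel, the dilation values, and the fixed fusion weight `alpha` are hypothetical, whereas the abstract above describes weights that are set dynamically and a gated network that this sketch does not model.

```python
def causal_dilated_conv(x, kernel, dilation):
    """1-D causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # implicit zero padding on the left
                acc += w * x[idx]
        out.append(acc)
    return out

def bidirectional_scale(x, kernel, dilation, alpha):
    """Fuse forward and backward context at one scale with weight alpha in [0, 1]."""
    fwd = causal_dilated_conv(x, kernel, dilation)
    # Backward context: run the same causal filter on the reversed sequence,
    # then reverse the result so positions line up with the forward output.
    bwd = causal_dilated_conv(x[::-1], kernel, dilation)[::-1]
    return [alpha * f + (1.0 - alpha) * b for f, b in zip(fwd, bwd)]

def multiscale_features(x, kernel, dilations, alphas):
    """One fused feature sequence per scale (dilation)."""
    return [bidirectional_scale(x, kernel, d, a)
            for d, a in zip(dilations, alphas)]

# Toy traffic feature sequence (hypothetical values).
features = multiscale_features([1.0, 2.0, 3.0, 4.0], kernel=[1.0, 1.0],
                               dilations=[1, 2], alphas=[0.5, 1.0])
```

A larger dilation lets the same small kernel cover a longer time span, which is one common way to obtain features at different temporal scales.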
Funding: Supported by the National Natural Science Foundation of China (61601176) and the Science and Technology Foundation of Hubei Provincial Department of Education (Q20161405).
Abstract: A novel convolutional neural network based on the spatial pyramid is proposed for image classification. The network exploits image features with a spatial pyramid representation. First, it extracts global features from the original image, and then grids of different levels are used to extract feature maps from different convolutional layers. Inspired by the spatial pyramid, the new network contains two parts, one of which is just like a standard convolutional neural network, composed of alternating convolution and subsampling layers. The outputs of those convolution layers are additionally average-pooled over grids to obtain feature maps, each of which is then concatenated into a feature vector. Finally, those vectors are sequentially concatenated into a single feature vector that is fed to the fully connected layer. The resulting feature vector benefits from both the classic and the preceding convolution layers, while the grid size, which adjusts the weight of the feature maps, improves the recognition efficiency of the network. Experimental results demonstrate that this model improves accuracy and applicability compared with the traditional model.
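Grid-based average pooling of the kind described above can be sketched as follows. The function names and the choice of pyramid levels are assumptions made for illustration, and the sketch requires the feature map to be at least as large as the finest grid in each dimension.

```python
def grid_average_pool(fmap, n):
    """Average-pool an H x W map over an n x n grid; returns n*n values, row by row."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for gi in range(n):
        r0, r1 = gi * h // n, (gi + 1) * h // n
        for gj in range(n):
            c0, c1 = gj * w // n, (gj + 1) * w // n
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            pooled.append(sum(cells) / len(cells))
    return pooled

def spatial_pyramid_vector(fmap, levels=(1, 2)):
    """Concatenate the pooled values of every pyramid level into one vector."""
    vec = []
    for n in levels:
        vec.extend(grid_average_pool(fmap, n))
    return vec

# Toy 4 x 4 feature map (hypothetical values 0..15).
fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
vector = spatial_pyramid_vector(fmap)
```

The vector length depends only on the chosen levels (here 1 + 4 = 5 values per feature map), not on the size of the input map, which is what makes the concatenated vector suitable as input to a fully connected layer.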
Funding: Supported by the National Key Research and Development Program Topics (Grant No. 2021YFB4000905), the National Natural Science Foundation of China (Grant Nos. 62101432 and 62102309), and in part by the Shaanxi Natural Science Fundamental Research Program Project (No. 2022JM-508).
Abstract: Low-light image enhancement methods have limitations in addressing issues such as color distortion, lack of vibrancy, and uneven light distribution, and they often require paired training data. To address these issues, we propose a two-stage unsupervised low-light image enhancement algorithm called Retinex and Exposure Fusion Network (RFNet), which can overcome the over-enhancement of the high dynamic range and the under-enhancement of the low dynamic range seen in existing enhancement algorithms. By training with unpaired low-light and regular-light images, this algorithm can better manage the challenges posed by complex environments in real-world scenarios. In the first stage, we design a multi-scale feature extraction module based on Retinex theory, capable of extracting details and structural information at different scales to generate high-quality illumination and reflection images. In the second stage, an exposure image generator is designed through the camera response mechanism function to acquire exposure images containing more dark features, and the generated images are fused with the original input images to complete the low-light image enhancement. Experiments show the effectiveness and rationality of each module designed in this paper. The method reconstructs the details of contrast and color distribution, outperforms current state-of-the-art methods in both qualitative and quantitative metrics, and shows excellent performance in the real world.
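Retinex theory models an observed image as the element-wise product of a reflectance image and an illumination image. The sketch below shows that decomposition on a toy image. It assumes a simple hand-crafted illumination estimate (the per-pixel maximum over the colour channels, a common initial estimate) and hypothetical function names; the learned multi-scale module described in the abstract above is not reproduced.

```python
def retinex_decompose(image, eps=1e-6):
    """Split an RGB image into illumination L and reflectance R with I = R * L.

    image: rows of (r, g, b) tuples with values in [0, 1].
    Assumption: L is the per-pixel maximum over the three channels.
    """
    illumination = [[max(pixel) for pixel in row] for row in image]
    reflectance = [
        [tuple(c / max(l, eps) for c in pixel) for pixel, l in zip(row, lrow)]
        for row, lrow in zip(image, illumination)
    ]
    return illumination, reflectance

def recompose(illumination, reflectance):
    """Multiply reflectance by illumination to recover the image."""
    return [
        [tuple(c * l for c in pixel) for pixel, l in zip(row, lrow)]
        for row, lrow in zip(reflectance, illumination)
    ]

# Toy 1 x 2 image: one dim pixel and one black pixel (hypothetical values).
image = [[(0.2, 0.1, 0.05), (0.0, 0.0, 0.0)]]
L, R = retinex_decompose(image)
restored = recompose(L, R)
```

In an enhancement pipeline, brightening would typically act on the illumination component while the reflectance carries colour and detail; the `eps` guard only avoids division by zero for black pixels.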
Funding: Chengdu University of Technology Postgraduate Innovative Cultivation Program (CDUT2022BJCX015).
Abstract: Point segmentation of power lines and towers supports the use of unmanned aerial vehicles (UAVs) for the inspection of power facilities, risk detection, and modelling. Because the spatial relationship between the point clouds is unclear, the point segmentation of power lines and towers is challenging. In this paper, power line and tower point datasets are constructed using Light Detection and Ranging (LiDAR), and a point segmentation method is proposed based on multiscale density features and a point-based deep learning network. First, the data are blocked and the neighbourhood is constructed. Second, the point clouds are downsampled to produce sparse point clouds. The point clouds before and after sampling are rotated, and their density is calculated. Next, a direct mapping method is selected to fuse the density information, and a lightweight network is built to learn the features. Finally, the point clouds are segmented by concatenating the local features provided by PointCNN. The algorithm performs effectively on different types of power lines and towers. The mean intersection over union is 82.73%, and the overall accuracy reaches 91.76%. This approach can achieve the end-to-end integration of segmentation and provide theoretical support for the segmentation of large-scene point clouds.
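The two metrics reported above, mean intersection over union (mIoU) and overall accuracy, are conventionally computed from a per-class confusion matrix. A minimal sketch follows; the function name and the toy counts are hypothetical and do not come from the paper's data.

```python
def segmentation_metrics(confusion):
    """Return (overall_accuracy, mean_iou).

    confusion[i][j] = number of points of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    ious = []
    for i in range(n):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                          # missed points of class i
        fp = sum(confusion[r][i] for r in range(n)) - tp     # points wrongly labelled i
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return correct / total, sum(ious) / len(ious)

# Toy two-class example, e.g. "line" vs "tower" (hypothetical counts).
overall_accuracy, mean_iou = segmentation_metrics([[8, 2], [1, 9]])
```

Overall accuracy counts every point equally, whereas mIoU averages over classes, so a class with few points can lower mIoU noticeably while barely affecting overall accuracy.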
Funding: Supported by the Second Tibetan Plateau Scientific Expedition and Research (STEP) program under Grant 2019QZKK0106 and the Science and Technology Major Project of Henan Province under Grant 201400210900.
Abstract: Qinghai-Tibet Plateau lakes are important carriers of water resources in the "Asian Water Tower", and understanding the spatial distribution of plateau lakes is of great significance for the climate, the ecological environment, and the regional water cycle. However, both the differences in the spatial-spectral characteristics of various types of plateau lakes and the complex background information of the plateau affect lake extraction. Therefore, extracting plateau lakes completely and effectively is a great challenge. In this study, we proposed a multiscale contextual information aggregation network, termed MSCANet, to automatically extract plateau lake regions. It consists of three main components: a multiscale lake feature encoder, a feature decoder, and a Multicore Pyramid Pooling Module (MPPM). The multiscale lake feature encoder suppressed noise interference to capture multiscale spatial-spectral information from heterogeneous scenes. The MPPM aggregated the contextual information of various lakes globally. We applied MSCANet to lake extraction on the Qinghai-Tibet Plateau based on Google data. Comparative experiments showed that the proposed MSCANet clearly improved lake detection accuracy and morphological integrity. Finally, we transferred the pre-trained optimal model to the Landsat-8 and Sentinel-2A datasets to verify the generalization of MSCANet.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52305623), the Natural Science Foundation of Hubei Province, China (Grant No. 2022CFB589), and the Natural Science Foundation of Chongqing, China (Grant No. CSTB2023NSCQ-MSX0636).
Abstract: The quality of exposed avionics solder joints has a significant impact on the stable operation of in-orbit spacecraft. Nevertheless, previously reported inspection methods for multi-scale solder joint defects generally suffer from low accuracy and slow detection speed. Herein, a novel real-time detector, VMMAO-YOLO, is demonstrated based on a variable multi-scale concurrency and multi-depth aggregation network (VMMANet) backbone and a "one-stop" global information gather-distribute (OS-GD) module. Combined with infrared thermography, it can achieve fast and high-precision detection of both internal and external solder joint defects. Specifically, VMMANet is designed for efficient multi-scale feature extraction and mainly comprises variable multi-scale feature concurrency (VMC) and multi-depth feature aggregation-alignment (MAA) modules. VMC extracts multi-scale features via multiple fixed-size and deformable convolutions, while MAA aggregates and aligns multi-depth features on the same order for feature inference. This allows low-level features with more spatial detail to be transmitted depth-wise, enabling the deeper network to selectively utilize the preceding inference information. VMMANet replaces inefficient high-density deep convolution by increasing the width of intermediate feature levels, leading to a marked reduction in parameters. The OS-GD module is developed for effective feature extraction, aggregation, and distribution, further enhancing the network's ability to gather and deploy global information. On a self-made solder joint image dataset, VMMAO-YOLO achieves a mean average precision (mAP@0.5) of 91.6%, surpassing all the mainstream YOLO-series models. Moreover, VMMAO-YOLO has a model size of merely 19.3 MB and a detection speed of up to 119 frames per second, far superior to the prevalent YOLO-series detectors.
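The "@0.5" in mAP@0.5 refers to the intersection-over-union (IoU) threshold at which a predicted box counts as matching a ground-truth box. The sketch below shows that matching criterion and the conversion from frames per second to per-frame latency. The function names and the example boxes are hypothetical, and the full mAP computation (ranking by confidence, per-class precision-recall curves) is not shown.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return box_iou(pred, truth) >= threshold

def ms_per_frame(fps):
    """Average per-frame latency in milliseconds for a given frame rate."""
    return 1000.0 / fps

# Hypothetical boxes: a loose detection and a tight detection of a 4 x 4 defect.
loose = is_true_positive((0, 0, 2, 2), (1, 0, 3, 2))   # IoU = 1/3
tight = is_true_positive((0, 0, 4, 3), (0, 0, 4, 4))   # IoU = 0.75
latency = ms_per_frame(119)                             # roughly 8.4 ms per frame
```

At the reported 119 frames per second, the average budget per frame is about 8.4 ms, which gives a sense of what "real-time" means for this detector.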