Funding: Supported by the Natural Science Research Project of Higher Education Institutions in Jiangsu Province under Grant No. 21KJB510018 and the National Natural Science Foundation of China (NSFC) under Grant No. 62001215.
Abstract: Microphone array-based sound source localization (SSL) is widely used in applications such as video conferencing, robotic hearing, speech enhancement, and speech recognition. Traditional SSL methods cannot achieve satisfactory performance in adverse noisy and reverberant environments. To improve localization performance, this paper proposes a novel SSL algorithm using a convolutional residual network (CRN). Spatial features, including the time differences of arrival (TDOAs) between microphone pairs and the steered response power-phase transform (SRP-PHAT) spatial spectrum, are extracted in each Gammatone sub-band. The spatial features of the different sub-bands within a frame are combined into a feature matrix that serves as the input to the CRN, which fuses them. Because the CRN adds a residual structure to the convolutional network, it reduces the difficulty of training and accelerates the convergence of the model. A CRN model is learned from training data in various reverberation and noise environments to establish the mapping between the input features and the sound azimuth. Simulations show that, compared with methods using a traditional deep neural network, the proposed algorithm achieves better localization performance and generalizes better to untrained noise and reverberation conditions.
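The abstract does not say how the TDOA between a microphone pair is estimated. A common choice is generalized cross-correlation with phase transform (GCC-PHAT), and the sketch below assumes that technique. The function names, the sampling rate, and the synthetic test signal are all hypothetical, and the naive DFT is only meant for short demonstration signals.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig, ref, max_lag):
    """Estimate by how many samples `sig` lags behind `ref` using GCC-PHAT."""
    n = len(sig)
    spec_sig = dft(sig)
    spec_ref = dft(ref)
    # Cross-power spectrum of the two channels.
    cross = [s * r.conjugate() for s, r in zip(spec_sig, spec_ref)]
    # PHAT weighting: discard the magnitude and keep only the phase.
    eps = 1e-12
    weighted = [c / (abs(c) + eps) for c in cross]
    # Back to the lag domain; the result is a circular correlation.
    cc = dft(weighted, inverse=True)
    best_lag, best_val = 0, -float("inf")
    for lag in range(-max_lag, max_lag + 1):
        val = cc[lag % n].real  # negative lags wrap around to the end
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Synthetic example: white noise that reaches the second microphone 5 samples later.
rng = random.Random(0)
true_delay = 5
ref = [rng.gauss(0.0, 1.0) for _ in range(48)] + [0.0] * 16  # zero-padded to 64
sig = [0.0] * true_delay + ref[:-true_delay]

fs = 16000  # hypothetical sampling rate in Hz
lag = gcc_phat_delay(sig, ref, max_lag=10)
tdoa_seconds = lag / fs
print(lag, tdoa_seconds)
```

In a sub-band system such as the one described, an estimate of this kind would be computed separately on each Gammatone-filtered band before the per-band features are stacked into the feature matrix.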
Funding: National Natural Science Foundation of China (No. 52101346); Fundamental Research Funds for the Central Universities, China (No. 2232019D3-61); Initial Research Fund for the Young Teachers of Donghua University, China.
Abstract: Robotic grasping plays an important role in the service and industrial fields, and whether a robotic arm can grasp an object properly depends on the accuracy of the grasp detection result. To predict grasp positions for known or unknown objects with a modular robotic system, a convolutional neural network (CNN) with a residual block is proposed that generates accurate grasp detections from input images of the scene. The proposed architecture was trained on the standard Cornell grasp dataset and evaluated on its test set. It was also evaluated on different types of household objects and on cluttered multi-object scenes. On the Cornell grasp dataset, the model reached accuracies of 95.5% and 93.6% on the image-wise and object-wise splits, respectively, with a measured detection time of 109 ms per image. The experimental results show that the model can quickly detect, in image pixel coordinates, the grasp positions of a single object or of multiple objects in real time, and that it remains stable and robust.
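The residual block itself is not specified in the abstract. As a minimal illustration of the general idea, y = ReLU(x + F(x)) with F a small stack of convolutions, the following sketch uses a 1-D signal instead of an image to stay short. The layer layout, kernel values, and function names are assumptions, not the paper's architecture.

```python
def conv1d_same(x, kernel):
    """1-D cross-correlation with zero padding; output length equals input length.

    Assumes an odd-length kernel.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def residual_block(x, kernel1, kernel2):
    """Compute relu(x + F(x)), where F is conv -> relu -> conv."""
    f = conv1d_same(relu(conv1d_same(x, kernel1)), kernel2)
    # The skip connection adds the unmodified input back onto F(x).
    return relu([a + b for a, b in zip(x, f)])


x = [1.0, -2.0, 3.0, 0.5]
# With all-zero kernels F(x) vanishes, so the block passes relu(x) straight through.
identity_like = residual_block(x, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
# With non-trivial (hypothetical) kernels the block learns a correction on top of x.
corrected = residual_block(x, [0.0, 1.0, 0.0], [0.5, 0.0, 0.5])
print(identity_like, corrected)
```

Because the input bypasses the convolutions, the block only has to learn a correction to x, which is the usual argument for why residual networks are easier to train.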
Funding: This research was supported by the National Natural Science Foundation of China (52005387, 52025056), the China Postdoctoral Science Foundation (2020M673380), and the Fundamental Research Funds for the Central Universities.
Abstract: Recently, deep learning (DL) has been widely used in remaining useful life (RUL) prediction. Among DL techniques, the recurrent neural network (RNN) and its variants, e.g., the long short-term memory (LSTM) network, have attracted extensive attention for their ability to capture temporal dependence. Although existing RNN-based methods have proven effective for RUL prediction, they still suffer from two limitations: 1) it is difficult for an RNN to extract degradation features directly from raw monitoring data, and 2) most RNN-based prognostics methods cannot quantify RUL uncertainty. To address these limitations, this paper proposes a new prognostics method named the residual convolution LSTM (RC-LSTM) network. In the RC-LSTM, a new ResNet-based convolution LSTM (Res-ConvLSTM) layer is stacked with a convolution LSTM (ConvLSTM) layer to extract degradation representations from monitoring data. Then, under the assumption that the RUL follows a normal distribution, an appropriate output layer is constructed to quantify the uncertainty of the prediction results. Finally, the effectiveness and superiority of the RC-LSTM are verified using monitoring data from accelerated bearing degradation tests.
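The abstract does not describe the output layer in detail. One common way to realize a normally distributed prediction is to let the network emit a mean and a log-variance and to train with the Gaussian negative log-likelihood; the sketch below assumes that formulation. The function names and the example numbers are hypothetical.

```python
import math


def gaussian_nll(y, mu, log_var):
    """Negative log-likelihood of y under N(mu, exp(log_var))."""
    return 0.5 * (
        math.log(2.0 * math.pi) + log_var + (y - mu) ** 2 / math.exp(log_var)
    )


def interval_95(mu, log_var):
    """Central 95% prediction interval of N(mu, exp(log_var))."""
    sigma = math.exp(0.5 * log_var)
    return mu - 1.96 * sigma, mu + 1.96 * sigma


# Hypothetical network outputs for one sample: predicted RUL of 100 cycles
# with a standard deviation of 10 cycles (log-variance = ln(10^2)).
mu, log_var = 100.0, math.log(100.0)
low, high = interval_95(mu, log_var)
loss_if_true_rul_is_110 = gaussian_nll(110.0, mu, log_var)
print(low, high, loss_if_true_rul_is_110)
```

Predicting the log-variance rather than the variance keeps the variance positive without an explicit constraint, and the width of the resulting interval serves as the uncertainty estimate.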
Funding: Supported by the National Natural Science Foundation of China (11974286, 11904274), the Shaanxi Young Science and Technology Star Program (2021KJXX-07), and the Fundamental Research Funding for Characteristic Disciplines (G2022WD0235).
Abstract: A transfer learning model is constructed that takes the real and imaginary parts of the complex sound pressure of the sound field as features. A convolutional neural network (CNN) is pre-trained on a large amount of underwater acoustic data from a preselected sea area and then retrained on few-shot underwater acoustic data from the test sea area to address the underwater sound source ranging problem. The S5 voyage data of the SWellEX-96 experiment are used to verify the proposed method, to estimate the range of the shallow source in the experiment, and to compare the range estimation performance of four methods: matched field processing (MFP), the generalized regression neural network (GRNN), a traditional CNN, and transfer learning. The experimental results show that the transfer learning model based on a residual CNN can effectively estimate range in few-shot scenarios and that its estimation performance is markedly better than that of the other methods.
Funding: Supported by the Shaanxi Province Key Research and Development Project (No. 2021GY-280), the Shaanxi Province Natural Science Basic Research Program (No. 2021JM-459), and the National Natural Science Foundation of China (No. 61772417).
Abstract: Because behavior recognition is based on video frame sequences, this paper proposes a behavior recognition algorithm that combines a 3D residual convolutional neural network (R3D) with long short-term memory (LSTM). First, the residual module is extended to three dimensions so that it can extract features in the time and space domains simultaneously. Second, the window size of the pooling layer is changed to preserve the integrity of the time-domain features, and batch normalization (BN) and dropout layers are added to ease network training and reduce over-fitting. Third, because the global average pooling (GAP) layer is affected by the size of the feature map and prevents the network from being deepened further, a convolution layer and a max-pooling layer are added to the R3D network. Finally, because an LSTM can memorize information and extract more abstract temporal features, an LSTM network is introduced into the R3D network. Experimental results show that the R3D+LSTM network achieves a 91% recognition rate on the UCF-101 dataset.
Funding: This work was supported in part by the National Natural Science Foundation of China (51807009, 71931003, 72061147004).
Abstract: This paper develops a fully data-driven, missing-data-tolerant method for post-fault short-term voltage stability (STVS) assessment of power systems with incomplete PMU measurements. Super-resolution perception (SRP), based on a deep residual learning convolutional neural network, is employed to cope with the missing PMU measurements. Incremental broad learning (BL) is used to rapidly update the model so as to maintain and enhance its performance in online application. Unlike state-of-the-art methods, the proposed method is fully data-driven and can fill in missing data under any scenario of PMU placement information loss or network topology change. Simulation results on the benchmark test system demonstrate that the proposed method outperforms existing methods in terms of STVS assessment accuracy and missing-data tolerance.
Abstract: The complex nature of the underwater environment poses the biggest challenge to the acquisition and transmission of underwater images. This paper proposes an integrated approach that combines a non-learning enhancement method with deep convolutional neural networks (CNNs) for image compression and reconstruction. The proposed method applies color and contrast correction for image enhancement. The enhanced images are down-sampled using a 9-layer CNN followed by the discrete wavelet transform (DWT). Decompression is performed with the inverse DWT. The sub-pixel up-sampled image is then de-blurred using a three-layer CNN, and a residual dense CNN (RD-CNN) is used to improve the quality of the reconstructed image after de-blurring. The quality of the reconstructed images is measured using the peak signal-to-noise ratio (PSNR) and the structural similarity index metric (SSIM). The proposed model provides better image enhancement, compression, and reconstruction quality than the existing state-of-the-art methods and the super-resolution CNN (SRCNN), respectively.
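For reference, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below computes it for images stored as nested lists of pixel values; the function name and the 8-bit peak value of 255 are assumptions, since the abstract does not state the bit depth used.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    flat_ref = [p for row in reference for p in row]
    flat_rec = [p for row in reconstructed for p in row]
    if len(flat_ref) != len(flat_rec):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_rec)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [[10, 20], [30, 40]]
off_by_one = [[11, 21], [31, 41]]  # every pixel differs by 1, so MSE = 1
print(psnr(original, off_by_one))
```

A higher PSNR means the reconstruction is closer to the reference. SSIM complements it by comparing local structure rather than pixel-wise error.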