Abstract: A method based on multiple images captured under different light sources at different incident angles was developed in this study to recognize the coal density range. The innovation is that two new images were constructed from the images captured under four single light sources. Reconstruction image 1 was constructed by fusing greyscale versions of the original images into one image, and Reconstruction image 2 was constructed from the differences between the images captured under the different light sources. Subsequently, the four original images and two reconstructed images were input into the convolutional neural network AlexNet to recognize the density range in three cases: -1.5 (clean coal) and +1.5 g/cm^3 (non-clean coal); -1.8 (non-gangue) and +1.8 g/cm^3 (gangue); and -1.5 (clean coal), 1.5-1.8 (middlings), and +1.8 g/cm^3 (gangue). The results show the following: (1) The reconstructed images, especially Reconstruction image 2, effectively improve the recognition accuracy for the coal density range compared with images captured under a single light source. (2) The recognition accuracies for gangue and non-gangue, clean coal and non-clean coal, and clean coal, middlings, and gangue reached 88.44%, 86.72%, and 77.08%, respectively. (3) The recognition accuracy increases as the density moves further away from the boundary density.
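The abstract does not state the exact fusion or difference rules, so the following is only a minimal sketch of how two reconstructed images could be derived from four single-light captures. Per-pixel averaging for Reconstruction image 1 and a per-pixel max-minus-min range for Reconstruction image 2 are assumptions made for illustration, as are the names `fuse_mean` and `difference_range`.

```python
from typing import List

Image = List[List[int]]  # greyscale image as rows of 0-255 intensities


def fuse_mean(images: List[Image]) -> Image:
    """Fuse greyscale images by per-pixel averaging (assumed fusion rule)."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[round(sum(img[r][c] for img in images) / n) for c in range(cols)]
            for r in range(rows)]


def difference_range(images: List[Image]) -> Image:
    """Per-pixel max minus min across light sources (assumed difference rule)."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[max(img[r][c] for img in images) - min(img[r][c] for img in images)
             for c in range(cols)]
            for r in range(rows)]


# Four toy 2x2 captures, one per light source (made-up values).
captures = [
    [[10, 200], [50, 90]],
    [[30, 180], [50, 110]],
    [[20, 220], [50, 70]],
    [[40, 200], [50, 130]],
]
recon1 = fuse_mean(captures)         # [[25, 200], [50, 100]]
recon2 = difference_range(captures)  # [[30, 40], [0, 60]]
```

In this toy case the pixel that looks identical under every light source maps to 0 in the difference image, while pixels whose brightness depends on the illumination angle stand out.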
Abstract: Scene recognition is a popular open problem in computer vision. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance. In this paper, we propose an advanced feature fusion algorithm using multiple convolutional neural networks (Multi-CNN) for scene recognition. Unlike existing works, which usually use an individual convolutional neural network, we apply a fusion of multiple different convolutional neural networks. First, we split the training images in two directions and feed them to three deep CNN models, and then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused in groups with different fusion strategies and forwarded to a SoftMax classifier. The proposed algorithm is evaluated on three scene datasets. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
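The fusion step can be illustrated with a small sketch. The abstract does not name its fusion strategies, so concatenation and element-wise averaging are assumed here as two common choices. All feature values and classifier weights are made up, and the helper names are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_concat(vectors: List[List[float]]) -> List[float]:
    """Fusion by concatenation: join the per-model vectors end to end."""
    return [x for vec in vectors for x in vec]


def fuse_average(vectors: List[List[float]]) -> List[float]:
    """Fusion by element-wise averaging (vectors must share one length)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def linear(weights: List[List[float]], x: List[float]) -> List[float]:
    """One score per class: dot product of each weight row with x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Hypothetical outputs of three CNNs for one image and two scene classes.
fc_features = [[0.2, 0.7], [0.5, 0.1], [0.9, 0.4]]
prob_outputs = [[0.6, 0.4], [0.3, 0.7], [0.6, 0.4]]

fused = fuse_concat(fc_features)         # 6 values
mean_probs = fuse_average(prob_outputs)  # roughly [0.5, 0.5]

# Made-up classifier weights (2 classes x 6 fused features).
weights = [[1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
           [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]]
class_probs = softmax(linear(weights, fused))
```

Concatenation keeps every model's features separate at the cost of a longer vector, whereas averaging keeps the dimension fixed but only applies to vectors of equal length such as class probabilities.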
Funding: Supported by the National Natural Science Foundation of China (61471021).
Abstract: Non-orthogonal multiple access (NOMA), featuring high spectrum efficiency, massive connectivity, and low latency, holds immense potential as a novel multi-access technique in fifth-generation (5G) communication. Successive interference cancellation (SIC) has proved to be an effective method for detecting the NOMA signal by ordering the received signals by power and then decoding them. However, the error accumulation effect, referred to as error propagation, is an inevitable problem. In this paper, we propose a convolutional neural network (CNN) approach to restore the desired signal impaired by the multiple-input multiple-output (MIMO) channel. In the uplink NOMA scenario in particular, the proposed method can decode the information of multiple users in a cluster instantaneously without any traditional communication signal processing steps. Simulation experiments are conducted in the Rayleigh channel, and the results demonstrate that the error performance of the proposed learning system outperforms that of classic SIC detection. Consequently, deep learning has disruptive potential to replace the conventional signal detection method.
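For context, the classic SIC baseline and its error propagation can be sketched as follows. This is a deliberately simplified illustration, not the paper's simulation setup: it assumes BPSK symbols, real positive received amplitudes that the receiver knows, and a single received sample. The function names are hypothetical.

```python
from typing import List


def received(symbols: List[int], gains: List[float], noise: float) -> float:
    """Superimposed uplink signal: sum of gain * symbol over users, plus noise."""
    return sum(g * s for g, s in zip(gains, symbols)) + noise


def bpsk_decide(x: float) -> int:
    """Hard BPSK decision: +1 for non-negative input, -1 otherwise."""
    return 1 if x >= 0 else -1


def sic_detect(y: float, gains: List[float]) -> List[int]:
    """Decode users from strongest to weakest, subtracting each decoded signal."""
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    decisions = [0] * len(gains)
    residual = y
    for i in order:
        decisions[i] = bpsk_decide(residual)
        residual -= gains[i] * decisions[i]  # cancel the decoded user's signal
    return decisions


gains = [1.0, 0.4]   # user 0 is received more strongly than user 1
sent = [-1, 1]

clean = sic_detect(received(sent, gains, noise=0.05), gains)  # [-1, 1]
noisy = sic_detect(received(sent, gains, noise=0.7), gains)   # [1, -1]
```

In the second call the noise flips the decision for the strong user, the wrong signal is subtracted, and the weak user is then decoded incorrectly as well. That cascade is the error propagation the abstract refers to.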
Abstract: Load identification is one of the major technical difficulties of non-intrusive composite monitoring. The binary V-I trajectory image reflects the original V-I trajectory characteristics to a large extent, so it is widely used in load identification. However, using a single binary V-I trajectory feature for load identification has certain limitations. To improve the accuracy of load identification, this paper adds the power feature to the binary V-I trajectory feature. We turn the initial binary V-I trajectory into a new 3D feature by mapping the power feature to the third dimension. To reduce the impact of imbalanced samples on load identification, the SVM SMOTE algorithm is used to balance the samples. A convolutional neural network model is then used to extract features from the newly produced 3D feature to achieve load identification. The results indicate that the new 3D feature has better observability and that the proposed model has higher identification performance than other classification models on the public data set PLAID.
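The idea of adding power as a third dimension can be sketched as below. The abstract does not give the mapping, so this sketch makes its own assumptions: voltage and current are normalized to [-1, 1] and rasterized onto a square grid, and each visited cell stores the largest normalized instantaneous power |v·i| seen there. The function name `vi_images` and the grid size are hypothetical.

```python
import math
from typing import List, Tuple


def vi_images(voltage: List[float], current: List[float],
              size: int = 8) -> Tuple[List[List[int]], List[List[float]]]:
    """Return (binary V-I image, power-valued V-I image) on a size x size grid."""
    v_max = max(abs(v) for v in voltage) or 1.0
    i_max = max(abs(i) for i in current) or 1.0
    p_max = max(abs(v * i) for v, i in zip(voltage, current)) or 1.0
    binary = [[0] * size for _ in range(size)]
    power = [[0.0] * size for _ in range(size)]
    for v, i in zip(voltage, current):
        # Map normalized values from [-1, 1] onto grid indices 0..size-1.
        col = min(size - 1, int((v / v_max + 1) / 2 * size))
        row = min(size - 1, int((i / i_max + 1) / 2 * size))
        binary[row][col] = 1
        power[row][col] = max(power[row][col], abs(v * i) / p_max)
    return binary, power


# One cycle of a purely resistive load: current is in phase with voltage.
volts = [math.sin(2 * math.pi * k / 64) for k in range(64)]
amps = [0.5 * v for v in volts]
binary_img, power_img = vi_images(volts, amps)
```

For this in-phase example the trajectory falls on the diagonal of the grid. The binary image records only the shape, while the power-valued image also distinguishes the high-power ends of the trajectory from the region near the origin.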
Abstract: Achieving sound communication systems in the underwater acoustic (UWA) environment remains challenging for researchers. The communication scheme is complex since these acoustic channels exhibit uneven characteristics such as long propagation delay and irregular Doppler shifts. The development of machine and deep learning algorithms has reduced the burden of achieving reliable and good communication schemes in the underwater acoustic environment. This paper proposes a novel intelligent method for selecting between different modulation schemes, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), and Orthogonal Frequency Division Multiplexing (OFDM), using a hybrid combination of convolutional neural networks (CNN) and ensemble single feedforward layers (SFL). The convolutional neural networks are used for channel feature extraction, and boosted ensembled feedforward layers are used for modulation selection based on the CNN outputs. Extensive experiments are carried out, and the method is compared with other hybrid learning models and conventional methods. Simulation results demonstrate that the proposed hybrid learning model achieves nearly 98% accuracy and a 30% increase in BER performance, outperforming the other learning models in achieving communication schemes under dynamic underwater environments.
Funding: Supported by the National Natural Science Foundation of China (61903336, 61976190) and the Natural Science Foundation of Zhejiang Province (LY21F030015).
Abstract: Background: The use of remote photoplethysmography (rPPG) to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years. Existing methods are primarily based on a single-scale region of interest (ROI). However, some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space. Also, existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information, which is crucial in pulse extraction problems, resulting in insufficient spatiotemporal feature modeling. Methods: Here, we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution (SSTC) and dimension separable attention (DSAT). First, to solve the problem of a single-scale ROI, we constructed a multi-scale feature space for initial signal separation. Second, SSTC and DSAT were designed for efficient spatiotemporal correlation modeling, which increased the information interaction between the long-span time and space dimensions; this placed more emphasis on temporal features. Results: The signal-to-noise ratio (SNR) of the proposed network reached 9.58 dB on the PURE dataset and 6.77 dB on the UBFC-rPPG dataset, outperforming state-of-the-art algorithms. Conclusions: The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals. The proposed SSTC and DSAT mechanisms will contribute to more accurate pulse signal extraction.
Funding: Supported by the Scientific Research Program Funded by Shaanxi Provincial Education Department (20JY058).
Abstract: To study and optimize the performance of general-purpose computing on graphics processing units (GPGPU) based on a single instruction multiple threads (SIMT) processor for neural network applications, this work contributes a self-developed SIMT processor named Pomelo and its associated assembly program. The parallel mechanism of the SIMT computing mode and the self-developed Pomelo processor are briefly introduced. A common convolutional neural network (CNN) is built to verify the compatibility and functionality of the Pomelo processor. A CNN computing flow with task-level and hardware-level optimization is adopted on the Pomelo processor. A specific algorithm for organizing a Z-shaped memory structure is developed, which aims to reduce memory access in mass data computing tasks. With the combined adaptation and optimization strategy above, the experimental results demonstrate that reducing memory access in the SIMT computing mode plays a crucial role in improving performance. A 6.52-times performance gain is achieved in the case with 4 processing elements.
Abstract: Predicting the direction of the stock market has always been a huge challenge. Forecasting the stock market also reduces risk in the financial market, thus ensuring that brokers can make normal returns. Despite the complexities of the stock market, the challenge has been increasingly addressed by experts in a variety of disciplines, including economics, statistics, and computer science. With the introduction of machine learning and a deeper understanding of the prospects of the financial market, many experiments have been carried out to predict future stock price trends, with varying degrees of success. In this paper, we propose a method to predict stocks from different industries and markets, as well as trend prediction using traditional machine learning algorithms such as linear regression and polynomial regression, and learning techniques for time series prediction using two special types of recurrent neural networks: long short-term memory (LSTM) and spoken short-term memory.
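Abstract: To address the widespread industrial need for real-time condition monitoring of multiple rolling bearings under multiple operating conditions, and the challenge of constrained deployment environments, a Convolutional Neural Network (CNN) based method for monitoring the running state of rolling bearings with multiple sensors is proposed. The method labels one-dimensional time-series datasets from two different operating conditions using the Root Mean Square (RMS) indicator, and reshapes the one-dimensional multi-sensor time-series data into two-dimensional spatial tensors that are fed to the CNN for training. Finally, layer fusion and 16-bit quantization are used to optimize the network, which is deployed on an FPGA to address the computational cost of the CNN. Experimental results show that on the dataset combining the two operating conditions, the test inference accuracy of the network still reaches 99.24%, which is 10.48% higher than a multilayer perceptron implementation and 2.91% higher than an implementation combining a multilayer perceptron with a support vector machine. The algorithm is also robust to newly added datasets: after retraining, the accuracy on the newly added dataset reaches 99.17%. The peak energy efficiency of the optimized network deployed on the FPGA is 76.217 GOPS/W, which is 33.09 times that of the CPU implementation and 5.39 times that of the GPU implementation. The test accuracy of the network deployed at 16-bit precision is only 0.001% lower than that of the 32-bit implementation.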
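The preprocessing and quantization steps named in the abstract can be sketched in a few lines. The details are assumptions made for illustration: the RMS threshold used for labeling, the choice to stack one window per sensor as the rows of the 2-D input, and the fixed-point scale of 1/256 are not taken from the paper. The function names are hypothetical.

```python
import math
from typing import List


def rms(window: List[float]) -> float:
    """Root mean square of one window of samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def label_by_rms(window: List[float], threshold: float) -> int:
    """Label 1 when the RMS exceeds a (hypothetical) threshold, else 0."""
    return 1 if rms(window) > threshold else 0


def stack_sensors(windows: List[List[float]]) -> List[List[float]]:
    """Arrange equal-length 1-D windows as the rows of a 2-D input."""
    if len({len(w) for w in windows}) != 1:
        raise ValueError("all sensor windows must have the same length")
    return [list(w) for w in windows]


def quantize_int16(x: float, scale: float) -> int:
    """Round x / scale to the nearest integer and clamp to the int16 range."""
    q = round(x / scale)
    return max(-32768, min(32767, q))


sensor_a = [2.0, -2.0, 2.0, -2.0]   # made-up vibration samples
sensor_b = [0.5, 0.5, -0.5, -0.5]

labels = [label_by_rms(w, threshold=1.0) for w in (sensor_a, sensor_b)]  # [1, 0]
tensor = stack_sensors([sensor_a, sensor_b])                             # 2 x 4

scale = 1 / 256                     # assumed fixed-point step
q_tensor = [[quantize_int16(x, scale) for x in row] for row in tensor]
```

With this scale, values between roughly -128 and 128 are represented with a step of 1/256, and anything outside that range saturates at the int16 limits rather than wrapping around.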
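Abstract: Current electroencephalogram (EEG) emotion recognition models ignore the differences in emotional state across time periods and fail to reinforce key emotional information. To address these problems, a convolutional recurrent neural network optimized with multiple context vectors (CR-MCV) is proposed. First, a sequence of feature matrices is constructed from the EEG signals, and a convolutional neural network (CNN) learns the spatial features of the multi-channel EEG; then, a recurrent neural network based on multi-head attention generates multiple context vectors for high-level abstract feature extraction; finally, a fully connected layer performs emotion classification. In experiments on the DEAP (Database for Emotion Analysis using Physiological signals) dataset, CR-MCV achieves classification accuracies of 88.09% and 89.30% on the arousal and valence dimensions, respectively. The experimental results show that, by exploiting the spatial position information of the electrodes and the salient features of emotional states in different time periods, CR-MCV can adaptively allocate attention over features and reinforce salient emotional-state information.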