Convolutional neural networks struggle to accurately handle changes in angles and twists in the direction of images, which affects their ability to recognize patterns based on internal feature levels. In contrast, CapsNet overcomes these limitations by vectorizing information through increased directionality and magnitude, ensuring that spatial information is not overlooked. Therefore, this study proposes a novel expression recognition technique called CAPSULE-VGG, which combines the strengths of CapsNet and convolutional neural networks. By refining and integrating features extracted by a convolutional neural network before introducing them into CapsNet, our model enhances facial recognition capabilities. Compared to traditional neural network models, our approach offers a faster training pace, improved convergence speed, and higher accuracy rates approaching stability. Experimental results demonstrate that our method achieves recognition rates of 74.14% on the FER2013 expression dataset and 99.85% on the CK+ expression dataset. By contrasting these findings with those obtained using conventional expression recognition techniques and incorporating CapsNet's advantages, we effectively address issues associated with convolutional neural networks while increasing expression identification accuracy.
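The "vectorizing" step behind capsule networks rests on the squashing nonlinearity, which rescales a capsule's output vector so its length falls in [0, 1) (readable as an existence probability) while its orientation encodes pose. A minimal numpy sketch of that one function, not the authors' CAPSULE-VGG code:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # CapsNet squashing: short vectors shrink toward 0, long vectors
    # approach unit length; the vector's direction is preserved.
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm) / np.sqrt(sq_norm + eps)
    return scale * s

v = squash(np.array([3.0, 4.0]))   # input length 5
print(np.linalg.norm(v))           # length 25/26 ~= 0.9615, direction kept
```

A vector of length L is squashed to length L^2/(1 + L^2), so very confident capsules saturate just below 1 while weak ones vanish.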
Automatic modulation recognition (AMR) of radiation source signals is a research focus in the field of cognitive radio. However, the AMR of radiation source signals at low SNRs still faces a great challenge. Therefore, an AMR method for radiation source signals based on a two-dimensional data matrix and an improved residual neural network is proposed in this paper. First, the time series of the radiation source signals are reconstructed into a two-dimensional data matrix, which greatly simplifies the signal preprocessing process. Second, a residual neural network based on depthwise convolution and large-size convolutional kernels (DLRNet) is proposed to improve the feature extraction capability of the AMR model. Finally, the model performs feature extraction and classification on the two-dimensional data matrix to obtain the recognition vector that represents the signal modulation type. Theoretical analysis and simulation results show that the AMR method based on the two-dimensional data matrix and the improved residual network significantly improves recognition accuracy, which remains above 90% even at an SNR of -14 dB.
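The time-series-to-matrix reconstruction can be sketched as a truncate-or-pad followed by a row-wise reshape. The 32x32 side length below is an illustrative assumption, not a value taken from the paper:

```python
import numpy as np

def signal_to_matrix(x, side=32):
    # Reconstruct a 1-D sampled signal into a square 2-D data matrix:
    # truncate or zero-pad to side*side samples, then reshape row-wise.
    # The 32x32 size is an illustrative assumption.
    n = side * side
    buf = np.zeros(n, dtype=np.float64)
    m = min(len(x), n)
    buf[:m] = x[:m]
    return buf.reshape(side, side)

M = signal_to_matrix(np.sin(0.1 * np.arange(2000)))
print(M.shape)  # (32, 32)
```

The resulting matrix can be fed to any 2-D CNN without further feature engineering, which is the preprocessing simplification the abstract describes.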
In recent years, Human Activity Recognition (HAR) models based on wearable devices have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, which limits them to basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable for real-time environments. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep-learning approach. First, the required signals and sensor data are accumulated from standard databases. From these signals, wave features are retrieved. The extracted wave features and sensor data are then given as input for recognizing the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by incorporating a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is suggested to fine-tune the network parameters to enhance the recognition process. An experimental evaluation is performed on various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with 95.36% accuracy, 95.25% recall, 95.48% specificity, and 95.47% precision. The results prove that the developed model recognizes human actions effectively while taking less time. Additionally, it reduces computational complexity and overfitting through the use of an optimization approach.
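The GRU half of a 1DCNN+GRU hybrid like AHDAN gates how much of the previous hidden state survives each time step. A toy numpy GRU cell with random stand-in weights (purely to illustrate the gating equations, not the trained network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    # Standard GRU gating; weights are random stand-ins, not trained values.
    def __init__(self, n_in, n_hid, seed=0):
        rng = np.random.default_rng(seed)
        w = lambda r, c: rng.normal(0.0, 0.1, (r, c))
        self.Wz, self.Uz = w(n_hid, n_in), w(n_hid, n_hid)
        self.Wr, self.Ur = w(n_hid, n_in), w(n_hid, n_hid)
        self.Wh, self.Uh = w(n_hid, n_in), w(n_hid, n_hid)

    def step(self, x, h):
        z = sigmoid(self.Wz @ x + self.Uz @ h)            # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h)            # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h)) # candidate state
        return (1.0 - z) * h + z * h_cand                 # convex blend

cell = GRUCell(n_in=3, n_hid=8)
h = np.zeros(8)
for t in range(20):            # run over a short synthetic sensor sequence
    h = cell.step(np.ones(3), h)
print(h.shape)  # (8,)
```

Because each step is a convex blend of the previous state and a tanh candidate, the hidden state stays bounded in (-1, 1), which is what makes the recurrence stable over long sensor sequences.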
The ever-growing volume of available visual data (i.e., videos and pictures uploaded by internet users) has attracted the attention of the research community in the computer vision field. Therefore, finding efficient solutions to extract knowledge from these sources is imperative. Recently, the BlazePose system was released for skeleton extraction from images, oriented to mobile devices. With this skeleton graph representation in place, a Spatial-Temporal Graph Convolutional Network (ST-GCN) can be implemented to predict the action. We hypothesize that simply changing the skeleton input data to a different set of joints that offers more information about the action of interest can increase the performance of the ST-GCN for HAR tasks. Hence, in this study, we present the first implementation of the BlazePose skeleton topology upon this architecture for action recognition. Moreover, we propose the Enhanced-BlazePose topology, which achieves better results than its predecessor. Additionally, we propose different skeleton detection thresholds that improve accuracy even further. We reached a top-1 accuracy of 40.1% on the Kinetics dataset. For the NTU-RGB+D dataset, we achieved 87.59% and 92.1% accuracy under the Cross-Subject and Cross-View evaluation criteria, respectively.
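A spatial graph convolution in an ST-GCN aggregates each joint's features over its skeleton neighbors through a normalized adjacency matrix. A sketch on an assumed 5-joint toy skeleton (the real BlazePose topology has 33 landmarks):

```python
import numpy as np

def normalized_adjacency(edges, n):
    # A_hat = D^{-1/2} (A + I) D^{-1/2}: self-loops plus symmetric scaling,
    # the standard normalization used by GCN-style spatial convolutions.
    A = np.eye(n)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(X, A_hat, W):
    # One spatial step: mix neighbor joint features, then map channels by W.
    return A_hat @ X @ W

edges = [(0, 1), (1, 2), (1, 3), (3, 4)]   # toy 5-joint skeleton
A_hat = normalized_adjacency(edges, 5)
out = graph_conv(np.ones((5, 2)), A_hat, np.eye(2))
print(out.shape)  # (5, 2)
```

Changing the input topology, as the study proposes, amounts to changing only the `edges` list and the number of joints; the convolution itself is unchanged.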
This study aims to reduce the interference of ambient noise in mobile communication, improve the accuracy and authenticity of information transmitted by sound, and guarantee the accuracy of voice information delivered by mobile communication. First, the principles and techniques of speech enhancement are analyzed, and a fast lateral recursive least squares (FLRLS) method is adopted to process the sound data. Then, a convolutional neural network (CNN)-based noise recognition algorithm (NR-CNN) and a speech enhancement model are proposed. Finally, related experiments are designed to verify the performance of the proposed algorithm and model. The experimental results show that the noise classification accuracy of the NR-CNN algorithm is higher than 99.82%, and the recall rate and F1 value are also higher than 99.92%. The proposed sound enhancement model can effectively enhance the original sound in the presence of noise interference. After the CNN is incorporated, the average perceptual quality evaluation score over all noisy sounds improves by over 21% compared with the traditional noise reduction method. The proposed algorithm can adapt to a variety of voice environments and can simultaneously enhance and denoise many different types of voice signals, with better processing results than traditional sound enhancement models. In addition, the sound distortion index of the proposed speech enhancement model is lower than that of the control group, indicating that the addition of the CNN is less likely to cause sound signal distortion in various sound environments and shows superior robustness. In summary, the proposed CNN-based speech enhancement model shows significant sound enhancement effects, stable performance, and strong adaptability. This study provides a reference and basis for research applying neural networks to speech enhancement.
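A plain recursive least squares (RLS) adaptive filter, shown below as a simplified stand-in for the fast lateral RLS variant named above, adapts its weights sample by sample to track a reference signal:

```python
import numpy as np

def rls(d, x, order=4, lam=0.99, delta=100.0):
    # Classic RLS: w tracks the least-squares filter mapping x to d;
    # P approximates the inverse of the input autocorrelation matrix.
    w = np.zeros(order)
    P = np.eye(order) * delta
    for n in range(order, len(d)):
        u = x[n - order:n][::-1]             # most recent samples first
        k = P @ u / (lam + u @ P @ u)        # gain vector
        e = d[n] - w @ u                     # a priori error
        w = w + k * e                        # weight update
        P = (P - np.outer(k, u @ P)) / lam   # inverse-correlation update
    return w

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
d = np.zeros_like(x)
d[1:] += 0.5 * x[:-1]          # target filter: d[n] = 0.5 x[n-1] - 0.3 x[n-2]
d[2:] -= 0.3 * x[:-2]
w = rls(d, x)
print(np.round(w, 3))  # close to [0.5, -0.3, 0, 0]
```

On this noiseless identification task the weights converge to the true filter coefficients; "fast" RLS variants like the paper's reduce the O(order^2) per-sample cost of updating P.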
Accurate handwriting recognition has been a challenging computer vision problem, because static feature analysis of text images is often inadequate to account for the high variance in handwriting styles across people and the poor image quality of handwritten text. Recently, with the introduction of machine learning, especially convolutional neural networks (CNNs), the recognition accuracy for various handwriting patterns has steadily improved. In this paper, a deep CNN model is developed to further improve the recognition rate on the MNIST handwritten digit dataset with a fast convergence rate in training. The proposed model features a multi-layer deep structure, including three convolution-and-activation layers for feature extraction and two fully connected layers (i.e., dense layers) for classification. The model's hyperparameters, such as the batch size, kernel sizes, batch normalization, activation function, and learning rate, are optimized to enhance recognition performance. The average classification accuracy of the proposed methodology reaches 99.82% on the training dataset and 99.40% on the testing dataset, making it a nearly error-free system for MNIST recognition.
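The spatial sizes flowing through a three-conv MNIST stack follow the standard formula out = floor((n + 2p - k) / s) + 1. The kernel and pooling choices below are illustrative, not necessarily the paper's exact configuration:

```python
def conv_out(n, k, s=1, p=0):
    # Spatial output size of a convolution or pooling layer.
    return (n + 2 * p - k) // s + 1

size = 28                              # MNIST input is 28x28
for layer in range(3):                 # three 3x3 convs, pooling after the first two
    size = conv_out(size, 3)           # each 3x3 conv shrinks the side by 2
    if layer < 2:
        size = conv_out(size, 2, s=2)  # 2x2 max-pool, stride 2 (floor division)
print(size)  # 28 -> 26 -> 13 -> 11 -> 5 -> 3
```

Tracking these sizes is what determines the input dimension of the first dense layer (here 3 x 3 x channels after flattening).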
With continuous social progress and the development of technology, the rise of online social media has brought explosive growth in image data. As one of the main channels of daily communication, images are widely used as carriers of information because of their rich content and intuitiveness, among other advantages. Image recognition based on convolutional neural networks was the first application in the field of image recognition: a series of algorithmic operations, such as image eigenvalue extraction, recognition, and convolution, are used to identify and analyze different images. The rapid development of artificial intelligence has made machine learning increasingly important as a research field; using algorithms to learn from each piece of data and predict an outcome has become an important key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but how to associate the low-level information in an image with high-level image semantics is the key problem of image recognition. Predecessors have provided many model algorithms, which have laid a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on VGG16 is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect every neuron in each layer; instead, only some nodes are connected. Although this method reduces computation time, the convolutional neural network model loses some useful feature information during propagation and calculation. This paper therefore improves the model into a multi-level information fusion convolution calculation method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3x3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the channel number. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database, and the CASIA Face Image Database.
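VGG's stacked 3x3 filter sequences enlarge the receptive field the same way one big filter would, with fewer parameters: n stacked 3x3 convs at stride 1 see a (2n+1) x (2n+1) window. A quick check:

```python
def stacked_receptive_field(n_layers, k=3):
    # Stride-1 stacking: each extra kxk conv adds (k - 1) to the field.
    rf = 1
    for _ in range(n_layers):
        rf += k - 1
    return rf

def params(k, c):
    # Weight count of a kxk conv with c input and c output channels (no bias).
    return k * k * c * c

# Two 3x3 convs cover the same 5x5 window as one 5x5 conv, more cheaply:
print(stacked_receptive_field(2), params(3, 64) * 2, params(5, 64))
```

For 64 channels, two 3x3 layers cost 73,728 weights against 102,400 for a single 5x5 layer, while also inserting an extra nonlinearity between them.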
Underwater target recognition is a key technology for underwater acoustic countermeasures. How to classify and recognize underwater targets according to their radiated noise has been a hot topic in the field of underwater acoustic signals. In this paper, a deep learning model is applied to underwater target recognition. Improved anti-noise Power-Normalized Cepstral Coefficients (ia-PNCC) are proposed, based on PNCC applied to underwater noise. Multitaper estimation and normalized Gammatone filter banks are applied to improve the anti-noise capacity. The method is combined with a convolutional neural network to recognize the underwater target. Experimental results show that the acoustic features presented by ia-PNCC are less sensitive to noise and well suited to underwater target recognition using a convolutional neural network. Compared with the combination of a convolutional neural network with a single acoustic feature, such as MFCC (Mel-scale Frequency Cepstral Coefficients) or LPCC (Linear Prediction Cepstral Coefficients), the combination of ia-PNCC with a convolutional neural network offers better accuracy for underwater target recognition.
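A key difference between PNCC-family features and MFCC is the compression applied to the filter-bank energies: the original PNCC formulation replaces MFCC's logarithm with a power-law nonlinearity (exponent 1/15), which degrades more gracefully at low energies. A minimal sketch of just that step:

```python
import numpy as np

def compress(power_spectrum, mode="power"):
    # PNCC-style power-law compression vs. MFCC-style log compression.
    # The 1/15 exponent follows the original PNCC formulation; the
    # surrounding Gammatone/multitaper stages are omitted here.
    if mode == "power":
        return power_spectrum ** (1.0 / 15.0)
    return np.log(power_spectrum + 1e-12)

e = np.array([1e-4, 1e-2, 1.0, 100.0])
print(compress(e))          # bounded, smoothly increasing values
print(compress(e, "log"))   # log diverges toward -inf at low energies
```

The bounded power-law output is the main reason these features hold up better under heavy background noise.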
How to correctly acquire appropriate features is a primary problem in the field of network protocol recognition. Aiming to avoid the trouble of manually extracting features in traditional methods and to improve recognition accuracy, a network protocol recognition method based on a Convolutional Neural Network (CNN) is proposed. The method utilizes deep learning and processes network flows automatically. First, the intercepted network flows are normalized and mapped into a two-dimensional matrix, which is used as the input of the CNN. Then, an improved classification model named Ptr CNN is built, which can automatically extract the appropriate features of network protocols. Finally, the classification model is trained to recognize the network protocols. The proposed approach is compared with several machine learning methods. Experimental results show that the tailored CNN not only improves protocol recognition accuracy but also ensures fast convergence of the classification model and reduces classification time.
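The normalize-and-map preprocessing can be sketched as scaling raw flow bytes to [0, 1] and laying them into a fixed-size matrix. The 16x16 size and zero-padding rule are illustrative assumptions, not taken from the paper:

```python
def flow_to_matrix(payload, rows=16, cols=16):
    # Normalize intercepted flow bytes to [0, 1] and map them into a
    # fixed-size 2-D matrix (truncate or zero-pad); 16x16 is an assumption.
    n = rows * cols
    buf = list(payload[:n]) + [0] * max(0, n - len(payload))
    return [[buf[r * cols + c] / 255.0 for c in range(cols)] for r in range(rows)]

m = flow_to_matrix(b"GET /index.html HTTP/1.1\r\n")
print(len(m), len(m[0]))  # 16 16
```

Because protocol headers sit at fixed byte offsets, they land at fixed matrix positions, which is exactly the kind of local structure a CNN can pick up.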
Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN)-based approaches achieve the best performance in scene recognition. In this paper, we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Networks (Multi-CNN) for scene recognition. Unlike existing works that usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied. First, we split the training images in two directions and feed them to three deep CNN models, then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused in groups with different fusion strategies and forwarded to a SoftMax classifier. Our proposed algorithm is evaluated on three scene datasets. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
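Two common ways to fuse per-model feature vectors before a SoftMax classifier are concatenation and averaging. A minimal sketch (the paper's exact grouping strategies are not reproduced here):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fuse(feature_vectors, strategy="concat"):
    # 'concat' joins per-model features; 'mean' averages equal-length ones.
    if strategy == "concat":
        return np.concatenate(feature_vectors)
    return np.mean(np.stack(feature_vectors), axis=0)

f1, f2, f3 = np.ones(4), 2.0 * np.ones(4), 3.0 * np.ones(4)
fused = fuse([f1, f2, f3])
probs = softmax(np.array([2.0, 1.0, 0.1]))
print(fused.shape, round(float(probs.sum()), 6))  # (12,) 1.0
```

Concatenation preserves every model's information at the cost of a wider classifier input; averaging keeps the dimension fixed but requires compatible feature spaces.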
This paper presents a handwritten document recognition system based on the convolutional neural network technique. Handwritten document recognition is rapidly attracting the attention of researchers due to its promise as an assistive technology for visually impaired users; it is also helpful for automatic data entry systems. For the proposed system, a dataset of English-language handwritten character images was prepared. The system has been trained on a large set of sample data and tested on sample images of user-defined handwritten documents. In this research, multiple experiments yielded very good recognition results. The proposed system first performs image pre-processing stages to prepare the data for training with a convolutional neural network. After this processing, the input document is segmented using line, word, and character segmentation. The proposed system achieves an accuracy of up to 86% during character segmentation. The segmented characters are then sent to a convolutional neural network for recognition. The recognition and segmentation techniques proposed in this paper provide acceptably accurate results on the given dataset. The proposed work reaches an accuracy of up to 93% during convolutional neural network training, and for validation the accuracy decreases slightly to 90.42%.
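Line segmentation of a binarized page is commonly done with a horizontal projection profile: a text line is a maximal run of rows containing ink. A minimal sketch of that idea (one of several possible segmentation schemes, not necessarily the paper's):

```python
import numpy as np

def segment_lines(binary_page, min_ink=1):
    # Horizontal projection profile: sum ink pixels per row, then report
    # each maximal run of rows whose count reaches min_ink as one line.
    profile = binary_page.sum(axis=1)
    lines, start = [], None
    for r, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = r
        elif ink < min_ink and start is not None:
            lines.append((start, r))
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines

page = np.zeros((12, 20), dtype=int)
page[2:4, 5:15] = 1       # first text line
page[7:10, 3:18] = 1      # second text line
print(segment_lines(page))  # [(2, 4), (7, 10)]
```

Word and character segmentation repeat the same idea on vertical projections within each detected line.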
This study proposes a convolutional neural network (CNN)-based identity recognition scheme using electrocardiograms (ECG) recorded at different water temperatures (WTs) during bathing, aiming to explore the impact of ECG length on the recognition rate. ECG data were collected using non-contact electrodes at five different WTs during bathing. Ten young student subjects (seven men and three women) participated in data collection. Three ECG recordings were collected at each preset bathtub WT for each subject. Each recording is 18 min long, with a sampling rate of 200 Hz. In total, 150 ECG recordings and 150 WT recordings were collected. The R peaks were detected from the processed ECG (baseline wandering eliminated, 50-Hz hum removed, ECG smoothing and normalization applied) and the QRS complex waves were segmented. These segmented waves were then transformed into binary images, which served as the datasets. For each subject, the training, validation, and test data were taken from the first, second, and third ECG recordings, respectively. The numbers of training and validation images were 84,297 and 83,734, respectively. In the test stage, preliminary classification results were obtained using the trained CNN model, and finer classification results were determined by the majority vote method based on the preliminary results. The validation rate was 98.71%. The recognition rates were 95.00% and 98.00% when the number of test heartbeats per subject was 7 and 17, respectively.
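The majority-vote refinement step is simply: classify each heartbeat independently, then assign the session the most frequent per-beat label. A minimal sketch with hypothetical subject labels:

```python
from collections import Counter

def majority_vote(beat_predictions):
    # Session-level identity = most frequent per-heartbeat prediction.
    return Counter(beat_predictions).most_common(1)[0][0]

# Hypothetical per-beat CNN outputs for one 7-beat test session:
print(majority_vote(["s3", "s3", "s7", "s3", "s1", "s3", "s3"]))  # s3
```

This is why the recognition rate rises with the number of test heartbeats: individual misclassified beats are outvoted as the sample grows.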
Image-based individual dairy cattle recognition has gained much attention recently. In order to further improve the accuracy of individual dairy cattle recognition, an algorithm based on a deep convolutional neural network (DCNN) is proposed in this paper, which enables automatic feature extraction and classification that outperforms traditional hand-crafted features. Through multi-group comparison experiments covering different numbers of network layers, different convolution kernel sizes, and different feature dimensions in the fully connected layer, we demonstrate that the proposed method is suitable for dairy cattle classification. The experimental results show that the accuracy is significantly higher compared to two traditional image processing algorithms: the scale-invariant feature transform (SIFT) algorithm and the bag-of-features (BOF) model.
Biometric security systems based on facial characteristics face a challenging task due to variability in the intrapersonal facial appearance of subjects, traced to factors such as pose, illumination, expression, and aging. This paper innovates by proposing a deep learning and set-based approach to face recognition subject to aging. The images of each subject taken at various times are treated as a single set, which is then compared to sets of images belonging to other subjects. Facial features are extracted using a convolutional neural network characteristic of deep learning. Our experimental results show that set-based recognition performs better than the singleton-based approach for both face identification and face verification. We also find that with set-based recognition, it is easier to recognize older subjects from younger ones than younger subjects from older ones.
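One simple way to score two image sets against each other is the closest pair of CNN embeddings across the sets (the min-min rule; other set-matching rules exist, and the paper's exact rule is not reproduced here):

```python
import numpy as np

def set_distance(set_a, set_b):
    # Set-to-set matching score: distance of the closest embedding pair
    # across the two image sets (min-min rule, one of several options).
    return min(float(np.linalg.norm(a - b)) for a in set_a for b in set_b)

# Hypothetical 2-D embeddings of one subject photographed years apart:
young = [np.array([0.0, 0.0]), np.array([0.2, 0.1])]
old = [np.array([1.0, 1.0]), np.array([0.3, 0.1])]
print(round(set_distance(young, old), 3))  # 0.1
```

Treating all of a subject's images as one set lets any single age-matched pair carry the match, which is the intuition behind set-based recognition outperforming singleton comparison under aging.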
Many existing multi-view gait recognition methods based on images or video sequences superimpose and synthesize gait sequences into energy-like templates. However, information may be lost while compositing images and capturing EMG signals, and errors may be introduced and recognition accuracy affected by factors such as period detection. To better solve these problems, a multi-view gait recognition method using a deep convolutional neural network and a channel attention mechanism is proposed. First, the sliding time window method is used to capture EMG signals. Then, the back-propagation learning algorithm is used to train each convolution layer, which improves the learning ability of the convolutional neural network. Finally, the channel attention mechanism is integrated into the neural network, which improves the expression of gait features, and a classifier is used to classify the gait. As shown by the experimental results on two public datasets, OULP and CASIA-B, the recognition rates of the proposed method reach 88.44% and 97.25%, respectively. Comparative experiments show that the proposed method achieves better recognition than several other, newer convolutional neural network methods. Therefore, the combination of a convolutional neural network and a channel attention mechanism is of great value for gait recognition.
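Channel attention of the squeeze-and-excitation style reweights a feature map's channels using a small gating network: pool each channel globally, pass through a bottleneck, and apply sigmoid gates. A numpy sketch with random stand-in weights (illustrating the mechanism, not the paper's trained module):

```python
import numpy as np

def channel_attention(fmap, w1, w2):
    # SE-style gate on a (C, H, W) feature map: global average pool per
    # channel, bottleneck MLP, then sigmoid reweighting of each channel.
    s = fmap.mean(axis=(1, 2))               # squeeze: (C,)
    h = np.maximum(0.0, w1 @ s)              # excitation with ReLU
    g = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # per-channel gates in (0, 1)
    return fmap * g[:, None, None]

rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 4, 4))
w1, w2 = rng.normal(size=(2, 8)), rng.normal(size=(8, 2))
out = channel_attention(fmap, w1, w2)
print(out.shape)  # (8, 4, 4)
```

Because every gate lies in (0, 1), the module can only attenuate channels, letting the network learn which feature channels express gait most strongly.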
In recent years, deep convolutional neural networks have exhibited excellent performance in computer vision and had a far-reaching impact. Traditional plant taxonomic identification requires high expertise and is time-consuming. Most nature reserves have problems such as incomplete species surveys, inaccurate taxonomic identification, and untimely updating of status data. Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology and exploring the best network model. Taking 24 typical desert plant species that are widely distributed in the nature reserves of the Xinjiang Uygur Autonomous Region of China as the research objects, this study established an image database and selected the optimal network model for the image recognition of desert plant species, to provide decision support, via deep learning, for fine-grained management tasks in the nature reserves of Xinjiang such as species investigation and monitoring. Since desert plant species are not included in public datasets, the images used in this study were mainly obtained through field photography and downloaded from the Plant Photo Bank of China (PPBC). After sorting and statistical analysis, a total of 2331 plant images were collected (2071 from field collection and 260 from the PPBC), covering 24 plant species belonging to 14 families and 22 genera. A large number of numerical experiments were carried out to compare, from different perspectives, a series of 37 well-performing convolutional neural network models, to find the optimal network model most suitable for the image recognition of desert plant species in Xinjiang. The results revealed 24 models with a recognition Accuracy greater than 70.000%. Among them, RegNetX_8GF performs the best, with Accuracy, Precision, Recall, and F1 (the harmonic mean of the Precision and Recall values) of 78.33%, 77.65%, 69.55%, and 71.26%, respectively. Considering hardware requirements and inference time, MobileNetV2 achieves the best balance among accuracy, the number of parameters, and the number of floating-point operations: its parameter count is 1/16 that of RegNetX_8GF, and its floating-point operation count is 1/24. Our findings can facilitate efficient decision-making for species survey, cataloging, inspection, and monitoring in the nature reserves of Xinjiang, providing a scientific basis for the protection and utilization of natural plant resources.
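The F1 metric quoted above is the harmonic mean of precision and recall. (Note that a macro-averaged F1 computed per class, as is typical in multi-class evaluation, need not equal the harmonic mean of the aggregate precision and recall figures.) A one-line sketch:

```python
def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall; it sits at or below
    # the arithmetic mean, penalizing imbalance between the two.
    return 2.0 * precision * recall / (precision + recall)

print(round(f1(0.5, 1.0), 4))  # 0.6667, well below the arithmetic mean 0.75
```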
Facial Expression Recognition (FER) has been an interesting area of research wherever there is human-computer interaction, since human psychology, emotions, and behaviors can be analyzed through FER. Classifiers used in FER have performed well on unoccluded faces but have been found to be limited on occluded faces. Recently, Deep Learning Techniques (DLT) have gained popularity in real-world applications, including the recognition of human emotions. The human face reflects emotional states and human intentions, and an expression is the most natural and powerful way of communicating non-verbally. Systems that mediate such communication are termed Human-Machine Interaction (HMI) systems, and FER can improve them, as human expressions convey useful information to an observer. This paper proposes a FER scheme called EECNN (Enhanced Convolution Neural Network with Attention mechanism) to recognize seven types of human emotions, with satisfying experimental results: the proposed EECNN achieved 89.8% accuracy in classifying the images.
The COVID-19 pandemic poses a serious public health threat due to little or no pre-existing human immunity, and developing a system to identify COVID-19 in its early stages would save millions of lives. This study applied support vector machine (SVM), k-nearest neighbor (K-NN), and deep learning convolutional neural network (CNN) algorithms to classify and detect COVID-19 from chest X-ray radiographs. To test the proposed system, chest X-ray radiographs and CT images were collected from different standard databases, comprising 95 normal images, 140 COVID-19 images, and 10 SARS images. Two scenarios were considered for developing a COVID-19 prediction system. In the first scenario, a Gaussian filter was applied to remove noise from the chest X-ray images, and then the adaptive region growing technique was used to segment the region of interest. After segmentation, a hybrid feature extraction composed of the 2D discrete wavelet transform (2D-DWT) and the gray-level co-occurrence matrix was utilized to extract features significant for detecting COVID-19, and these features were processed using SVM and K-NN. In the second scenario, a CNN transfer model (ResNet 50) was used to detect COVID-19. The system was examined and evaluated through multiclass statistical analysis, and the empirical results yielded 97.14%, 99.34%, 99.26%, 99.26%, and 99.40% for accuracy, specificity, sensitivity, recall, and AUC, respectively. Thus, the CNN model showed significant success, achieving optimal accuracy, effectiveness, and robustness for detecting COVID-19.
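A gray-level co-occurrence matrix (GLCM), half of the hybrid feature extractor in the first scenario, counts how often pairs of gray levels co-occur at a fixed pixel offset; texture features such as contrast derive from it. A minimal sketch with a small number of gray levels:

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    # Gray-level co-occurrence matrix for one pixel offset, normalized
    # into a joint probability table over (level_i, level_j) pairs.
    P = np.zeros((levels, levels))
    H, W = img.shape
    for r in range(H):
        for c in range(W):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < H and 0 <= c2 < W:
                P[img[r, c], img[r2, c2]] += 1.0
    return P / P.sum()

def contrast(P):
    # Standard GLCM texture feature: sum of (i - j)^2 weighted by P[i, j].
    i, j = np.indices(P.shape)
    return float(((i - j) ** 2 * P).sum())

flat = np.zeros((4, 4), dtype=int)            # constant patch
print(round(contrast(glcm(flat)), 6))  # 0.0: no texture, zero contrast
```

Diseased lung regions alter local texture statistics, which is why GLCM features combined with the wavelet transform can discriminate COVID-19 radiographs.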
Even though several advances have been made in recent years, handwritten script recognition remains a challenging task in the pattern recognition domain. The field has gained much interest lately due to its diverse application potential. Nowadays, different methods are available for automatic script recognition. Among the reported script recognition techniques, deep neural networks have achieved impressive results and outperformed classical machine learning algorithms. However, designing such networks from scratch incurs a significant amount of trial and error, which renders the process unfeasible: it often requires manual intervention and domain expertise, consuming substantial time and computational resources. To alleviate this shortcoming, this paper proposes a new neural architecture search approach based on meta-heuristic quantum particle swarm optimization (QPSO), which is capable of automatically evolving meaningful convolutional neural network (CNN) topologies. Computational experiments have been conducted on eight different datasets belonging to three popular Indic scripts, namely Bangla, Devanagari, and Dogri, consisting of handwritten characters and digits. Empirically, the results imply that the proposed QPSO-CNN algorithm outperforms classical and state-of-the-art methods with faster prediction and higher accuracy.
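In quantum-behaved PSO, each particle is resampled around a local attractor (a random blend of its personal best and the global best) with a contraction toward the swarm's mean-best position, rather than updated with a velocity term. A sketch of one generic QPSO position update on a continuous search space (the paper applies this idea to encoded CNN topologies, which is not reproduced here):

```python
import numpy as np

def qpso_step(X, pbest, gbest, beta=0.75, rng=None):
    # One quantum-behaved PSO update: each particle is drawn around a local
    # attractor p, contracted toward the swarm's mean-best position mbest.
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = X.shape
    mbest = pbest.mean(axis=0)                      # mean personal best
    phi = rng.random((n, d))
    p = phi * pbest + (1.0 - phi) * gbest           # local attractors
    u = rng.random((n, d))
    sign = np.where(rng.random((n, d)) < 0.5, -1.0, 1.0)
    return p + sign * beta * np.abs(mbest - X) * np.log(1.0 / u)

X = np.zeros((6, 2))                                # 6 particles in 2-D
pbest = np.ones((6, 2))
gbest = np.array([1.0, 1.0])
X_next = qpso_step(X, pbest, gbest)
print(X_next.shape)  # (6, 2)
```

The log(1/u) jump lengths give heavy-tailed exploration around the attractor, and the contraction coefficient beta controls how quickly the swarm collapses onto good regions.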
Objective: In tongue diagnosis, the location, color, and distribution of spots can be used to speculate on the affected viscera and the severity of the heat evil. This work focuses on artificial intelligence (AI) image analysis methods to study spotted tongue recognition in traditional Chinese medicine (TCM). Methods: A model for spotted tongue recognition and extraction is designed, based on the principles of deep learning for images and instance segmentation. The model includes multiscale feature map generation, region proposal search, and target region recognition. First, a deep convolutional network is used to build multiscale low- and high-abstraction feature maps, after which a target candidate box generation algorithm and selection strategy are used to select high-quality target candidate regions. Finally, a classification network is used for classifying the target regions and calculating the target region pixels, yielding the region segmentation of the spotted tongue. Various tongue images were taken by mobile phones under non-standard illumination conditions, and experiments were conducted. Results: The spotted tongue recognition achieved an area under the curve (AUC) of 92.40%, an accuracy of 84.30%, a sensitivity of 88.20%, a specificity of 94.19%, a recall of 88.20%, a regional pixel accuracy (PA) of 73.00%, a mean pixel accuracy (mPA) of 73.00%, an intersection over union (IoU) of 60.00%, and a mean intersection over union (mIoU) of 56.00%. Conclusion: The results verify that the model is suitable for application in a TCM tongue diagnosis system. Spotted tongue recognition via a multiscale convolutional neural network (CNN) would help to improve spot classification and the accurate extraction of spot-area pixels, as well as provide a practical method for intelligent TCM tongue diagnosis.
Funding: The Key Scientific Research Project of Anhui Provincial Research Preparation Plan in 2023 (Nos. 2023AH051806, 2023AH052097, 2023AH052103), the Anhui Province Quality Engineering Project (Nos. 2022sx099, 2022cxtd097), the University-Level Teaching and Research Key Projects (Nos. ch21jxyj01, XLZ-202208, XLZ-202106), and the Special Support Plan for Innovation and Entrepreneurship Leaders in Anhui Province.
Abstract: Convolutional neural networks struggle to handle changes in angle and twists in the orientation of images, which affects their ability to recognize patterns based on internal feature hierarchies. In contrast, CapsNet overcomes these limitations by vectorizing information, encoding both direction and magnitude so that spatial information is not overlooked. Therefore, this study proposes a novel expression recognition technique called CAPSULE-VGG, which combines the strengths of CapsNet and convolutional neural networks. By refining and integrating features extracted by a convolutional neural network before introducing them into CapsNet, our model enhances facial recognition capabilities. Compared to traditional neural network models, our approach offers a faster training pace, improved convergence speed, and higher accuracy rates approaching stability. Experimental results demonstrate that our method achieves recognition rates of 74.14% on the FER2013 expression dataset and 99.85% on the CK+ expression dataset. By contrasting these findings with those obtained using conventional expression recognition techniques, and by incorporating CapsNet's advantages, we effectively address issues associated with convolutional neural networks while increasing expression recognition accuracy.
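The vector-length encoding that lets CapsNet keep spatial information rests on the "squash" nonlinearity, which shrinks a capsule's output vector into the unit ball while preserving its direction. A minimal sketch (a generic CapsNet component, not the authors' full CAPSULE-VGG model):

```python
import math

def squash(v):
    """CapsNet 'squash' nonlinearity: scales a capsule's output vector to a
    length in [0, 1) while preserving its direction, so the length can be
    read as the probability that the encoded entity is present."""
    sq_norm = sum(x * x for x in v)
    norm = math.sqrt(sq_norm)
    if norm == 0.0:
        return [0.0] * len(v)
    scale = sq_norm / (1.0 + sq_norm) / norm
    return [scale * x for x in v]

out = squash([3.0, 4.0])  # input norm 5 -> output norm 25/26, direction kept
```

Long vectors saturate toward length 1 while short ones are suppressed toward 0, which is what makes the length interpretable as a presence probability.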
Funding: National Natural Science Foundation of China under Grant No. 61973037 and China Postdoctoral Science Foundation under Grant No. 2022M720419.
Abstract: Automatic modulation recognition (AMR) of radiation source signals is a research focus in the field of cognitive radio. However, the AMR of radiation source signals at low SNRs still faces a great challenge. Therefore, an AMR method for radiation source signals based on a two-dimensional data matrix and an improved residual neural network is proposed in this paper. First, the time series of the radiation source signals are reconstructed into a two-dimensional data matrix, which greatly simplifies signal preprocessing. Second, a residual neural network based on depthwise convolution and large convolutional kernels (DLRNet) is proposed to improve the feature extraction capability of the AMR model. Finally, the model performs feature extraction and classification on the two-dimensional data matrix to obtain the recognition vector that represents the signal modulation type. Theoretical analysis and simulation results show that this method significantly improves AMR accuracy: the recognition accuracy of the proposed method remains above 90% even at -14 dB SNR.
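The "reconstruct the time series into a two-dimensional data matrix" step amounts to folding a 1-D sample stream row by row into a fixed-size grid. A minimal sketch (the matrix dimensions here are illustrative, not the paper's actual settings):

```python
def to_matrix(signal, rows, cols):
    """Fold a 1-D sample sequence into a rows x cols data matrix,
    zero-padding the tail, so an image-style CNN can consume raw samples
    with essentially no other preprocessing."""
    n = rows * cols
    padded = list(signal[:n]) + [0.0] * max(0, n - len(signal))
    return [padded[r * cols:(r + 1) * cols] for r in range(rows)]

matrix = to_matrix(range(10), rows=3, cols=4)
# -> [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0.0, 0.0]]
```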
Abstract: In recent years, wearable-device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, leading to the extraction of only basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. Firstly, the required signals and sensor data are accumulated from standard databases. From these signals, wave features are retrieved. Then the extracted wave features and sensor data are given as input to recognize the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by incorporating a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is suggested to fine-tune the network parameters and enhance the recognition process. An experimental evaluation is performed on various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with 95.36% accuracy, 95.25% recall, 95.48% specificity, and 95.47% precision. The results prove that the developed model recognizes human actions effectively while taking less time. Additionally, it reduces computational complexity and the overfitting issue by using an optimization approach.
Abstract: The ever-growing volume of available visual data (i.e., videos and pictures uploaded by internet users) has attracted the attention of the computer vision research community. Therefore, finding efficient solutions to extract knowledge from these sources is imperative. Recently, the BlazePose system has been released for skeleton extraction from images, oriented to mobile devices. With this skeleton graph representation in place, a Spatial-Temporal Graph Convolutional Network (ST-GCN) can be implemented to predict the action. We hypothesize that just by changing the skeleton input data to a different set of joints that offers more information about the action of interest, it is possible to increase the performance of the Spatial-Temporal Graph Convolutional Network for HAR tasks. Hence, in this study, we present the first implementation of the BlazePose skeleton topology upon this architecture for action recognition. Moreover, we propose the Enhanced-BlazePose topology, which achieves better results than its predecessor. Additionally, we propose different skeleton detection thresholds that can improve accuracy even further. We reached a top-1 accuracy of 40.1% on the Kinetics dataset. For the NTU-RGB+D dataset, we achieved 87.59% and 92.1% accuracy for the Cross-Subject and Cross-View evaluation criteria, respectively.
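Swapping the skeleton topology, as described above, comes down to handing the ST-GCN a different adjacency matrix built from a different joint/edge list. A toy sketch of that structural input (the 5-joint chain is illustrative, not the real BlazePose layout):

```python
def skeleton_adjacency(num_joints, edges):
    """Symmetric adjacency matrix with self-loops for a skeleton graph:
    the structural prior an ST-GCN spatial convolution aggregates over.
    Changing `edges` is exactly how a new topology is plugged in."""
    A = [[0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        A[i][i] = 1                    # self-loop keeps each joint's own feature
    for i, j in edges:
        A[i][j] = A[j][i] = 1          # undirected bone connection
    return A

# toy 5-joint chain; joint indices are hypothetical
A = skeleton_adjacency(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
```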
Funding: Supported by the General Project of Philosophy and Social Science Research in Colleges and Universities in Jiangsu Province (2022SJYB0712), the Research Development Fund for Young Teachers of Chengxian College of Southeast University (z0037), and the Special Project of Ideological and Political Education Reform and Research Course (yjgsz2206).
Abstract: This study aims to reduce the interference of ambient noise in mobile communication, improve the accuracy and authenticity of information transmitted by sound, and guarantee the accuracy of voice information delivered by mobile communication. First, the principles and techniques of speech enhancement are analyzed, and a fast lateral recursive least squares (FLRLS) method is adopted to process the sound data. Then, a convolutional neural network (CNN)-based noise recognition algorithm (NR-CNN) and a speech enhancement model are proposed. Finally, related experiments are designed to verify the performance of the proposed algorithm and model. The experimental results show that the noise classification accuracy of the NR-CNN algorithm is higher than 99.82%, and the recall rate and F1 value are also higher than 99.92. The proposed sound enhancement model can effectively enhance the original sound in the presence of noise interference. After the CNN is incorporated, the average perceptual quality evaluation score over all noisy sounds improves by over 21% compared with the traditional noise reduction method. The proposed algorithm can adapt to a variety of voice environments and can simultaneously perform enhancement and noise reduction on many different types of voice signals, with better processing results than traditional sound enhancement models. In addition, the sound distortion index of the proposed speech enhancement model is lower than that of the control group, indicating that adding the CNN is less likely to cause sound signal distortion in various sound environments and shows superior robustness. In summary, the proposed CNN-based speech enhancement model shows significant enhancement effects, stable performance, and strong adaptability. This study provides a reference and basis for research applying neural networks to speech enhancement.
Abstract: Accurate handwriting recognition has been a challenging computer vision problem, because static feature analysis of text images is often inadequate to account for the high variance in handwriting styles across people and the poor quality of handwritten text images. Recently, by introducing machine learning, especially convolutional neural networks (CNNs), the recognition accuracy for various handwriting patterns has steadily improved. In this paper, a deep CNN model is developed to further improve the recognition rate on the MNIST handwritten digit dataset with a fast convergence rate in training. The proposed model has a multi-layer deep structure, including 3 convolution-and-activation layers for feature extraction and 2 fully connected layers (i.e., dense layers) for classification. The model's hyperparameters, such as the batch size, kernel sizes, batch normalization, activation function, and learning rate, are optimized to enhance recognition performance. The average classification accuracy of the proposed methodology reaches 99.82% on the training dataset and 99.40% on the testing dataset, making it a nearly error-free system for MNIST recognition.
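When tuning kernel sizes for a stack of convolution layers like the one above, the spatial bookkeeping follows one formula. A small helper illustrating it (the three unpadded 3x3 layers are an assumed configuration, not necessarily the paper's tuned one):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a convolution or pooling layer:
    floor((W - K + 2P) / S) + 1."""
    return (size - kernel + 2 * padding) // stride + 1

# a 28x28 MNIST digit through three unpadded 3x3 convolutions
size = 28
for _ in range(3):
    size = conv_out(size, kernel=3)
print(size)  # 22
```

The same helper shows why 'same' padding (P = 1 for a 3x3 kernel) preserves the 28x28 size, and why a stride-2 layer halves it.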
Abstract: With the continuous progress of the times and the development of technology, the rise of social media networks has brought "explosive" growth in image data. As one of the main channels of daily communication, images are widely used as a carrier of communication because of their rich content and intuitive nature. Image recognition based on convolutional neural networks was among the first applications in the image recognition field: a series of operations such as image feature extraction, recognition, and convolution are used to identify and analyze different images. The rapid development of artificial intelligence makes machine learning more and more important in this research field; algorithms learn from each piece of data and predict the outcome, which has become an important key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but associating the low-level information in an image with high-level image semantics is the key problem of image recognition. Predecessors have provided many model algorithms, which have laid a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on VGG16 is an improvement over fully connected neural networks. Unlike a fully connected network, a convolutional neural network does not fully connect each layer of neurons but instead connects only some nodes. Although this method reduces computation time, the convolutional neural network model loses some useful feature information in the process of propagation and calculation. This paper therefore improves the model into a multi-level information fusion convolution method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the number of channels. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database, and the CASIA Face Image Database.
Funding: This work was funded by the National Natural Science Foundation of China under Grants Nos. 61772152 and 61502037, the Basic Research Project (Nos. JCKY2016206B001, JCKY2014206C002, JCKY2017604C010), and the Technical Foundation Project (No. JSQB2017206C002).
Abstract: Underwater target recognition is a key technology for underwater acoustic countermeasures. How to classify and recognize underwater targets according to their noise signatures has been a hot topic in the field of underwater acoustic signals. In this paper, a deep learning model is applied to underwater target recognition. Improved anti-noise Power-Normalized Cepstral Coefficients (ia-PNCC) are proposed, based on PNCC adapted to underwater noise. Multitaper processing and normalized Gammatone filter banks are applied to improve the anti-noise capacity. The method is combined with a convolutional neural network to recognize the underwater target. Experimental results show that the acoustic features presented by ia-PNCC carry less noise and are well suited to underwater target recognition using a convolutional neural network. Compared with combining a convolutional neural network with a single acoustic feature, such as MFCC (Mel-scale Frequency Cepstral Coefficients) or LPCC (Linear Prediction Cepstral Coefficients), the combination of ia-PNCC with a convolutional neural network offers better accuracy for underwater target recognition.
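One ingredient of standard PNCC's noise robustness, which the variant above builds on, is replacing MFCC's logarithm with a power-law compression of the filter-bank energies. A sketch of just that step (the full PNCC chain also includes medium-time power bias subtraction, which is omitted here):

```python
def power_law(filterbank_energies, exponent=1.0 / 15.0):
    """PNCC-style power-law compression y = x**(1/15). Unlike the log used
    in MFCC, it stays finite as the energy approaches 0, so low-energy
    noisy channels do not blow up the feature values."""
    return [e ** exponent for e in filterbank_energies]

compressed = power_law([0.0, 1.0, 1e-6])
```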
Funding: Supported by the National Key R&D Program of China (2017YFB0802900).
Abstract: How to correctly acquire appropriate features is a primary problem in the field of network protocol recognition. To avoid the trouble of manually extracting features, as in traditional methods, and to improve recognition accuracy, a network protocol recognition method based on a Convolutional Neural Network (CNN) is proposed. The method utilizes deep learning and processes network flows automatically. Firstly, normalization is performed on the intercepted network flows, which are mapped into a two-dimensional matrix used as the input of the CNN. Then, an improved classification model named PtrCNN is built, which can automatically extract the appropriate features of network protocols. Finally, the classification model is trained to recognize the network protocols. The proposed approach is compared with several machine learning methods. Experimental results show that the tailored CNN not only improves protocol recognition accuracy but also ensures fast convergence of the classification model and reduces classification time.
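The normalize-and-map step described above can be sketched as: take a fixed number of leading bytes from each flow, zero-pad, scale into [0, 1], and fold into a square matrix. The 32x32 default below is an assumed size for illustration, not necessarily the paper's:

```python
def flow_to_matrix(payload, side=32):
    """Map the first side*side bytes of a captured network flow to a
    side x side matrix of floats in [0, 1] (zero-padded) -- the image-like
    normalized input fed to the CNN classifier."""
    n = side * side
    data = list(payload[:n]) + [0] * max(0, n - len(payload))
    return [[b / 255.0 for b in data[r * side:(r + 1) * side]]
            for r in range(side)]

m = flow_to_matrix(b"\x00\x7f\xff" * 100, side=8)
```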
Abstract: Scene recognition is a popular open problem in computer vision. Among the many methods proposed in recent years, Convolutional Neural Network (CNN)-based approaches achieve the best performance in scene recognition. In this paper we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Networks (Multi-CNN) for scene recognition. Unlike existing works that usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied. Firstly, we split the training images in two directions and feed them to three deep CNN models, and then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused with different fusion strategies in groups and forwarded into a SoftMax classifier. Our proposed algorithm is evaluated on three scene datasets. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
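A simple instance of the fusion idea above is normalize-then-concatenate: L2-normalize each model's feature vector so that models with different feature scales contribute comparably, then join them into one descriptor for the classifier. This is one of several plausible fusion strategies, not necessarily the exact one used in the paper:

```python
import math

def fuse_features(vectors):
    """Late fusion: L2-normalize each CNN's feature vector, then
    concatenate, so no single network dominates the fused descriptor
    handed to the SoftMax classifier."""
    fused = []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard zero vectors
        fused.extend(x / norm for x in v)
    return fused

f = fuse_features([[3.0, 4.0], [0.0, 2.0, 0.0]])
print(f)  # [0.6, 0.8, 0.0, 1.0, 0.0]
```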
Abstract: This paper presents a handwritten document recognition system based on the convolutional neural network technique. In today's world, handwritten document recognition is rapidly attracting the attention of researchers due to its promise as an assistive technology for visually impaired users. This technology is also helpful for automatic data entry systems. For the proposed system we prepared a dataset of English-language handwritten character images. The proposed system has been trained on a large set of sample data and tested on sample images of user-defined handwritten documents. In this research, multiple experiments produced very promising recognition results. The proposed system first performs image pre-processing stages to prepare data for training the convolutional neural network. After this processing, the input document is segmented using line, word, and character segmentation. The proposed system achieves character segmentation accuracy of up to 86%. The segmented characters are then sent to a convolutional neural network for recognition. The recognition and segmentation techniques proposed in this paper provide highly accurate results on the given dataset. The proposed approach reaches an accuracy of up to 93% during convolutional neural network training, with validation accuracy slightly lower at 90.42%.
Funding: This study is supported in part by the University of Aizu's Competitive Research Fund (2020-P-24).
Abstract: This study proposes a convolutional neural network (CNN)-based identity recognition scheme using electrocardiogram (ECG) signals recorded at different water temperatures (WTs) during bathing, aiming to explore the impact of ECG length on the recognition rate. ECG data were collected using non-contact electrodes at five different WTs during bathing. Ten young student subjects (seven men and three women) participated in data collection. Three ECG recordings were collected at each preset bathtub WT for each subject. Each recording is 18 min long, with a sampling rate of 200 Hz. In total, 150 ECG recordings and 150 WT recordings were collected. The R peaks were detected from the processed ECG (baseline wander eliminated, 50-Hz hum removed, ECG smoothed and normalized) and the QRS complex waves were segmented. These segmented waves were then transformed into binary images, which served as the datasets. For each subject, the training, validation, and test data were taken from the first, second, and third ECG recordings, respectively. The numbers of training and validation images were 84297 and 83734, respectively. In the test stage, preliminary classification results were obtained using the trained CNN model, and finer classification results were determined using the majority vote method based on the preliminary results. The validation rate was 98.71%. The recognition rates were 95.00% and 98.00% when the number of test heartbeats per subject was 7 and 17, respectively.
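The majority-vote stage described above is simple but is what links ECG length to the recognition rate: each test heartbeat contributes one per-beat CNN prediction, and a longer segment means more votes. A minimal sketch (subject labels are hypothetical):

```python
from collections import Counter

def majority_vote(per_beat_ids):
    """Final identity = most frequent per-heartbeat CNN prediction.
    More test heartbeats give more votes and a more reliable decision,
    consistent with the accuracy gain from 7 to 17 beats."""
    return Counter(per_beat_ids).most_common(1)[0][0]

winner = majority_vote(["s3", "s1", "s3", "s3", "s2", "s3", "s1"])
print(winner)  # s3
```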
Funding: Science and Technology Support Plan Project of the Tianjin Municipal Science and Technology Commission (No. 15ZCZDNC00130).
Abstract: Image-based individual dairy cattle recognition has gained much attention recently. In order to further improve the accuracy of individual dairy cattle recognition, an algorithm based on a deep convolutional neural network (DCNN) is proposed in this paper, which enables automatic feature extraction and classification that outperforms traditional hand-crafted features. Through multi-group comparison experiments covering different numbers of network layers, different convolution kernel sizes, and different feature dimensions in the fully connected layer, we demonstrate that the proposed method is suitable for dairy cattle classification. The experimental results show that the accuracy is significantly higher than that of two traditional image processing algorithms: the scale-invariant feature transform (SIFT) algorithm and the bag-of-features (BOF) model.
Abstract: Biometric security systems based on facial characteristics face a challenging task due to variability in the intra-personal facial appearance of subjects, traced to factors such as pose, illumination, expression, and aging. This paper innovates by proposing a deep learning and set-based approach to face recognition subject to aging. The images of each subject taken at various times are treated as a single set, which is then compared to sets of images belonging to other subjects. Facial features are extracted using a convolutional neural network, characteristic of deep learning. Our experimental results show that set-based recognition performs better than the singleton-based approach for both face identification and face verification. We also find that with set-based recognition it is easier to recognize older subjects from younger ones than younger subjects from older ones.
Funding: This work was supported by the Natural Science Foundation of China (No. 61902133), the Fujian Natural Science Foundation (No. 2018J05106), and the Xiamen Collaborative Innovation Project of Industry-University-Research (3502Z20173046).
Abstract: In many existing multi-view gait recognition methods based on images or video sequences, gait sequences are superimposed to synthesize energy-like templates. However, information may be lost in the process of compositing images and capturing EMG signals, and errors introduced by factors such as period detection may affect recognition accuracy. To better solve these problems, a multi-view gait recognition method using a deep convolutional neural network and a channel attention mechanism is proposed. Firstly, the sliding time window method is used to capture EMG signals. Then, the back-propagation learning algorithm is used to train each convolutional layer, which improves the learning ability of the convolutional neural network. Finally, the channel attention mechanism is integrated into the neural network, which improves its ability to express gait features, and a classifier is used to classify gaits. As shown by experimental results on two public datasets, OULP and CASIA-B, the recognition rate of the proposed method reaches 88.44% and 97.25%, respectively. The comparative experimental results show that the proposed method achieves a better recognition effect than several newer convolutional neural network methods. Therefore, the combination of a convolutional neural network and a channel attention mechanism is of great value for gait recognition.
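The sliding-time-window capture mentioned first can be sketched in one line: cut the continuous EMG stream into fixed-width, overlapping windows, each of which becomes one sample for the network. Width and step below are illustrative values, not the paper's settings:

```python
def sliding_windows(emg, width, step):
    """Segment a continuous EMG stream into fixed-width, possibly
    overlapping windows for per-window feature extraction; step < width
    yields overlap and hence more training samples."""
    return [emg[i:i + width] for i in range(0, len(emg) - width + 1, step)]

windows = sliding_windows(list(range(10)), width=4, step=2)
print(len(windows))  # 4 windows: starts at samples 0, 2, 4, 6
```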
Funding: Supported by the West Light Foundation of the Chinese Academy of Sciences (2019-XBQNXZ-A-007) and the National Natural Science Foundation of China (12071458, 71731009).
Abstract: In recent years, deep convolutional neural networks have exhibited excellent performance in computer vision and have had a far-reaching impact. Traditional plant taxonomic identification requires high expertise and is time-consuming. Most nature reserves have problems such as incomplete species surveys, inaccurate taxonomic identification, and untimely updating of status data. Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology and exploring the best network model. Taking 24 typical desert plant species that are widely distributed in the nature reserves of the Xinjiang Uygur Autonomous Region of China as the research objects, this study established an image database and selected the optimal network model for the image recognition of desert plant species, in order to provide decision support, through deep learning, for fine-grained management in the nature reserves of Xinjiang, such as species investigation and monitoring. Since desert plant species are not included in public datasets, the images used in this study were mainly obtained through field photography and downloads from the Plant Photo Bank of China (PPBC). After sorting and statistical analysis, a total of 2331 plant images were collected (2071 from field collection and 260 from the PPBC), covering 24 plant species belonging to 14 families and 22 genera. A large number of numerical experiments were also carried out to compare, from different perspectives, a series of 37 well-performing convolutional neural network models, to find the optimal network model most suitable for the image recognition of desert plant species in Xinjiang. The results revealed 24 models with a recognition accuracy greater than 70.000%. Among these, RegNetX_8GF performed the best, with accuracy, precision, recall, and F1 (the harmonic mean of precision and recall) values of 78.33%, 77.65%, 69.55%, and 71.26%, respectively. Considering hardware requirements and inference time, MobileNetV2 achieves the best balance among accuracy, the number of parameters, and the number of floating-point operations: its number of parameters is 1/16 that of RegNetX_8GF, and its number of floating-point operations is 1/24. Our findings can facilitate efficient decision-making for the management of species surveys, cataloging, inspection, and monitoring in the nature reserves of Xinjiang, providing a scientific basis for the protection and utilization of natural plant resources.
Abstract: Facial Expression Recognition (FER) has been an interesting area of research wherever there is human-computer interaction, since human psychology, emotions, and behaviors can be analyzed through FER. Classifiers used in FER have performed well on unoccluded faces but have been found to be constrained on occluded faces. Recently, Deep Learning Techniques (DLT) have gained popularity in applications to real-world problems, including the recognition of human emotions. The human face reflects emotional states and human intentions, and an expression is the most natural and powerful way of communicating non-verbally. Systems that form communication between the two are termed Human-Machine Interaction (HMI) systems. FER can improve HMI systems, as human expressions convey useful information to an observer. This paper proposes a FER scheme called EECNN (Enhanced Convolutional Neural Network with Attention mechanism) to recognize seven types of human emotions, with satisfying experimental results. The proposed EECNN achieved 89.8% accuracy in classifying the images.
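The attention mechanism named in EECNN can be illustrated with a stripped-down squeeze-and-excitation-style channel gate: squeeze each channel to a scalar summary, pass it through a sigmoid, and rescale the channel. This is a generic toy version with no learned weights, not the paper's actual attention module:

```python
import math

def channel_attention(feature_maps):
    """Minimal channel attention: squeeze each channel (a 2-D map) to its
    mean activation, gate with a sigmoid, and rescale the channel. A real
    SE block inserts small learned FC layers before the sigmoid; they are
    omitted here for clarity."""
    out = []
    for ch in feature_maps:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))   # gate in (0, 1)
        out.append([[gate * v for v in row] for row in ch])
    return out

scaled = channel_attention([[[0.0, 0.0]], [[2.0, 2.0]]])
```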
Abstract: The COVID-19 pandemic poses a serious public health threat due to little or no pre-existing human immunity, and developing a system to identify COVID-19 in its early stages could save millions of lives. This study applied support vector machine (SVM), k-nearest neighbor (K-NN), and deep learning convolutional neural network (CNN) algorithms to classify and detect COVID-19 using chest X-ray radiographs. To test the proposed system, chest X-ray radiographs and CT images were collected from different standard databases, which contained 95 normal images, 140 COVID-19 images, and 10 SARS images. Two scenarios were considered for predicting COVID-19. In the first scenario, a Gaussian filter was applied to remove noise from the chest X-ray radiographs, and then the adaptive region growing technique was used to segment the region of interest. After segmentation, a hybrid feature extraction composed of the 2D discrete wavelet transform (2D-DWT) and the gray-level co-occurrence matrix was utilized to extract the features significant for detecting COVID-19. These features were processed using SVM and K-NN. In the second scenario, a CNN transfer model (ResNet 50) was used to detect COVID-19. The system was examined and evaluated through multi-class statistical analysis, and the empirical results of the analysis found significant values of 97.14%, 99.34%, 99.26%, 99.26%, and 99.40% for accuracy, specificity, sensitivity, recall, and AUC, respectively. Thus, the CNN model showed significant success, achieving optimal accuracy, effectiveness, and robustness for detecting COVID-19.
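The gray-level co-occurrence matrix used in the first scenario is a small, fully specified computation: for a chosen pixel offset, count how often gray level i neighbors gray level j. A minimal sketch for one horizontal offset (real pipelines quantize to more levels and average several offsets):

```python
def glcm(image, levels, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one pixel offset (dx, dy):
    M[i][j] counts how often level i has level j as its (dx, dy) neighbor.
    Haralick texture statistics (contrast, energy, ...) are then derived
    from this matrix."""
    M = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                M[image[y][x]][image[ny][nx]] += 1
    return M

M = glcm([[0, 0, 1],
          [1, 2, 2]], levels=3)   # horizontal pairs: (0,0) (0,1) (1,2) (2,2)
```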
Abstract: Even though several advances have been made in recent years, handwritten script recognition is still a challenging task in the pattern recognition domain. This field has gained much interest lately due to its diverse application potential. Nowadays, different methods are available for automatic script recognition. Among most of the reported script recognition techniques, deep neural networks have achieved impressive results and outperformed classical machine learning algorithms. However, designing such networks from scratch involves a significant amount of trial and error, which makes the process unfeasible: it often requires manual intervention by domain experts and consumes substantial time and computational resources. To alleviate this shortcoming, this paper proposes a new neural architecture search approach based on meta-heuristic quantum particle swarm optimization (QPSO), which is capable of automatically evolving meaningful convolutional neural network (CNN) topologies. Computational experiments were conducted on eight datasets belonging to three popular Indic scripts, namely Bangla, Devanagari, and Dogri, consisting of handwritten characters and digits. Empirically, the results imply that the proposed QPSO-CNN algorithm outperforms classical and state-of-the-art methods with faster prediction and higher accuracy.
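The QPSO engine behind the architecture search can be sketched as one position update in its common delta-potential-well form. In QPSO-CNN each position would encode a CNN topology; here positions are plain real vectors, and the contraction coefficient `beta` is an illustrative value:

```python
import math
import random

def qpso_step(positions, pbests, gbest, beta=0.75, rng=random):
    """One quantum-behaved PSO update: each coordinate moves to a random
    attractor between its personal best and the global best, plus a
    heavy-tailed jump scaled by the distance to the swarm's mean best
    (mbest). Unlike classical PSO there are no velocities."""
    dims = len(gbest)
    mbest = [sum(p[d] for p in pbests) / len(pbests) for d in range(dims)]
    updated = []
    for x, p in zip(positions, pbests):
        nx = []
        for d in range(dims):
            phi = rng.random()
            attractor = phi * p[d] + (1.0 - phi) * gbest[d]
            u = 1.0 - rng.random()          # u in (0, 1], keeps log finite
            jump = beta * abs(mbest[d] - x[d]) * math.log(1.0 / u)
            nx.append(attractor + jump if rng.random() < 0.5 else attractor - jump)
        updated.append(nx)
    return updated
```

When every particle sits on the global best, mbest coincides with each position, the jump term vanishes, and the swarm stays put, which is the fixed point the search contracts toward.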
Funding: Anhui Province College Natural Science Fund Key Project of China (KJ2020ZD77) and the Project of the Education Department of Anhui Province (KJ2020A0379).
Abstract: Objective In tongue diagnosis, the location, color, and distribution of spots can be used to speculate on the affected viscera and the severity of the heat evil. This work focuses on artificial intelligence (AI) image analysis methods to study spotted tongue recognition in traditional Chinese medicine (TCM). Methods A model for spotted tongue recognition and extraction is designed, based on the principles of deep learning and instance segmentation. The model includes multiscale feature map generation, region proposal search, and target region recognition. Firstly, a deep convolutional network is used to build multiscale low- and high-abstraction feature maps, after which a target candidate box generation algorithm and selection strategy are used to select high-quality target candidate regions. Finally, a classification network is used for classifying the target regions and calculating target region pixels, yielding the region segmentation of the spotted tongue. Under non-standard illumination conditions, various tongue images were taken with mobile phones, and experiments were conducted. Results The spotted tongue recognition achieved an area under the curve (AUC) of 92.40%, an accuracy of 84.30%, a sensitivity of 88.20%, a specificity of 94.19%, a recall of 88.20%, a regional pixel accuracy (PA) of 73.00%, a mean pixel accuracy (mPA) of 73.00%, an intersection over union (IoU) of 60.00%, and a mean intersection over union (mIoU) of 56.00%. Conclusion The results of the study verify that the model is suitable for application in a TCM tongue diagnosis system. Spotted tongue recognition via a multiscale convolutional neural network (CNN) would help to improve spot classification and the accurate extraction of spot-area pixels, as well as provide a practical method for intelligent tongue diagnosis in TCM.
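The IoU and mIoU figures reported above have a one-line definition worth making concrete: the overlap between predicted and ground-truth masks divided by their union. A minimal sketch on flattened binary masks:

```python
def iou(pred, truth):
    """Intersection over union of two flattened binary masks, the overlap
    metric reported for the spot-region segmentation; mIoU averages this
    score over the classes."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0  # two empty masks agree perfectly

score = iou([1, 1, 0, 0], [1, 0, 1, 0])  # 1 overlapping pixel, 3 in the union
```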