Convolutional neural networks struggle to accurately handle changes in angle and twists in the orientation of images, which affects their ability to recognize patterns based on internal feature hierarchies. In contrast, CapsNet overcomes these limitations by vectorizing information, encoding both direction and magnitude so that spatial information is not overlooked. Therefore, this study proposes a novel expression recognition technique called CAPSULE-VGG, which combines the strengths of CapsNet and convolutional neural networks. By refining and integrating the features extracted by a convolutional neural network before introducing them into CapsNet, our model enhances facial recognition capabilities. Compared to traditional neural network models, our approach offers a faster training pace, faster convergence, and higher accuracy as training approaches stability. Experimental results demonstrate that our method achieves recognition rates of 74.14% on the FER2013 expression dataset and 99.85% on the CK+ expression dataset. By contrasting these findings with those obtained using conventional expression recognition techniques, and by incorporating CapsNet's advantages, we effectively address the issues associated with convolutional neural networks while increasing expression recognition accuracy.
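CapsNet's core idea of encoding information in a vector's length and direction can be illustrated by the squash nonlinearity from the original CapsNet formulation; a minimal numpy sketch (not CAPSULE-VGG's exact implementation):

```python
import numpy as np

def squash(s, eps=1e-8):
    """CapsNet squash: keeps the vector's direction, maps its length into (0, 1)."""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

v = squash(np.array([3.0, 4.0]))   # input length 5
print(np.linalg.norm(v))           # 25/26 ~ 0.9615: long vectors saturate toward 1
```

The squashed length acts as the capsule's existence probability while the direction preserves the pose information that plain CNN pooling discards.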
In the coal mining industry, the gangue separation phase poses a key challenge due to the high visual similarity between coal and gangue. Recently, separation methods have become more intelligent and efficient, using new technologies and applying different features for recognition. One such method exploits the difference in substance density, leading to excellent coal/gangue recognition. Therefore, this study uses density differences to distinguish coal from gangue by performing volume prediction on the samples. Each training sample consists of three side-view images as input, with volume and weight as the ground truth for classification. The prediction process relies on a convolutional neural network (CGVP-CNN) model that receives the three side-view images as input and extracts the features needed to estimate an approximation of the volume. The classification was comparatively performed via ten different classifiers, namely, K-Nearest Neighbors (KNN), Linear Support Vector Machines (Linear SVM), Radial Basis Function (RBF) SVM, Gaussian Process, Decision Tree, Random Forest, Multi-Layer Perceptron (MLP), Adaptive Boosting (AdaBoost), Naive Bayes, and Quadratic Discriminant Analysis (QDA). After several experiments on testing and training data, the results yield classification accuracies of 100%, 92%, 95%, 96%, 100%, 100%, 100%, 96%, 81%, and 92%, respectively. The tests reveal the best timing with KNN, which maintained an accuracy level of 100%. Assessing the model's generalization capability on new data is essential to ensure its efficiency, so generalization was measured with a cross-validation experiment. The dataset was partitioned based on the volume values to test the model's generalization not only on new images within the same volume range but also on volumes outside the trained range. The predicted volume values were then passed to the group of classifiers, where the reported classification accuracies were 100%, 100%, 100%, 98%, 88%, 87%, 100%, 87%, 97%, and 100%, respectively. Although obtaining a highly accurate classification is the main motive, this work also achieves a remarkable reduction in data preprocessing time compared to related works: the CGVP-CNN model reduced the preprocessing time to 0.017 s while maintaining high classification accuracy using the estimated volume value.
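The density principle the classification rests on can be sketched as a toy threshold rule, with the CGVP-CNN's volume estimate stood in for by known volumes; the 1.8 g/cm³ cut-off and the sample values are illustrative assumptions, not figures from the paper:

```python
# Each sample: (weight in g, volume in cm^3); coal is less dense than gangue.
samples = [(120.0, 90.0), (150.0, 110.0), (200.0, 80.0), (260.0, 100.0)]
DENSITY_CUTOFF = 1.8  # g/cm^3, an assumed illustrative boundary

def classify(weight, volume):
    density = weight / volume
    return "gangue" if density > DENSITY_CUTOFF else "coal"

labels = [classify(w, v) for w, v in samples]
print(labels)  # low-density pairs -> coal, high-density pairs -> gangue
```

In the actual pipeline the volume term comes from the CNN's image-based estimate, so classification quality hinges on how accurate that volume prediction is.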
Automatic modulation recognition (AMR) of radiation source signals is a research focus in the field of cognitive radio. However, AMR of radiation source signals at low SNRs still faces a great challenge. Therefore, this paper proposes an AMR method for radiation source signals based on a two-dimensional data matrix and an improved residual neural network. First, the time series of the radiation source signals are reconstructed into a two-dimensional data matrix, which greatly simplifies the signal preprocessing process. Second, a residual neural network based on depthwise convolution and large-size convolutional kernels (DLRNet) is proposed to improve the feature extraction capability of the AMR model. Finally, the model performs feature extraction and classification on the two-dimensional data matrix to obtain the recognition vector that represents the signal modulation type. Theoretical analysis and simulation results show that the proposed method significantly improves AMR accuracy, maintaining a recognition accuracy greater than 90% even at -14 dB SNR.
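The first step, reconstructing a sampled time series into a two-dimensional data matrix, is essentially a reshape; a minimal numpy sketch (the 1024-sample window and 32×32 target size are assumptions for illustration):

```python
import numpy as np

signal = np.random.randn(1024)      # stand-in for 1024 sampled signal values
matrix = signal.reshape(32, 32)     # 1D time series -> 2D data matrix for a 2D CNN
print(matrix.shape)                 # (32, 32)
```

Framing the signal this way lets standard image-style convolutional layers operate on it with no hand-crafted feature extraction in between.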
Hyperspectral image classification is a pivotal task within the field of remote sensing, yet achieving high-precision classification remains a significant challenge. In response, a Spectral Convolutional Neural Network model based on an Adaptive Fick's Law Algorithm (AFLA-SCNN) is proposed. The Adaptive Fick's Law Algorithm (AFLA) is a novel metaheuristic algorithm introduced herein, encompassing three new strategies: an adaptive weight factor, Gaussian mutation, and a probability update policy. With the adaptive weight factor, the algorithm can adjust its weights as the iteration count changes, improving performance. Gaussian mutation helps the algorithm avoid falling into local optima and improves its search capability. The probability update strategy improves the algorithm's exploitability and adaptability. Within the AFLA-SCNN model, AFLA is employed to optimize two hyperparameters of the SCNN model, "numEpochs" and "miniBatchSize", to attain their optimal values. AFLA's performance is first validated on 28 functions in 10D, 30D, and 50D for CEC2013 and 29 functions in 10D, 30D, and 50D for CEC2017. Experimental results indicate AFLA's marked superiority over nine other prominent optimization algorithms. Subsequently, the AFLA-SCNN model was compared with a Spectral Convolutional Neural Network model based on the Fick's Law Algorithm (FLA-SCNN), one based on Harris Hawks Optimization (HHO-SCNN), one based on Differential Evolution (DE-SCNN), the plain Spectral Convolutional Neural Network (SCNN) model, and a Support Vector Machine (SVM) model on the Indian Pines and Pavia University datasets. The experimental results show that the AFLA-SCNN model outperforms the other models in terms of Accuracy, Precision, Recall, and F1-score on both Indian Pines and Pavia University: its Accuracy reached 99.875% on Indian Pines and 98.022% on Pavia University. In conclusion, the proposed AFLA-SCNN model significantly enhances the precision of hyperspectral image classification.
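Of the three strategies, Gaussian mutation is the simplest to sketch: perturb a candidate solution with zero-mean noise and clip to the search bounds. The scale sigma and the [0, 1] bounds below are illustrative assumptions, not AFLA's exact settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_mutation(x, sigma=0.1, lower=0.0, upper=1.0):
    """Perturb each dimension with zero-mean Gaussian noise, then clip to bounds."""
    return np.clip(x + rng.normal(0.0, sigma, size=x.shape), lower, upper)

x = np.array([0.5, 0.9, 0.1])   # a candidate (e.g., normalized hyperparameters)
y = gaussian_mutation(x)
print(y)                        # a nearby candidate, still within [0, 1]
```

In a metaheuristic loop, such mutated candidates give the population a chance to escape a local optimum that pure exploitation would get stuck in.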
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn image features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared to the MobileNetV1 baseline.
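The parameter savings that depthwise separable convolution brings can be verified with a line of arithmetic; the 3×3 kernel and 64→128 channel sizes below are chosen for illustration:

```python
k, c_in, c_out = 3, 64, 128

standard = k * k * c_in * c_out                     # one dense 3x3 convolution
depthwise_separable = k * k * c_in + c_in * c_out   # depthwise 3x3 + pointwise 1x1

print(standard, depthwise_separable)                # 73728 vs 8768
print(round(standard / depthwise_separable, 1))     # ~8.4x fewer parameters
```

The dilated variant used in the DDSC layer keeps this same parameter count while widening the receptive field, which is where the "larger context at no extra cost" claim comes from.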
Automatic modulation classification (AMC) aims at identifying the modulation of received signals, a significant approach to target identification in military and civil applications. In this paper, a novel data-driven framework named the convolutional and transformer-based deep neural network (CTDNN) is proposed to improve classification performance. CTDNN can be divided into four modules: a convolutional neural network (CNN) backbone, a transition module, a transformer module, and a final classifier. In the CNN backbone, a wide and deep convolution structure is designed, consisting of 1×15 convolution kernels and intensive cross-layer connections instead of traditional 1×3 kernels and sequential connections. In the transition module, a 1×1 convolution layer is used to compress the channels of the preceding multi-scale CNN features. In the transformer module, three self-attention layers are designed for extracting global features and generating the classification vector. In the classifier, the final decision is made based on the maximum a posteriori probability. Extensive simulations show that the proposed CTDNN achieves superior classification performance over traditional deep models.
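The classifier's maximum-a-posteriori decision reduces, for a softmax output, to an argmax over normalized scores; a minimal numpy sketch with illustrative logits:

```python
import numpy as np

def classify(logits):
    """Softmax over class logits, then pick the most probable modulation class."""
    p = np.exp(logits - logits.max())   # subtract max for numerical stability
    p /= p.sum()
    return int(np.argmax(p)), p

logits = np.array([0.2, 2.5, -1.0, 0.7])  # illustrative scores for 4 modulations
idx, probs = classify(logits)
print(idx)  # -> 1, the class with the highest posterior probability
```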
Accurate handwriting recognition has been a challenging computer vision problem, because static feature analysis of text images is often inadequate to account for the high variance in handwriting styles across people and the poor image quality of handwritten text. Recently, by introducing machine learning, especially convolutional neural networks (CNNs), the recognition accuracy on various handwriting patterns has steadily improved. In this paper, a deep CNN model is developed to further improve the recognition rate on the MNIST handwritten digit dataset with a fast convergence rate in training. The proposed model has a multi-layer deep architecture, including three convolution-and-activation layers for feature extraction and two fully connected layers (i.e., dense layers) for classification. The model's hyperparameters, such as batch size, kernel sizes, batch normalization, activation function, and learning rate, are optimized to enhance recognition performance. The average classification accuracy of the proposed methodology reaches 99.82% on the training dataset and 99.40% on the testing dataset, making it a nearly error-free system for MNIST recognition.
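The described stack (three convolution-and-activation stages feeding two dense layers) can be sanity-checked by tracking feature-map shapes; the filter counts below are illustrative assumptions, not the paper's tuned hyperparameters:

```python
def conv2d_out(size, kernel=3, stride=1, pad=0):
    """Output spatial size of a square convolution layer."""
    return (size - kernel + 2 * pad) // stride + 1

size, channels = 28, 1            # MNIST input: 28x28 grayscale
for filters in (32, 64, 128):     # three conv+activation stages (assumed widths)
    size = conv2d_out(size)       # 3x3 kernel, stride 1, no padding
    channels = filters

flat = size * size * channels     # flattened vector entering the dense layers
print(size, channels, flat)       # 22 x 22 x 128 -> 61952 inputs to dense layer 1
```

Walking the shapes like this is a quick way to catch mismatches between the convolutional feature extractor and the classifier head before training.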
In recent years, wearable-device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, which limits them to basic features. The images captured by wearable sensors contain richer features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. First, the required signals and sensor data are accumulated from standard databases. From these signals, wave features are retrieved. The extracted wave features and sensor data are then given as input for recognizing the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by combining a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is proposed to fine-tune the network parameters and enhance the recognition process. An experimental evaluation against various deep learning networks and heuristic algorithms confirms the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with 95.36% accuracy, 95.25% recall, 95.48% specificity, and 95.47% precision. The results prove that the developed model recognizes human actions effectively while taking less time, and that it reduces computational complexity and the overfitting issue through the use of an optimization approach.
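The 1D-convolutional front end of such a hybrid extracts local patterns from raw sensor streams; a minimal numpy sketch of one channel, using a hand-picked edge-detecting kernel rather than a learned filter:

```python
import numpy as np

signal = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])  # e.g., an activity burst
kernel = np.array([1.0, -1.0])                            # responds to level changes

feature = np.convolve(signal, kernel, mode="valid")       # one 1D conv channel
print(feature)  # nonzero exactly where the activity level changes
```

A trained 1DCNN learns many such kernels; the GRU then models how the resulting feature sequence evolves over time.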
Achieving accurate classification of colorectal polyps during colonoscopy can avoid unnecessary endoscopic biopsy or resection. This study aimed to develop a deep learning model that can automatically classify colorectal polyps histologically on white-light and narrow-band imaging (NBI) colonoscopy images, based on the World Health Organization (WHO) and Workgroup serrAted polypS and Polyposis (WASP) classification criteria for colorectal polyps. White-light and NBI colonoscopy images of colorectal polyps with pathological results were first collected and classified into four categories: conventional adenoma, hyperplastic polyp, sessile serrated adenoma/polyp (SSAP), and normal. Conventional adenoma was further divided into three sub-categories (tubular, villous, and tubulovillous adenoma), after which the images were re-classified into six categories. In this paper, we propose a novel convolutional neural network termed Polyp-DedNet for the four- and six-category classification tasks of colorectal polyps. Based on the existing classification network ResNet50, Polyp-DedNet adopts dilated convolution to retain more high-dimensional spatial information and an Efficient Channel Attention (ECA) module to further improve classification performance. To eliminate the gridding artifacts caused by dilated convolutions, traditional convolutional layers were used instead of the max pooling layer, and two convolutional layers with progressively decreasing dilation were added at the end of the network. Due to the inevitable imbalance of medical image data, a regularization method (DropBlock) and a Class-Balanced (CB) loss were employed to prevent network overfitting. Furthermore, 5-fold cross-validation was adopted to estimate the performance of Polyp-DedNet on the multi-class classification of colorectal polyps. The mean accuracies of the proposed Polyp-DedNet for the four- and six-category classifications were 89.91%±0.92% and 85.13%±1.10%, respectively. Precision, recall, and F1-score also improved by 1%–2% over the ResNet50 baseline. Polyp-DedNet presents state-of-the-art performance for colorectal polyp classification on white-light and NBI colonoscopy images, highlighting its considerable potential as an AI-assisted system for accurate colorectal polyp diagnosis in colonoscopy.
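Dilated convolution enlarges the receptive field without adding parameters: a k×k kernel with dilation d spans k + (k-1)(d-1) positions. A quick check for 3×3 kernels:

```python
def effective_kernel(k, dilation):
    """Effective spatial extent of a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)

for d in (1, 2, 4):
    print(d, effective_kernel(3, d))  # a 3x3 kernel spans 3, 5, then 9 positions
```

Stacking layers with jumped dilation rates is also what causes the gridding artifacts the abstract mentions, which is why Polyp-DedNet ends with progressively decreasing dilation.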
A model that can rapidly and accurately detect coronavirus disease 2019 (COVID-19) plays a significant role in treating patients and preventing the spread of the disease. However, designing such a model that balances detection accuracy against the memory footprint of its weight parameters well enough to deploy on a mobile device is challenging. Taking this into account, this paper fuses convolutional neural network and residual learning operations to build a multi-class classification model that improves COVID-19 pneumonia detection performance while keeping a trade-off between weight parameters and accuracy. The convolutional neural network extracts COVID-19 feature information through repeated convolutional operations. The residual learning operations alleviate the gradient problems caused by stacking convolutional layers and enhance the feature extraction ability, enabling the proposed model to acquire effective feature information at low cost and thus keep its weight parameters small. Extensive validation and comparison with other COVID-19 pneumonia detection models on the well-known COVIDx dataset show that: (1) the sensitivity of COVID-19 pneumonia detection is improved from 88.2% (non-COVID-19) and 77.5% (COVID-19) to 95.3% (non-COVID-19) and 96.5% (COVID-19), respectively, and the positive predictive value is increased from 72.8% (non-COVID-19) and 89.0% (COVID-19) to 88.8% (non-COVID-19) and 95.1% (COVID-19); (2) the proposed model has 13 M weight parameters, slightly more than the 11.37 M of the COVIDNet-small network, but the corresponding accuracy improves from 85.2% to 93.0%. These results illustrate that the proposed model achieves an efficient balance between accuracy and weight parameters.
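The residual learning operation is the identity shortcut y = F(x) + x; a minimal numpy sketch with the convolution stack stubbed out by a generic transform:

```python
import numpy as np

def residual_block(x, transform):
    """y = F(x) + x: the identity shortcut of residual learning."""
    return transform(x) + x

x = np.array([1.0, 2.0, 3.0])
y = residual_block(x, lambda v: 0.1 * v)  # stub for a conv + activation stack
print(y)  # -> [1.1, 2.2, 3.3]
```

Because the shortcut passes gradients straight through, stacking many such blocks avoids the vanishing-gradient problems the abstract refers to.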
This study aims to reduce the interference of ambient noise in mobile communication and to improve the accuracy and authenticity of voice information delivered over mobile channels. First, the principles and techniques of speech enhancement are analyzed, and a fast lateral recursive least squares (FLRLS) method is adopted to process the sound data. Then, a convolutional neural network (CNN)-based noise recognition algorithm (NR-CNN) and a speech enhancement model are proposed. Finally, experiments are designed to verify the performance of the proposed algorithm and model. The experimental results show that the noise classification accuracy of the NR-CNN algorithm is higher than 99.82%, and its recall and F1 values exceed 99.92. The proposed enhancement model effectively enhances the original sound in the presence of noise interference: after the CNN is incorporated, the average perceptual-quality score across all noisy sounds improves by over 21% compared with the traditional noise reduction method. The proposed algorithm adapts to a variety of voice environments and can simultaneously perform enhancement and noise reduction on many different types of voice signals, with a processing effect better than that of traditional sound enhancement models. In addition, the sound distortion index of the proposed speech enhancement model is lower than that of the control group, indicating that adding the CNN is less likely to distort the sound signal across various sound environments, showing superior robustness. In summary, the proposed CNN-based speech enhancement model delivers significant enhancement effects, stable performance, and strong adaptability. This study provides a reference and basis for research applying neural networks to speech enhancement.
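Enhancement experiments like these are typically reported against a signal-to-noise baseline; a short numpy sketch of SNR measurement (illustrative only, not the paper's perceptual-quality metric):

```python
import numpy as np

def snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean reference."""
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

t = np.linspace(0, 1, 8000)
clean = np.sin(2 * np.pi * 440 * t)                                     # 440 Hz tone
noisy = clean + 0.1 * np.random.default_rng(0).standard_normal(t.size)  # added noise
snr = snr_db(clean, noisy)
print(round(snr, 1))  # roughly 17 dB before any enhancement
```

An enhancement stage succeeds when the SNR of its output against the clean reference rises above this input figure.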
The use of deep learning algorithms for the investigation and analysis of medical images has emerged as a powerful technique. The increase in retinal diseases is alarming, as they may lead to permanent blindness if left untreated. Automating the diagnosis of retinal diseases not only assists ophthalmologists in correct decision-making but also saves time. Several researchers have worked on automated retinal disease classification, but were restricted either to hand-crafted feature selection or to binary classification. This paper presents a deep-learning-based approach for the automated classification of multiple retinal diseases using fundus images. For this research, the data were collected and combined from three distinct sources, and the images were preprocessed to enhance their details. A six-layer convolutional neural network (CNN) is used for automated feature extraction and the classification of 20 retinal diseases. The results are observed to depend on the number of classes. For binary classification (healthy vs. unhealthy), up to 100% accuracy has been achieved. With 16 classes (treating the stages of a disease as a single class), 93.3% accuracy, 92% sensitivity, and 93% specificity were obtained. With 20 classes (treating the stages of a disease as separate classes), the accuracy, sensitivity, and specificity dropped to 92.4%, 92%, and 92%, respectively.
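Sensitivity and specificity follow directly from confusion-matrix counts; a minimal sketch for the binary (healthy vs. unhealthy) case, with illustrative counts rather than the paper's data:

```python
tp, fn, tn, fp = 90, 10, 85, 15    # illustrative counts, not the paper's results

sensitivity = tp / (tp + fn)       # recall on the unhealthy class
specificity = tn / (tn + fp)       # recall on the healthy class
accuracy = (tp + tn) / (tp + fn + tn + fp)

print(sensitivity, specificity, accuracy)  # 0.9 0.85 0.875
```

Reporting all three together, as the abstract does, matters because accuracy alone can look high even when one class is systematically missed.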
The quality of maize seeds affects the outcome of planting and harvesting, so seed quality inspection has become very important. Traditional seed quality detection methods are labor-intensive and time-consuming, whereas seed quality detection using computer vision techniques is efficient and accurate. In this study, we conducted transfer learning training on the AlexNet, VGG11, and ShuffleNetV2 network models and found, by comparing various metrics, that ShuffleNetV2 achieves high accuracy for maize seed classification and recognition. The features of the seed images were extracted through image preprocessing, and the AlexNet, VGG11, and ShuffleNetV2 models were then used for training and classification. A total of 2081 seed images covering four varieties were used for training and testing. The experimental results showed that ShuffleNetV2 could efficiently distinguish different varieties of maize seeds with a classification accuracy of up to 100%, with a model size of 20.65 MB and a response time of 0.45 s for a single image. The method therefore has high practicality and extension value.
This paper presents a handwritten document recognition system based on the convolutional neural network technique. Handwritten document recognition is rapidly attracting researchers' attention due to its promise as an assistive technology for visually impaired users; it is also helpful for automatic data entry systems. For the proposed system, a dataset of English-language handwritten character images was prepared. The system was trained on a large set of sample data and tested on sample images of user-defined handwritten documents, and multiple experiments yielded very good recognition results. The proposed system first performs image preprocessing to prepare the data for training with a convolutional neural network. After this processing, the input document is segmented into lines, words, and characters. The system achieves character segmentation accuracy of up to 86%. The segmented characters are then sent to a convolutional neural network for recognition. The recognition and segmentation techniques proposed in this paper provide highly acceptable, accurate results on the given dataset. The proposed approach reaches an accuracy of up to 93% during convolutional neural network training, with validation accuracy slightly lower at 90.42%.
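Line segmentation in such systems is commonly based on a horizontal projection profile: rows with no ink separate text lines. A minimal numpy sketch on a toy binary image (the paper's exact segmentation method is not specified, so this is illustrative):

```python
import numpy as np

img = np.zeros((9, 12), dtype=int)
img[1:3, 2:10] = 1          # first text line (ink pixels = 1)
img[5:8, 1:11] = 1          # second text line

profile = img.sum(axis=1)            # ink-pixel count per row
text_rows = profile > 0              # rows belonging to some text line
print(text_rows.astype(int))         # [0 1 1 0 0 1 1 1 0] -> two line bands
```

The same idea applied to vertical profiles within a line band yields word and character boundaries.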
Biometric security systems based on facial characteristics face a challenging task due to variability in the intrapersonal facial appearance of subjects traced to factors such as pose, illumination, expression and aging. This paper innovates as it proposes a deep learning and set-based approach to face recognition subject to aging. The images for each subject taken at various times are treated as a single set, which is then compared to sets of images belonging to other subjects. Facial features are extracted using a convolutional neural network characteristic of deep learning. Our experimental results show that set-based recognition performs better than the singleton-based approach for both face identification and face verification. We also find that by using set-based recognition, it is easier to recognize older subjects from younger ones rather than younger subjects from older ones.
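Set-based matching compares whole image sets rather than single images, for example via the minimum pairwise distance between feature vectors; a hedged numpy sketch (min-distance is one common set metric, not necessarily the paper's exact choice):

```python
import numpy as np

def set_distance(set_a, set_b):
    """Minimum pairwise Euclidean distance between two sets of feature vectors."""
    diffs = set_a[:, None, :] - set_b[None, :, :]
    return np.sqrt((diffs ** 2).sum(-1)).min()

subject1 = np.array([[0.0, 0.0], [1.0, 0.0]])   # CNN features from images over time
subject2 = np.array([[4.0, 3.0]])
d = set_distance(subject1, subject2)
print(d)  # closest pair is [1, 0] vs [4, 3]
```

Matching on the closest pair means one age-appropriate image in the gallery set can carry the match, which is why pooling images across time helps under aging.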
Magnetic Resonance Imaging (MRI) is an important diagnostic technique for early detection of brain tumors, and the classification of brain tumors from MRI images is challenging research work because of their different shapes, locations, and image intensities. For successful classification, a segmentation method is required to separate the tumor; important features are then extracted from the segmented tumor and used to classify it. In this work, an efficient multilevel segmentation method is developed that combines optimal thresholding and watershed segmentation, followed by a morphological operation, to separate the tumor. A Convolutional Neural Network (CNN) is then applied for feature extraction and, finally, a Kernel Support Vector Machine (KSVM) is utilized for the resulting classification, as justified by our experimental evaluation. Experimental results show that the proposed method effectively detects and classifies tumors as cancerous or non-cancerous with promising accuracy.
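The optimal-thresholding stage can be illustrated with Otsu's method, which picks the threshold maximizing between-class variance; a compact numpy sketch on toy intensities (an exhaustive search for clarity; real implementations work on the histogram, and the paper's exact thresholding scheme may differ):

```python
import numpy as np

def otsu_threshold(pixels):
    """Exhaustive Otsu: choose the threshold maximizing between-class variance."""
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        lo, hi = pixels[pixels < t], pixels[pixels >= t]
        if lo.size == 0 or hi.size == 0:
            continue
        w0, w1 = lo.size / pixels.size, hi.size / pixels.size
        var = w0 * w1 * (lo.mean() - hi.mean()) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Toy bimodal image: dark background plus a bright tumor-like region
pixels = np.array([10] * 50 + [11] * 50 + [200] * 20 + [210] * 20)
t = otsu_threshold(pixels)
print(t)  # lands just above the dark cluster, separating the two modes
```

The watershed and morphological steps then refine this coarse foreground mask into a clean tumor boundary.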
Many existing multi-view gait recognition methods based on images or video sequences superimpose and synthesize gait sequences into energy-like templates. However, information may be lost during image compositing and EMG signal capture, and factors such as period detection can introduce errors and affect recognition accuracy. To better address these problems, a multi-view gait recognition method using a deep convolutional neural network and a channel attention mechanism is proposed. First, a sliding time window is used to capture the EMG signals. Then, the back-propagation learning algorithm is used to train each convolutional layer, improving the learning ability of the convolutional neural network. Finally, the channel attention mechanism is integrated into the neural network to improve its ability to express gait features, and a classifier is used to classify the gaits. Experimental results on two public datasets, OULP and CASIA-B, show that the recognition rate of the proposed method reaches 88.44% and 97.25%, respectively. Comparative experiments show that the proposed method achieves better recognition than several newer convolutional neural network methods. Therefore, the combination of a convolutional neural network and a channel attention mechanism is of great value for gait recognition.
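The channel attention mechanism reweights feature channels using globally pooled statistics, in the style of squeeze-and-excitation; a hedged numpy sketch with the learned excitation layers reduced to a plain sigmoid gate (the paper's exact attention form may differ):

```python
import numpy as np

def channel_attention(features):
    """features: (channels, time). Scale each channel by a gate from its global mean."""
    squeeze = features.mean(axis=1)        # global average pool per channel
    gate = 1.0 / (1.0 + np.exp(-squeeze))  # sigmoid excitation (no learned FC here)
    return features * gate[:, None]

x = np.array([[1.0, 3.0], [-2.0, -2.0]])   # two channels of gait/EMG features
out = channel_attention(x)
print(out.shape)  # (2, 2); the weakly informative channel gets down-weighted
```

Letting the network amplify discriminative channels and suppress noisy ones is what "improving the ability to express gait features" amounts to here.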
In recent years, deep convolutional neural networks have exhibited excellent performance in computer vision, with far-reaching impact. Traditional plant taxonomic identification requires high expertise and is time-consuming, and most nature reserves have problems such as incomplete species surveys, inaccurate taxonomic identification, and outdated status data. Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology and exploring the best network model. Taking 24 typical desert plant species widely distributed in the nature reserves of the Xinjiang Uygur Autonomous Region of China as the research objects, this study established an image database and selected the optimal network model for image recognition of desert plant species, to provide decision support for fine-grained management of the Xinjiang nature reserves, such as species investigation and monitoring, using deep learning. Since desert plant species are not included in public datasets, the images used in this study were mainly obtained through field photography and downloads from the Plant Photo Bank of China (PPBC). After sorting and statistical analysis, a total of 2331 plant images were collected (2071 from field collection and 260 from the PPBC), covering 24 plant species belonging to 14 families and 22 genera. A large number of numerical experiments were carried out to compare, from different perspectives, a series of 37 well-performing convolutional neural network models, in order to find the optimal model for image recognition of desert plant species in Xinjiang. The results revealed 24 models with a recognition accuracy greater than 70.000%, among which RegNetX_8GF performed best, with Accuracy, Precision, Recall, and F1 (the harmonic mean of the Precision and Recall values) of 78.33%, 77.65%, 69.55%, and 71.26%, respectively. Considering hardware requirements and inference time, MobileNetV2 achieves the best balance among accuracy, the number of parameters, and the number of floating-point operations: its parameter count is 1/16 that of RegNetX_8GF, and its floating-point operation count is 1/24. Our findings can facilitate efficient decision-making for species survey, cataloging, inspection, and monitoring in the Xinjiang nature reserves, providing a scientific basis for the protection and utilization of natural plant resources.
The problem of domestic refuse is becoming more and more serious with the use of all kinds of equipment in medical institutions, and this matter has aroused people's attention. Traditional manual waste classification is subjective and inaccurate; moreover, the working environment of sorting is poor and the efficiency is low. Therefore, automated and effective sorting is needed. Given the current development of deep learning, it can serve as a good auxiliary for classification and realize automatic classification. In this paper, a ResNet-50 convolutional neural network based on transfer learning is applied to design an image classifier that achieves high-accuracy domestic refuse classification. By comparing the method designed in this paper with a back-propagation neural network and a plain convolutional neural network, it is concluded that the transfer-learning-based CNN applied here achieves a higher accuracy rate and a lower false detection rate. Further, when data samples are scarce, transfer learning with a pretrained ResNet-50 model is effective in improving image classification accuracy.
Even though several advances have been made in recent years, handwritten script recognition is still a challenging task in the pattern recognition domain. The field has gained much interest lately due to its diverse application potential. Nowadays, different methods are available for automatic script recognition, and among most reported techniques, deep neural networks have achieved impressive results and outperformed classical machine learning algorithms. However, designing such networks from scratch incurs a significant amount of trial and error, which renders the process unfeasible: it often requires manual intervention with domain expertise and consumes substantial time and computational resources. To alleviate this shortcoming, this paper proposes a new neural architecture search approach based on meta-heuristic quantum particle swarm optimization (QPSO), which is capable of automatically evolving meaningful convolutional neural network (CNN) topologies. Computational experiments were conducted on eight datasets belonging to three popular Indic scripts, namely Bangla, Devanagari, and Dogri, consisting of handwritten characters and digits. Empirically, the results imply that the proposed QPSO-CNN algorithm outperforms classical and state-of-the-art methods with faster prediction and higher accuracy.
Funding: supported by the following funds: the Key Scientific Research Project of Anhui Provincial Research Preparation Plan in 2023 (Nos. 2023AH051806, 2023AH052097, 2023AH052103); the Anhui Province Quality Engineering Project (Nos. 2022sx099, 2022cxtd097); University-Level Teaching and Research Key Projects (Nos. ch21jxyj01, XLZ-202208, XLZ-202106); and the Special Support Plan for Innovation and Entrepreneurship Leaders in Anhui Province.
Abstract: Convolutional neural networks struggle to accurately handle changes in angle and twists in the orientation of images, which affects their ability to recognize patterns based on internal feature levels. In contrast, CapsNet overcomes these limitations by vectorizing information through added directionality and magnitude, ensuring that spatial information is not overlooked. Therefore, this study proposes a novel expression recognition technique called CAPSULE-VGG, which combines the strengths of CapsNet and convolutional neural networks. By refining and integrating features extracted by a convolutional neural network before introducing them into CapsNet, our model enhances facial recognition capabilities. Compared to traditional neural network models, our approach offers a faster training pace, improved convergence speed, and higher accuracy rates approaching stability. Experimental results demonstrate that our method achieves recognition rates of 74.14% on the FER2013 expression dataset and 99.85% on the CK+ expression dataset. By contrasting these findings with those obtained using conventional expression recognition techniques, and by incorporating CapsNet's advantages, we effectively address issues associated with convolutional neural networks while increasing expression identification accuracy.
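CapsNet's vectorized representation hinges on the squash nonlinearity, which rescales a capsule's output vector so its direction (the encoded pose) is preserved while its length falls in [0, 1) and can act as a presence probability. A minimal NumPy sketch (the toy vectors are illustrative):

```python
import numpy as np

def squash(v, eps=1e-8):
    """CapsNet squash nonlinearity: preserves each vector's direction,
    maps its length into [0, 1) so length can encode probability."""
    sq_norm = np.sum(v * v, axis=-1, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * v / np.sqrt(sq_norm + eps)

caps = np.array([[3.0, 4.0],      # long vector -> length close to 1
                 [0.01, 0.0]])    # short vector -> length close to 0
out = squash(caps)
print(np.linalg.norm(out, axis=-1))  # lengths stay below 1
```

This is what lets a capsule keep spatial (directional) information that a scalar activation would discard.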
Funding: the National Natural Science Foundation of China under Grant No. 52274159 received by E. Hu (https://www.nsfc.gov.cn/); Grant No. 52374165 received by E. Hu (https://www.nsfc.gov.cn/); and the China National Coal Group Key Technology Project, Grant No. 20221CY001, received by Z. Guan and E. Hu (https://www.chinacoal.com/).
Abstract: In the coal mining industry, the gangue separation phase poses a key challenge due to the high visual similarity between coal and gangue. Recently, separation methods have become more intelligent and efficient, using new technologies and applying different features for recognition. One such method exploits the difference in substance density, leading to excellent coal/gangue recognition. Therefore, this study uses density differences to distinguish coal from gangue by performing volume prediction on the samples. Each training sample comprises three side images as input, with volume and weight as the ground truth for classification. The prediction process relies on a convolutional neural network (CGVP-CNN) model that receives the three side images and extracts the features needed to estimate an approximation of the volume. Classification was comparatively performed via ten different classifiers, namely K-Nearest Neighbors (KNN), Linear Support Vector Machine (Linear SVM), Radial Basis Function (RBF) SVM, Gaussian Process, Decision Tree, Random Forest, Multi-Layer Perceptron (MLP), Adaptive Boosting (AdaBoost), Naive Bayes, and Quadratic Discriminant Analysis (QDA). After several experiments on testing and training data, the results yield classification accuracies of 100%, 92%, 95%, 96%, 100%, 100%, 100%, 96%, 81%, and 92%, respectively. The tests reveal the best timing with KNN, which maintained an accuracy level of 100%. Assessing the model's generalization capability to new data is essential to ensure its efficiency, so generalization was measured with a cross-validation experiment: the dataset was split by volume values to test generalization not only on new images of the same volume but also on volumes outside the trained range. The predicted volume values were then passed to the classifier group, where the reported classification accuracies were 100%, 100%, 100%, 98%, 88%, 87%, 100%, 87%, 97%, and 100%, respectively. Although obtaining a highly accurate classification is the main motive, this work also achieves a remarkable reduction in data preprocessing time compared to related works: the CGVP-CNN model reduced the preprocessing time of previous works to 0.017 s while maintaining high classification accuracy using the estimated volume value.
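The decision logic downstream of the volume prediction reduces to a density computation: coal is less dense than gangue, so once weight is known and volume is estimated, a density cut separates the two. The threshold below is an illustrative value, not one reported by the paper:

```python
def classify_by_density(weight_g, predicted_volume_cm3, threshold=1.8):
    """Coal is less dense than gangue; a density threshold (illustrative
    value, g/cm^3) separates the two once volume has been estimated."""
    density = weight_g / predicted_volume_cm3
    return "coal" if density < threshold else "gangue"

print(classify_by_density(130.0, 100.0))  # density 1.3 -> coal
print(classify_by_density(250.0, 100.0))  # density 2.5 -> gangue
```

In the paper this decision is learned by the ten classifiers rather than hand-set, but the underlying separable quantity is the same weight-to-volume ratio.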
Funding: the National Natural Science Foundation of China under Grant No. 61973037; the China Postdoctoral Science Foundation under Grant No. 2022M720419.
Abstract: Automatic modulation recognition (AMR) of radiation source signals is a research focus in the field of cognitive radio. However, AMR of radiation source signals at low SNRs still faces a great challenge. Therefore, an AMR method for radiation source signals based on a two-dimensional data matrix and an improved residual neural network is proposed in this paper. First, the time series of the radiation source signals are reconstructed into a two-dimensional data matrix, which greatly simplifies signal preprocessing. Second, a residual neural network based on depthwise convolution and large convolutional kernels (DLRNet) is proposed to improve the feature extraction capability of the AMR model. Finally, the model performs feature extraction and classification on the two-dimensional data matrix to obtain the recognition vector that represents the signal modulation type. Theoretical analysis and simulation results show that the proposed approach significantly improves recognition accuracy, which remains above 90% even at -14 dB SNR.
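The first step, reconstructing the sampled time series into a two-dimensional data matrix, is a plain reshape, which is why it simplifies preprocessing compared with period detection or image compositing. The 32×32 shape below is an assumption, not necessarily the dimensions used in the paper:

```python
import numpy as np

# Fold a 1-D radiation-source signal of N samples into a 2-D matrix that
# a 2-D CNN can consume directly (row-major fold; 32x32 is illustrative).
n_samples = 1024
signal = np.arange(n_samples, dtype=np.float32)  # stand-in sample values
matrix = signal.reshape(32, 32)
print(matrix.shape)  # (32, 32)
```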
Funding: the Natural Science Foundation of Shandong Province, China (Grant No. ZR202111230202).
Abstract: Hyperspectral image classification is a pivotal task within remote sensing, yet achieving high-precision classification remains a significant challenge. In response, a Spectral Convolutional Neural Network model based on an Adaptive Fick's Law Algorithm (AFLA-SCNN) is proposed. The Adaptive Fick's Law Algorithm (AFLA) is a novel metaheuristic algorithm introduced herein, encompassing three new strategies: an adaptive weight factor, Gaussian mutation, and a probability update policy. With the adaptive weight factor, the algorithm adjusts its weights according to the iteration count to improve performance; Gaussian mutation helps the algorithm avoid local optima and improves its searchability; and the probability update strategy improves the algorithm's exploitability and adaptability. Within the AFLA-SCNN model, AFLA is employed to optimize two hyperparameters of the SCNN model, "numEpochs" and "miniBatchSize", to attain their optimal values. AFLA's performance is first validated on 28 functions in 10D, 30D, and 50D for CEC2013 and 29 functions in 10D, 30D, and 50D for CEC2017, and the experimental results indicate AFLA's marked superiority over nine other prominent optimization algorithms. Subsequently, the AFLA-SCNN model was compared with SCNN models tuned by the original Fick's Law Algorithm (FLA-SCNN), Harris Hawks Optimization (HHO-SCNN), and Differential Evolution (DE-SCNN), as well as the plain SCNN model and a Support Vector Machine (SVM) model, on the Indian Pines and Pavia University datasets. The experimental results show that the AFLA-SCNN model outperforms the other models in Accuracy, Precision, Recall, and F1-score on both datasets, with an Accuracy of 99.875% on Indian Pines and 98.022% on Pavia University. In conclusion, the proposed AFLA-SCNN model significantly enhances the precision of hyperspectral image classification.
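AFLA's role in the pipeline is generic hyperparameter optimization: propose (numEpochs, miniBatchSize) pairs, score them with an objective, and keep the best. Since AFLA's update rules are not reproduced here, the sketch below stands in a plain stochastic search over the same two hyperparameters, with an artificial objective whose optimum is known:

```python
import random

def tune(objective, bounds, iters=200, seed=0):
    """Generic stochastic hyperparameter search (a stand-in for AFLA,
    whose adaptive-weight / mutation rules are not reproduced here)."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(iters):
        # Sample one integer setting per hyperparameter within bounds.
        x = {k: rng.randint(lo, hi) for k, (lo, hi) in bounds.items()}
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

# Toy objective with a known optimum at numEpochs=40, miniBatchSize=64;
# in the paper the objective would be SCNN validation error instead.
obj = lambda x: (x["numEpochs"] - 40) ** 2 + (x["miniBatchSize"] - 64) ** 2
best, _ = tune(obj, {"numEpochs": (10, 100), "miniBatchSize": (16, 256)})
print(best)
```

Swapping this loop for AFLA changes how candidate settings are proposed, not the overall tune-then-train structure.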
Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational cost. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion, based on the MobileNetV1 network, is proposed. The network consists of two main subnetworks. The first uses a depthwise dilated separable convolution (DDSC) layer to learn image features with fewer parameters, which results in a lightweight and computationally inexpensive network; the depthwise dilated convolution in the DDSC layer also effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that processes the input feature map through parallel multi-resolution branches to extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance relative to the MobileNetV1 baseline.
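The parameter savings from depthwise separable convolution, the building block that MobileNetV1 and the DDSC layer rely on, can be checked with simple arithmetic: a standard k×k convolution couples every input channel to every output channel, while the separable version splits this into a per-channel k×k filter plus a 1×1 pointwise mix. The channel counts below are illustrative, not the paper's:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k per input channel, then pointwise 1 x 1 mixing."""
    return k * k * c_in + c_in * c_out

std = conv_params(64, 128, 3)                 # 73728
dds = depthwise_separable_params(64, 128, 3)  # 576 + 8192 = 8768
print(std, dds, round(std / dds, 1))          # roughly 8.4x fewer weights
```

Dilating the depthwise filter, as DDSC does, leaves these counts unchanged while enlarging the receptive field.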
Funding: supported in part by the National Natural Science Foundation of China under Grants 62171045 and 62201090, and in part by the National Key Research and Development Program of China under Grants 2020YFB1807602 and 2019YFB1804404.
Abstract: Automatic modulation classification (AMC) aims at identifying the modulation of received signals, which is a significant approach to identifying targets in military and civil applications. In this paper, a novel data-driven framework named the convolutional and transformer-based deep neural network (CTDNN) is proposed to improve classification performance. CTDNN consists of four modules: a convolutional neural network (CNN) backbone, a transition module, a transformer module, and a final classifier. In the CNN backbone, a wide and deep convolution structure is designed, consisting of 1×15 convolution kernels and intensive cross-layer connections instead of the traditional 1×3 kernels and sequential connections. In the transition module, a 1×1 convolution layer compresses the channels of the preceding multi-scale CNN features. In the transformer module, three self-attention layers extract global features and generate the classification vector. In the classifier, the final decision is made by maximum a posteriori probability. Extensive simulations show that the proposed CTDNN achieves superior classification performance over traditional deep models.
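The transition module's 1×1 convolution is just a per-position linear map over channels, which is why it can compress multi-scale CNN features cheaply before the transformer. The channel and sequence-length values below are illustrative, not CTDNN's actual sizes:

```python
import numpy as np

# A 1x1 convolution over a (channels, time) feature map is equivalent to
# multiplying by a (c_out, c_in) weight matrix at every time position.
c_in, c_out, t = 64, 16, 128
rng = np.random.default_rng(1)
features = rng.standard_normal((c_in, t))      # multi-scale CNN output
w = rng.standard_normal((c_out, c_in))         # 1x1 conv weights
compressed = w @ features                      # channels: 64 -> 16
print(compressed.shape)  # (16, 128)
```

Each time step keeps its position; only the channel dimension is mixed and reduced, so the transformer that follows sees a shorter channel axis at the same temporal resolution.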
Abstract: Accurate handwriting recognition has been a challenging computer vision problem, because static feature analysis of text images is often inadequate to account for the high variance in handwriting styles across people and the poor image quality of handwritten text. Recently, by introducing machine learning, especially convolutional neural networks (CNNs), the recognition accuracy for various handwriting patterns has steadily improved. In this paper, a deep CNN model is developed to further improve the recognition rate on the MNIST handwritten digit dataset with a fast-converging training process. The proposed model has a multi-layer deep structure, including 3 convolution-and-activation layers for feature extraction and 2 fully connected layers (i.e., dense layers) for classification. The model's hyperparameters, such as batch size, kernel sizes, batch normalization, activation function, and learning rate, are optimized to enhance recognition performance. The average classification accuracy of the proposed methodology reaches 99.82% on the training dataset and 99.40% on the testing dataset, making it a nearly error-free system for MNIST recognition.
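The spatial bookkeeping behind a stack of convolution and pooling layers follows one formula, out = (in + 2·pad - kernel) // stride + 1. A sketch chaining it through an illustrative three-stage MNIST pipeline (the kernel and pooling sizes are assumptions, not the paper's tuned values):

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

# 28x28 MNIST input through three 3x3 convs, each followed by 2x2
# pooling with stride 2 (layer sizes are illustrative assumptions).
s = 28
for _ in range(3):
    s = conv_out(s, 3)       # 3x3 conv: 28->26, 13->11, 5->3
    s = conv_out(s, 2, 2)    # 2x2 pool:  26->13, 11->5,  3->1
print(s)  # 1
```

The flattened output of the last stage is what the two dense layers would then classify.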
Abstract: In recent years, Human Activity Recognition (HAR) models based on wearable devices have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, leading to the extraction of only basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions challenging, and unimodal HAR approaches are not suitable in real-time environments. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. First, the required signals and sensor data are collected from standard databases, and wave features are extracted from the signals. The extracted wave features and sensor data are then given as input for recognizing human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by incorporating a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is proposed to fine-tune the network parameters and enhance recognition. Experimental evaluation against various deep learning networks and heuristic algorithms confirms the effectiveness of the proposed HAR model: the EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36%, recall of 95.25%, specificity of 95.48%, and precision of 95.47%. The results prove that the developed model recognizes human actions effectively while taking less time, and that the optimization approach reduces computational complexity and overfitting.
Funding: funded by the Research Fund for Foundation of Hebei University (DXK201914); the President of Hebei University (XZJJ201914); the Post-graduate's Innovation Fund Project of Hebei University (HBU2022SS003); the Special Project for Cultivating College Students' Scientific and Technological Innovation Ability in Hebei Province (22E50041D); the Guangdong Basic and Applied Basic Research Foundation (2021A1515011654); and the Fundamental Research Funds for the Central Universities of China (20720210117).
Abstract: Accurate classification of colorectal polyps during colonoscopy can avoid unnecessary endoscopic biopsy or resection. This study aimed to develop a deep learning model that automatically classifies colorectal polyps histologically on white-light and narrow-band imaging (NBI) colonoscopy images, based on the World Health Organization (WHO) and Workgroup serrAted polypS and Polyposis (WASP) classification criteria for colorectal polyps. White-light and NBI colonoscopy images of colorectal polyps with pathological results were first collected and classified into four categories: conventional adenoma, hyperplastic polyp, sessile serrated adenoma/polyp (SSAP), and normal. Conventional adenoma was further divided into three sub-categories (tubular, villous, and villous-tubular adenoma), after which the images were re-classified into six categories. In this paper, we propose a novel convolutional neural network termed Polyp-DedNet for the four- and six-category classification tasks of colorectal polyps. Based on the existing ResNet50 classification network, Polyp-DedNet adopts dilated convolution to retain more high-dimensional spatial information and an Efficient Channel Attention (ECA) module to further improve classification performance. To eliminate gridding artifacts caused by dilated convolutions, traditional convolutional layers are used instead of the max pooling layer, and two convolutional layers with progressively decreasing dilation are added at the end of the network. Because medical image data are inevitably imbalanced, the DropBlock regularization method and a Class-Balanced (CB) Loss are used to prevent network overfitting. Furthermore, 5-fold cross-validation was adopted to estimate the performance of Polyp-DedNet on the multi-class classification of colorectal polyps. Mean accuracies for the four- and six-category classifications were 89.91% ± 0.92% and 85.13% ± 1.10%, respectively, and precision, recall, and F1-score improved by 1%–2% over the ResNet50 baseline. The proposed Polyp-DedNet presents state-of-the-art performance for colorectal polyp classification on white-light and NBI colonoscopy images, highlighting its considerable potential as an AI-assisted system for accurate colorectal polyp diagnosis in colonoscopy.
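Dilated convolution, which Polyp-DedNet uses to retain spatial information, widens a kernel's receptive field without adding weights: the effective kernel size is k + (k-1)(d-1) for dilation d. A quick check with a 3×3 kernel (the dilation values are illustrative):

```python
def effective_kernel(k, dilation):
    """Effective receptive field of a k x k kernel with a given dilation:
    the weights stay the same; gaps between taps widen the view."""
    return k + (k - 1) * (dilation - 1)

for d in (1, 2, 4):
    print(d, effective_kernel(3, d))  # 3, 5, 9
```

Progressively decreasing the dilation at the end of the network, as the paper does, shrinks these gaps again, which is what suppresses the gridding artifact.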
Funding: This work was supported in part by the science and technology research project of the Henan Provincial Department of Science and Technology (No. 222102110366); the Science and Technology Innovation Team of Henan University (No. 22IRTSTHN016); and grants from the teaching reform research and practice project of higher education in Henan Province in 2021 (2021SJGLX502).
Abstract: A model that can rapidly and accurately detect coronavirus disease 2019 (COVID-19) plays a significant role in treating the disease and preventing its transmission. However, designing a model that balances detection accuracy against memory footprint well enough to deploy on a mobile device is challenging. Taking this into account, this paper fuses convolutional neural network and residual learning operations to build a multi-class classification model that improves COVID-19 pneumonia detection performance while keeping a trade-off between weight parameters and accuracy. The convolutional neural network extracts COVID-19 feature information through repeated convolutional operations. The residual learning operations alleviate the gradient problems caused by stacking convolutional layers and enhance feature extraction, enabling the proposed model to acquire effective feature information at low cost and thus keep its weight parameters small. Extensive validation and comparison with other COVID-19 pneumonia detection models on the well-known COVIDx dataset show that (1) the sensitivity of COVID-19 pneumonia detection is improved from 88.2% (non-COVID-19) and 77.5% (COVID-19) to 95.3% (non-COVID-19) and 96.5% (COVID-19), respectively, while the positive predictive value is increased from 72.8% (non-COVID-19) and 89.0% (COVID-19) to 88.8% (non-COVID-19) and 95.1% (COVID-19); and (2) the proposed model has 13 M weight parameters, slightly more than the 11.37 M of the COVIDNet-small network, but its accuracy is improved from 85.2% to 93.0%. These results illustrate that the proposed model achieves an efficient balance between accuracy and weight parameters.
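The residual learning operation the model builds on can be stated in two lines: the block learns a residual transform f and adds the identity back, which is what eases gradient flow through stacked layers. A minimal sketch with a stand-in transform (in the real network f would be a pair of convolutions):

```python
import numpy as np

def residual_block(x, f):
    """Residual learning: the block learns f(x) and adds the identity,
    so the gradient always has a direct path through the skip."""
    return x + f(x)

x = np.array([1.0, 2.0, 3.0])
out = residual_block(x, lambda v: 0.1 * v)  # f is a stand-in transform
print(out)  # [1.1 2.2 3.3]
```

Because the skip path is weight-free, stacking such blocks adds depth (and feature-extraction capacity) without the vanishing-gradient penalty of plain stacked convolutions.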
Funding: supported by the General Project of Philosophy and Social Science Research in Colleges and Universities in Jiangsu Province (2022SJYB0712); the Research Development Fund for Young Teachers of Chengxian College of Southeast University (z0037); and the Special Project of Ideological and Political Education Reform and Research Course (yjgsz2206).
Abstract: This study aims to reduce the interference of ambient noise in mobile communication, improve the accuracy and authenticity of information transmitted by sound, and guarantee the accuracy of voice information delivered by mobile communication. First, the principles and techniques of speech enhancement are analyzed, and a fast lateral recursive least squares (FLRLS) method is adopted to process the sound data. Then, a convolutional neural network (CNN)-based noise recognition algorithm (NR-CNN) and a speech enhancement model are proposed. Finally, experiments are designed to verify the performance of the proposed algorithm and model. The experimental results show that the noise classification accuracy of the NR-CNN algorithm is higher than 99.82%, and the recall rate and F1 value are both higher than 99.92. The proposed sound enhancement model can effectively enhance the original sound under noise interference: after the CNN is incorporated, the average perceptual quality evaluation score over all noisy sounds improves by over 21% compared with the traditional noise reduction method. The proposed algorithm adapts to a variety of voice environments and can simultaneously enhance and denoise many different types of voice signals, with better processing results than traditional sound enhancement models. In addition, the sound distortion index of the proposed speech enhancement model is lower than that of the control group, indicating that adding the CNN is less likely to cause sound signal distortion in various sound environments and shows superior robustness. In summary, the proposed CNN-based speech enhancement model shows significant enhancement effects, stable performance, and strong adaptability, providing a reference and basis for research applying neural networks to speech enhancement.
Abstract: The use of deep learning algorithms for the investigation and analysis of medical images has emerged as a powerful technique. The increase in retinal diseases is alarming, as they may lead to permanent blindness if left untreated. Automating the diagnosis of retinal diseases not only assists ophthalmologists in correct decision-making but also saves time. Several researchers have worked on automated retinal disease classification, but most are restricted either to hand-crafted feature selection or to binary classification. This paper presents a deep learning-based approach for the automated classification of multiple retinal diseases using fundus images. The data were collected and combined from three distinct sources, and the images were preprocessed to enhance detail. A six-layer convolutional neural network (CNN) is used for automated feature extraction and classification of 20 retinal diseases. The results depend on the number of classes: for binary classification (healthy vs. unhealthy), up to 100% accuracy has been achieved; with 16 classes (treating the stages of a disease as a single class), 93.3% accuracy, 92% sensitivity, and 93% specificity have been obtained; and with 20 classes (treating the stages of a disease as separate classes), the accuracy, sensitivity, and specificity drop to 92.4%, 92%, and 92%, respectively.
Abstract: The quality of maize seeds affects the outcomes of planting and harvesting, so seed quality inspection has become very important. Traditional seed quality detection methods are labor-intensive and time-consuming, whereas seed quality detection using computer vision techniques is efficient and accurate. In this study, we conducted transfer learning training on the AlexNet, VGG11, and ShuffleNetV2 network models, and by comparing various metrics found that ShuffleNetV2 achieves high accuracy for maize seed classification and recognition. Features of the seed images were extracted through image preprocessing, and the AlexNet, VGG11, and ShuffleNetV2 models were then used for training and classification. A total of 2081 seed images containing four varieties were used for training and testing. The experimental results showed that ShuffleNetV2 can efficiently distinguish different varieties of maize seeds, with the highest classification accuracy of 100%, a model parameter size of 20.65 MB, and a response time of 0.45 s for a single image. The method therefore has high practicality and extension value.
Abstract: This paper presents a handwritten document recognition system based on the convolutional neural network technique. Handwritten document recognition is rapidly attracting researchers' attention due to its promise as an assistive technology for visually impaired users; it is also helpful for automatic data entry systems. For the proposed system, a dataset of English-language handwritten character images was prepared; the system was trained on a large set of sample data and tested on sample images of user-defined handwritten documents, and multiple experiments yielded very worthy recognition results. The proposed system first performs image preprocessing to prepare data for training with a convolutional neural network. After this processing, the input document is segmented by line, word, and character; the system achieves an accuracy of up to 86% during character segmentation. The segmented characters are then sent to a convolutional neural network for recognition. The recognition and segmentation techniques proposed in this paper provide highly acceptable, accurate results on the given dataset: accuracy during convolutional neural network training reaches up to 93%, decreasing slightly to 90.42% on validation.
Abstract: Biometric security systems based on facial characteristics face a challenging task due to variability in the intrapersonal facial appearance of subjects, traced to factors such as pose, illumination, expression, and aging. This paper proposes a novel deep learning, set-based approach to face recognition subject to aging. The images of each subject taken at various times are treated as a single set, which is then compared to the sets of images belonging to other subjects. Facial features are extracted using a convolutional neural network characteristic of deep learning. Our experimental results show that set-based recognition performs better than the singleton-based approach for both face identification and face verification. We also find that with set-based recognition, it is easier to recognize older subjects from younger ones than younger subjects from older ones.
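One simple way to realize set-to-set comparison is the minimum pairwise distance between the feature vectors of two image sets. The sketch below uses 2-D toy features and Euclidean distance, both illustrative choices rather than the paper's exact CNN features or metric:

```python
import numpy as np

def set_distance(set_a, set_b):
    """Set-to-set distance: minimum pairwise Euclidean distance between
    the feature vectors of two image sets (one simple choice of many)."""
    diffs = set_a[:, None, :] - set_b[None, :, :]
    return np.sqrt((diffs ** 2).sum(-1)).min()

subject1 = np.array([[0.0, 0.0], [1.0, 0.0]])  # features over the years
subject2 = np.array([[5.0, 5.0], [6.0, 5.0]])
probe = np.array([[0.9, 0.1]])                 # new image(s) to identify
# Identify the probe as whichever enrolled set it is closest to.
scores = [set_distance(probe, s) for s in (subject1, subject2)]
print(int(np.argmin(scores)))  # 0 -> subject1
```

Because the probe only needs to be close to any one member of a set, age-induced drift in some of a subject's images does not dominate the match.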
Abstract: Magnetic Resonance Imaging (MRI) is an important diagnostic technique for the early detection of brain tumors, and the classification of brain tumors from MRI images is a challenging research task because of their varying shapes, locations, and image intensities. For successful classification, a segmentation method is required to separate the tumor; important features are then extracted from the segmented tumor and used for classification. In this work, an efficient multilevel segmentation method is developed by combining optimal thresholding and the watershed segmentation technique, followed by a morphological operation to separate the tumor. A Convolutional Neural Network (CNN) is then applied for feature extraction, and finally a Kernel Support Vector Machine (KSVM) is used for the resulting classification, as justified by our experimental evaluation. Experimental results show that the proposed method effectively detects and classifies tumors as cancerous or non-cancerous with promising accuracy.
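The optimal-thresholding step can be illustrated with Otsu's method, a standard choice that picks the gray level maximizing between-class variance; the paper does not specify its exact thresholding criterion, so this is a representative sketch on a toy bimodal intensity distribution:

```python
import numpy as np

def otsu_threshold(pixels):
    """Optimal global threshold by maximizing between-class variance,
    computed over the 0..255 intensity histogram."""
    hist = np.bincount(pixels, minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability
    mu = np.cumsum(p * np.arange(256))      # cumulative intensity mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    return int(np.nanargmax(sigma_b))       # ignore 0/0 at the edges

# Toy "image": dark background at intensities 20 and 40, bright region
# (a stand-in for tumor tissue) at 200.
img = np.array([20] * 45 + [40] * 45 + [200] * 10)
t = otsu_threshold(img)
print(t)  # 40: dark modes at or below it, bright mode above
```

Pixels above the threshold form the candidate region that watershed segmentation and the morphological operation would then refine.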
Funding: This work was supported by the Natural Science Foundation of China (No. 61902133); the Fujian Natural Science Foundation (No. 2018J05106); and the Xiamen Collaborative Innovation Project of Industry-University-Research (3502Z20173046).
Funding: This work was supported by the West Light Foundation of the Chinese Academy of Sciences (2019-XBQNXZ-A-007) and the National Natural Science Foundation of China (12071458, 71731009).
Abstract: In recent years, deep convolutional neural networks have exhibited excellent performance in computer vision and have had a far-reaching impact. Traditional plant taxonomic identification requires high expertise and is time-consuming. Most nature reserves have problems such as incomplete species surveys, inaccurate taxonomic identification, and untimely updating of status data. Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology to explore the best network model. Taking 24 typical desert plant species that are widely distributed in the nature reserves of Xinjiang Uygur Autonomous Region, China as the research objects, this study established an image database and selected the optimal network model for the image recognition of desert plant species, in order to provide decision support, via deep learning, for fine management tasks in the nature reserves of Xinjiang such as species investigation and monitoring. Since desert plant species are not included in public datasets, the images used in this study were mainly obtained through field shooting and downloaded from the Plant Photo Bank of China (PPBC). After sorting and statistical analysis, a total of 2331 plant images were collected (2071 from field collection and 260 from the PPBC), covering 24 plant species belonging to 14 families and 22 genera. A large number of numerical experiments were also carried out to compare, from different perspectives, a series of 37 well-performing convolutional neural network models, to find the optimal network model most suitable for the image recognition of desert plant species in Xinjiang. The results revealed 24 models with a recognition Accuracy greater than 70.000%. Among them, Residual Network X_8GF (RegNetX_8GF) performs the best, with Accuracy, Precision, Recall, and F1 (the harmonic mean of the Precision and Recall values) of 78.33%, 77.65%, 69.55%, and 71.26%, respectively. Considering hardware requirements and inference time, Mobile Network V2 (MobileNetV2) achieves the best balance among Accuracy, the number of parameters, and the number of floating-point operations: its number of parameters is 1/16 that of RegNetX_8GF, and its number of floating-point operations is 1/24. Our findings can facilitate efficient decision-making for the management of species survey, cataloging, inspection, and monitoring in the nature reserves of Xinjiang, providing a scientific basis for the protection and utilization of natural plant resources.
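The Precision, Recall, and F1 metrics reported above can be computed per class from confusion counts; a minimal sketch with hypothetical counts (in a multi-class setting such as the 24 species here, the reported scores would typically be macro-averages of these per-class values):

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class Precision, Recall, and their harmonic mean F1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical confusion counts for one plant class
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=30)
# Macro scores would average the per-class values over all 24 species
```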
Funding: This work was supported in part by the National Natural Science Foundation of China under Grants 61806028, 61672437, and 61702428; by the Sichuan Science and Technology Program under Grants 21ZDYF2484, 2021YFN0104, 21GJHZ0061, 21ZDYF3629, 21ZDYF2907, 21ZDYF0418, 21YYJC1827, 21ZDYF3537, and 2019YJ0356; and by the Chinese Scholarship Council under Grants 202008510036 and 201908515022.
Abstract: The problem of domestic refuse is becoming more and more serious with the use of all kinds of equipment in medical institutions, and this matter has aroused people's attention. Traditional manual waste classification is subjective and inaccurate; moreover, the working environment for sorting is poor and the efficiency is low. Therefore, automated and effective sorting is needed. Given the current development of deep learning, it can play a good auxiliary role in classification and realize automatic classification. In this paper, a ResNet-50 convolutional neural network based on the transfer learning method is applied to design an image classifier that achieves domestic refuse classification with high accuracy. By comparing the method designed in this paper with a back-propagation neural network and a plain convolutional neural network, it is concluded that the transfer-learning-based CNN applied in this paper has a higher accuracy rate and a lower false detection rate. Further, when data samples are scarce, the method combining transfer learning with a ResNet-50 training model is effective in improving the accuracy of image classification.
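The core transfer learning idea above is to freeze a pretrained backbone and train only a new classification head on its features. A minimal sketch of the head-training step, with random features standing in for frozen ResNet-50 embeddings (an assumption — the actual backbone, data, and optimizer are those of the paper):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(features, labels, n_classes, lr=0.5, epochs=200):
    """Train only a new linear head; the backbone features stay frozen."""
    n, d = features.shape
    w = np.zeros((d, n_classes))              # the only trainable weights
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        probs = softmax(features @ w)
        w -= lr * features.T @ (probs - onehot) / n   # cross-entropy gradient
    return w

rng = np.random.default_rng(2)
# Stand-in for frozen ResNet-50 embeddings of two refuse categories
feats = np.vstack([rng.normal(0.5, 1, (50, 32)), rng.normal(-0.5, 1, (50, 32))])
labels = np.array([0] * 50 + [1] * 50)
w = train_head(feats, labels, n_classes=2)
acc = (softmax(feats @ w).argmax(axis=1) == labels).mean()
```

Training only the small head is why transfer learning remains effective when data samples are scarce: far fewer parameters must be fit to the new task.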
Abstract: Even though several advances have been made in recent years, handwritten script recognition is still a challenging task in the pattern recognition domain. This field has gained much interest lately due to its diverse application potential. Nowadays, different methods are available for automatic script recognition. Among most of the reported script recognition techniques, deep neural networks have achieved impressive results and outperformed classical machine learning algorithms. However, the process of designing such networks from scratch intuitively appears to incur a significant amount of trial and error, which renders it unfeasible. This approach often requires manual intervention with domain expertise, which consumes substantial time and computational resources. To alleviate this shortcoming, this paper proposes a new neural architecture search approach based on meta-heuristic quantum particle swarm optimization (QPSO), which is capable of automatically evolving meaningful convolutional neural network (CNN) topologies. Computational experiments have been conducted on eight different datasets belonging to three popular Indic scripts, namely Bangla, Devanagari, and Dogri, consisting of handwritten characters and digits. Empirically, the results imply that the proposed QPSO-CNN algorithm outperforms classical and state-of-the-art methods with faster prediction and higher accuracy.
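A minimal sketch of the quantum-behaved PSO update rule underlying the search above, shown on a toy continuous objective (an assumption — the paper applies QPSO to an encoded CNN topology, and the contraction coefficient `beta` and toy sphere function here are illustrative choices, not the paper's settings):

```python
import numpy as np

def qpso_minimize(f, dim, n_particles=20, iters=100, beta=0.75, seed=3):
    """Quantum-behaved PSO: positions collapse stochastically around attractors."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))
    pbest = x.copy()
    pbest_val = np.apply_along_axis(f, 1, x)
    for _ in range(iters):
        gbest = pbest[pbest_val.argmin()]
        mbest = pbest.mean(axis=0)               # mean of personal bests
        phi = rng.random((n_particles, dim))
        p = phi * pbest + (1 - phi) * gbest      # local attractor per particle
        u = rng.random((n_particles, dim))
        sign = np.where(rng.random((n_particles, dim)) < 0.5, -1.0, 1.0)
        # QPSO position update: sample around the attractor at a scale set by mbest
        x = p + sign * beta * np.abs(mbest - x) * np.log(1.0 / u)
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
    return pbest[pbest_val.argmin()], pbest_val.min()

best_x, best_val = qpso_minimize(lambda v: float((v ** 2).sum()), dim=5)
```

In the neural architecture search setting, each particle position would encode a candidate CNN topology and `f` would be its validation error after short training, rather than a closed-form function.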