The use of privacy-enhanced facial recognition has increased in response to growing concerns about data security and privacy in the digital age. This trend is spurred by rising demand for face recognition technology in a variety of industries, including access control, law enforcement, surveillance, and internet communication. However, the growing use of face recognition technology has raised serious concerns about data monitoring and user privacy preferences, especially in context-aware systems. In response, this study presents a novel framework that integrates Generative Adversarial Networks (GANs), Blockchain, and distributed computing to address privacy concerns while maintaining accurate face recognition. The framework is designed to balance precise face recognition against the protection of personal data integrity in an increasingly interconnected environment. Using Dlib for face analysis, Ray Cluster for distributed computing, and Blockchain for decentralized identity verification, the proposed system provides scalable and secure facial analysis while protecting user privacy. The study’s contributions include a sustainable and scalable solution for privacy-aware face recognition, flexible privacy computing approaches based on Blockchain networks, and a demonstration of higher performance than previous methods. Specifically, the proposed StyleGAN model achieves an accuracy of 93.84% on high-resolution images from the CelebA-HQ dataset, outperforming the other evaluated models: Progressive GAN (90.27%), CycleGAN (89.80%), and MGAN (80.80%). With improvements in accuracy, speed, and privacy protection, the framework shows promise for practical use in a variety of fields that need face recognition technology. This study paves the way for future research in privacy-enhanced face recognition systems, emphasizing the importance of using current technology to address rising privacy issues in digital identity.
Adversarial attacks pose significant security concerns for intelligent systems such as speaker recognition systems (SRSs). Most attacks assume that the neural networks in the target system are known beforehand, whereas black-box attacks operate without such information and therefore better match practical situations. Existing black-box attacks improve transferability by integrating multiple models or training on multiple datasets, but these methods are costly. Motivated by an optimisation strategy that uses spatial information on the perturbed paths and samples, we propose a Dual Spatial Momentum Iterative Fast Gradient Sign Method (DS-MI-FGSM) to improve the transferability of black-box attacks against SRSs. Specifically, DS-MI-FGSM needs only a single data sample and one model as input; by extending to the neighbouring spaces of the data and the model, it generates adversarial examples against the integrated models. To reduce the risk of overfitting, DS-MI-FGSM also introduces gradient masking to improve transferability. The authors conduct extensive experiments on the speaker recognition task, and the results demonstrate the effectiveness of their method, which achieves an attack success rate of up to 92% on the victim model in black-box scenarios with only one known model.
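The momentum iterative FGSM update named in the method's title can be illustrated with a short sketch. This is plain MI-FGSM on a toy linear loss, not the dual-spatial variant described in the abstract; the names `mi_fgsm` and `grad_fn`, the step count, and the toy weights are assumptions made for illustration.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def mi_fgsm(x, grad_fn, eps=0.1, steps=10, mu=1.0):
    """Plain MI-FGSM: accumulate L1-normalised gradients with momentum mu,
    step along the sign of the accumulated gradient, and keep the result
    inside an L-infinity ball of radius eps around the clean input x."""
    alpha = eps / steps                 # per-step size
    momentum = [0.0] * len(x)
    adv = list(x)
    for _ in range(steps):
        grad = grad_fn(adv)
        l1 = sum(abs(g) for g in grad) or 1.0   # avoid division by zero
        momentum = [mu * m + g / l1 for m, g in zip(momentum, grad)]
        adv = [a + alpha * sign(m) for a, m in zip(adv, momentum)]
        # Project back into the eps-ball around the clean input.
        adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(adv, x)]
    return adv


# Toy surrogate (assumption): loss(x) = w . x, so the gradient is simply w.
w = [0.5, -2.0, 0.0, 1.5]
clean = [0.2, 0.4, 0.6, 0.8]
adversarial = mi_fgsm(clean, lambda v: w)
```

In a real attack, `grad_fn` would return the gradient of the surrogate speaker model's loss with respect to the input audio; the momentum term is what stabilises the update direction across iterations.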
The development of scientific inquiry and research has yielded numerous benefits for intelligent traffic control systems, particularly for automatic license plate recognition. License plate recognition algorithms are now commonly designed around neural networks, and demand for vehicle surveillance is growing because of the need for efficient vehicle processing and traffic management. The design, development, and implementation of a license plate recognition system therefore hold significant social, economic, and academic importance. This study presents contemporary methodologies and empirical findings on automated license plate recognition. The algorithm focuses on three stages: image extraction, character segmentation, and recognition. Based on our observations, character segmentation is the most challenging of these stages. The license plate recognition project that we designed demonstrated the effectiveness of this method across the observed conditions, particularly in low-light environments and in inclement weather with precipitation. The method was tested on a sample of fifty images and achieved a 100% accuracy rate. The findings of this study demonstrate the project’s ability to effectively determine the optimal outcomes of simulations.
A challenge that visually impaired persons face in their day-to-day lives is interpreting text from documents. To help them, the objective of this work is to develop an efficient text recognition system that isolates, extracts, and recognizes text in documents with a textured background, degraded colors, and poor quality, and then synthesizes the text into speech. The system consists of three algorithms: a text localization and detection algorithm based on the mathematical morphology method (MMM); a text extraction algorithm based on the gamma correction method (GCM); and an optical character recognition (OCR) algorithm for text recognition. A detailed complexity study of the different blocks of this text recognition system has been carried out. Following this study, an accelerated version of the GCM algorithm (AGCM) is proposed. The AGCM algorithm reduces the complexity of the text recognition system by 70% while keeping the same recognition quality as the original method. To assist visually impaired persons, a graphical interface for the entire text recognition chain has been developed, allowing the capture of images from a camera, rapid and intuitive visualization of the text recognized from the image, and text-to-speech synthesis. Our text recognition system provides an improvement of 6.8% in recognition rate and 7.6% in F-measure relative to the GCM and AGCM algorithms.
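The gamma correction underlying the GCM step can be illustrated in a few lines. This is a minimal sketch of the basic power-law transform on 8-bit gray levels only; it is not the authors' GCM or AGCM text-extraction algorithm, and the function name `gamma_correct` and the sample values are assumptions.

```python
def gamma_correct(levels, gamma):
    """Apply out = 255 * (in / 255) ** gamma to a list of 8-bit gray levels.
    gamma > 1 darkens mid-tones, gamma < 1 brightens them."""
    return [round(255 * (v / 255) ** gamma) for v in levels]


# Hypothetical gray levels from one image row.
row = [0, 128, 255]
darker = gamma_correct(row, 2.0)
unchanged = gamma_correct(row, 1.0)
```

Black and white are fixed points of the transform, so only the mid-tones move; this is why varying gamma can increase the contrast between text and a textured background.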
The Electron Cyclotron Resonance (ECR) ion source is a critical device for producing highly charged ion beams in various applications. Analyzing the charge-state distribution of the ion beams is essential, but manual analysis is labor-intensive and prone to inaccuracies due to impurity ions. An automatic spectrum recognition system based on intelligent algorithms is proposed for rapid and accurate charge-state analysis of ECR ion sources. The system employs an adaptive window-length Savitzky-Golay (SG) filtering algorithm, an improved automatic multiscale peak detection (AMPD) algorithm, and a greedy matching algorithm based on relative distance to accurately match the peaks in the spectra with the corresponding charge-state ion species. Additionally, a user-friendly operator interface was developed for ease of use. Extensive testing on the online ECR ion source platform demonstrates that the system achieves high accuracy, with an average root mean square error of less than 0.1 A for identifying charge-state spectra of ECR ion sources. Moreover, the system minimizes the standard deviation of the first-order derivative of the smoothed signal to 81.1846 A. These results indicate that the designed system can identify ion beam spectra with mass numbers up to and including that of Xe. The proposed automatic spectrum recognition system represents a significant advancement in ECR ion source analysis, offering a rapid and accurate approach to charge-state analysis while enhancing supply efficiency. The performance and successful implementation of the proposed system on multiple ECR ion source platforms at IMPCAS highlight its potential for widespread adoption in ECR ion source research and applications.
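The smoothing-then-peak-finding idea can be sketched with a fixed-window filter. The example below uses the standard 5-point quadratic Savitzky-Golay coefficients and a naive local-maximum search; it is an illustration only, not the adaptive window-length filter or the improved AMPD algorithm of the abstract, and the function names and sample spectrum are assumptions.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing weights (sum = 35).
SG5 = (-3, 12, 17, 12, -3)


def savgol5(y):
    """Smooth y with a fixed 5-point Savitzky-Golay window.
    The two samples at each end are left unsmoothed for simplicity."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG5)) / 35.0
    return out


def local_maxima(y):
    """Indices of samples strictly above the left and not below the right neighbour."""
    return [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] >= y[i + 1]]


# Hypothetical spectrum with a single spike at index 3.
spectrum = [0, 0, 0, 35, 0, 0, 0]
smoothed = savgol5(spectrum)
peaks = local_maxima(smoothed)
```

A useful property of this filter is that it reproduces any quadratic exactly at interior points, so it smooths noise without flattening parabolic peak tops as much as a plain moving average would.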
A deep fusion model is proposed for a facial expression-based human-computer interaction system. First, image preprocessing extracts the facial region from the input image. More discriminative and distinctive deep learning features are then extracted from the facial regions. To prevent overfitting, in-depth features of the facial images are extracted and assigned to the proposed convolutional neural network (CNN) models, and several CNN models are trained. Finally, the outputs of the CNN models are fused to obtain the final decision among the seven basic classes of facial expressions: fear, disgust, anger, surprise, sadness, happiness, and neutral. For the experiments, three benchmark datasets are used: SFEW, CK+, and KDEF. The performance of the proposed system is compared with state-of-the-art methods on each dataset. Extensive performance analysis reveals that the proposed system outperforms the competing methods on various performance metrics. Finally, the proposed deep fusion model is used to control a music player based on the recognized emotions of the users.
Speech signals play an essential role in communication and provide an efficient way to exchange information between humans and machines. Speech Emotion Recognition (SER) is one of the critical sources for human evaluation and is applicable in many real-world settings such as healthcare, call centers, robotics, safety, and virtual reality. This work develops a novel emotion recognition system that uses a spatial-temporal convolution network on speech signals to recognize the speaker’s emotional state. The authors designed a Temporal Convolutional Network (TCN) core block to capture long-term dependencies in speech signals and then feed these temporal cues to a dense network that fuses the spatial features and captures global information for the final classification. The proposed network automatically extracts valid sequential cues from speech signals and performs better than state-of-the-art (SOTA) methods and traditional machine learning algorithms. The final unweighted accuracies of 80.84% and 92.31% on the interactive emotional dyadic motion capture (IEMOCAP) dataset and the Berlin emotional dataset (EMO-DB), respectively, indicate the robustness and efficiency of the designed model.
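How a TCN captures long-term dependencies can be illustrated with the dilated causal convolution that such networks are typically built from. This is a generic sketch, not the authors' core block; the function name, kernel weights, and input signal are assumptions.

```python
def dilated_causal_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation].
    Samples before t = 0 are treated as zero, so the output at time t
    never depends on future inputs (causality)."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


# Hypothetical 1-D feature sequence and all-ones kernels of size 2.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
layer1 = dilated_causal_conv(signal, [1.0, 1.0], dilation=1)
layer2 = dilated_causal_conv(layer1, [1.0, 1.0], dilation=2)
```

Stacking layers with growing dilation widens the receptive field quickly: after these two layers each output already depends on four consecutive input samples, which is the mechanism that lets deeper stacks cover long stretches of speech.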
Hand Gesture Recognition (HGR) is a promising research area with an extensive range of applications, such as surgery, video games, and sign language translation, where sign language is a complex, structured form of hand gestures. The fundamental building blocks of structured expressions in sign language are the arrangement of the fingers, the orientation of the hand, and the position of the hand relative to the body. The importance of HGR has grown with the increasing number of touchless applications and the rapid growth of the hearing-impaired population. Real-time HGR is therefore one of the most effective interaction methods between computers and humans, and developing a user-free interface with good recognition performance should be the goal of real-time HGR systems. Convolutional Neural Networks (CNNs) now achieve high recognition rates on image-level classification tasks. However, it is challenging to train deep CNNs such as VGG-16, VGG-19, Inception-v3, and EfficientNet-B0 from scratch, because only a few sizable labeled image datasets are available for static hand gesture images. To address this, an efficient and robust sign language hand gesture recognition system employing fine-tuned Inception-v3 and EfficientNet-B0 networks is proposed to identify hand gestures using a comparatively small HGR dataset. Experiments show that Inception-v3 achieved 90% accuracy with precision, recall, and F1-score of 0.93, 0.91, and 0.90, respectively, while EfficientNet-B0 achieved 99% accuracy with precision, recall, and F1-score of 0.98, 0.97, and 0.98, respectively.
Activity recognition of indoor occupants using indirect sensing with less privacy violation is an active research topic. This paper proposes a CO₂ sensor-based indoor occupant activity monitoring system. Using an IoT sensor node that contains CO₂ sensors, the CO₂ concentrations measured in three locations (laboratory, office, and bedroom) were stored on a cloud server for up to 35 days starting July 1, 2023. The CO₂ measurements, stored at 30-second intervals, were statistically processed to produce a heat-mapped display of the hourly average or maximum CO₂ concentration. From the heatmap visualizations of CO₂ concentration, the proposed system estimated three occupant activities: meetings, heating water on a portable stove, and sleep.
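The statistical processing step, turning 30-second samples into hourly averages and maxima for a heatmap, can be sketched as follows. The function name `hourly_stats` and the synthetic readings are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_stats(samples):
    """Group (timestamp, ppm) samples by (date, hour) and return
    {(date, hour): (average ppm, maximum ppm)}, i.e. one heatmap cell per hour."""
    buckets = defaultdict(list)
    for timestamp, ppm in samples:
        buckets[(timestamp.date(), timestamp.hour)].append(ppm)
    return {key: (sum(v) / len(v), max(v)) for key, v in buckets.items()}


# Synthetic data (not measurements): two hours of 30-second samples,
# 400 ppm during the first hour and 800 ppm during the second.
start = datetime(2023, 7, 1, 0, 0, 0)
samples = [
    (start + timedelta(seconds=30 * i), 400 if i < 120 else 800)
    for i in range(240)
]
stats = hourly_stats(samples)
```

Each dictionary entry corresponds to one cell of a day-by-hour heatmap; plotting the average highlights sustained occupancy, while the maximum highlights short events.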
Speech recognition is a hot topic in the field of artificial intelligence. Generally, speech recognition models can only run on large servers or dedicated chips. This paper presents a keyword speech recognition system based on a neural network and a conventional STM32 chip. To address the limited Flash and ROM resources on the STM32 MCU, the deployment of the speech recognition model is optimized to meet the requirements of keyword recognition. First, the audio obtained through sensors undergoes MFCC (Mel Frequency Cepstral Coefficient) feature extraction, and the extracted MFCC features are input into a CNN (Convolutional Neural Network) for deep feature extraction. The features are then passed to a fully connected layer, and finally the speech keyword is classified and predicted. Deployed on the STM32F429, the prediction model achieves an accuracy of 90.58%, a decrease of less than 1% compared with the accuracy of 91.49% obtained on a computer.
To address the weakness of threshold-based recognition systems for short-range targets, a dual-channel AND-gate recognition system is proposed, based on the correlation of target signals and the randomness of noise signals. Through state analysis and derivation of the state transition probabilities, both the reliability and the early-burst probability of the system are obtained theoretically.
Perceptual auditory filter banks such as the Bark-scale filter bank are widely used for front-end processing in speech recognition systems. However, the design of optimized filter banks that provide higher accuracy in recognition tasks is still an open problem. Building on spectral analysis in feature extraction, an adaptive bands filter bank (ABFB) is presented. The design adopts flexible bandwidths and center frequencies for the frequency responses of the filters and uses a genetic algorithm (GA) to optimize the design parameters. The optimization is realized by combining the front-end filter bank with the back-end recognition network in the performance evaluation loop. Deploying ABFB together with the zero-crossing peak amplitude (ZCPA) feature as a front end for a radial basis function (RBF) system shows a significant improvement in robustness compared with the Bark-scale filter bank. In ABFB, several sub-bands are still concentrated toward lower frequencies, but their exact locations are determined by performance rather than by perceptual criteria. For ease of optimization, only symmetrical bands are considered here, which still provide satisfactory results.
This paper presents a new pattern recognition system for Chinese spirit identification using an e-nose based on polymer quartz piezoelectric crystal sensors. The sensors are designed on the quartz crystal microbalance (QCM) principle and capture different vibration frequency signal values for Chinese spirit identification. For each sensor in an 8-channel sensor array, seven characteristic values of the original vibration frequency signal are first extracted: average value (A), root-mean-square value (RMS), shape factor (S_f), crest factor (C_f), impulse factor (I_f), clearance factor (CL_f), and kurtosis factor (K_v). The dimension of the characteristic values is then reduced by principal component analysis (PCA). Finally, a back propagation (BP) neural network is used to recognize the Chinese spirits. The experimental results show that the recognition rate for six kinds of Chinese spirits is 93.33% and that the proposed pattern recognition system can identify Chinese spirits effectively.
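The seven characteristic values can be computed per sensor channel as in the sketch below. It uses the definitions common in vibration analysis, which are an assumption here since the abstract does not give the formulas; the function name `signal_features` and the sample signal are also illustrative.

```python
import math


def signal_features(x):
    """Seven time-domain features of a non-constant signal x,
    using common vibration-analysis definitions (assumed, not from the paper)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    sqrt_mean = sum(math.sqrt(abs(v)) for v in x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    return {
        "A": mean,                           # average value
        "RMS": rms,                          # root-mean-square value
        "S_f": rms / abs_mean,               # shape factor
        "C_f": peak / rms,                   # crest factor
        "I_f": peak / abs_mean,              # impulse factor
        "CL_f": peak / sqrt_mean ** 2,       # clearance factor
        "K_v": sum((v - mean) ** 4 for v in x) / n / variance ** 2,  # kurtosis
    }


# Hypothetical zero-mean square wave: every ratio feature equals 1.
features = signal_features([1.0, -1.0, 1.0, -1.0])
```

With eight channels this gives a 56-dimensional feature vector per sample, which is the kind of input a PCA step would then reduce before classification.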
We present a new pattern recognition system based on moving average and linear discriminant analysis (LDA), which processes the original signal of the polymer quartz piezoelectric crystal air-sensitive sensor system we designed, called the new e-nose. Using the new e-nose, we obtain template data for Chinese spirits via the new pattern recognition system. To verify the effectiveness of the system, we selected three kinds of Chinese spirits for testing. Our results confirm that the new pattern recognition system can identify and distinguish between these Chinese spirits.
The COVID-19 pandemic poses an additional serious public health threat due to little or no pre-existing human immunity, and developing a system to identify COVID-19 in its early stages will save millions of lives. This study applied support vector machine (SVM), k-nearest neighbor (K-NN), and deep learning convolutional neural network (CNN) algorithms to classify and detect COVID-19 using chest X-ray radiographs. To test the proposed system, chest X-ray radiographs and CT images were collected from different standard databases, comprising 95 normal images, 140 COVID-19 images, and 10 SARS images. Two scenarios were considered in developing a system for predicting COVID-19. In the first scenario, a Gaussian filter was applied to remove noise from the chest X-ray radiographs, and the adaptive region growing technique was then used to segment the region of interest. After segmentation, a hybrid feature extraction composed of 2D-DWT and the gray level co-occurrence matrix was used to extract the features significant for detecting COVID-19. These features were processed using SVM and K-NN. In the second scenario, a CNN transfer model (ResNet 50) was used to detect COVID-19. The system was evaluated through multiclass statistical analysis, which yielded values of 97.14%, 99.34%, 99.26%, 99.26%, and 99.40% for accuracy, specificity, sensitivity, recall, and AUC, respectively. Thus, the CNN model showed significant success, achieving high accuracy, effectiveness, and robustness in detecting COVID-19.
The license plate recognition system (LPRS) has been widely adopted in daily life due to its efficiency and high accuracy. Deep neural networks are commonly used in the LPRS to improve recognition accuracy. However, researchers have found that deep neural networks have their own security problems that may lead to unexpected results. Specifically, they can be easily attacked by adversarial examples, which are generated by adding small perturbations to the original images and result in incorrect license plate recognition. There are classic methods for generating adversarial examples, but they cannot be applied to the LPRS directly. In this paper, we modify some classic methods to generate adversarial examples that can mislead the LPRS. We conduct extensive evaluations on the HyperLPR system, and the results show that the system can be easily attacked by such adversarial examples. In addition, we show that the generated images can also attack black-box systems, with examples in which the Baidu LPR system makes incorrect recognitions. We hope this paper helps improve the LPRS by raising awareness of such adversarial attacks.
To improve the resource allocation mechanism of the artificial immune recognition system (AIRS) and reduce the number of memory cells, an AIRS based on fuzzy logic resource allocation and memory cell pruning (FPAIRS) is proposed. In FPAIRS, the fuzzy logic is determined by a single parameter, so the optimal fuzzy logic for different problems can be found by changing the parameter value. At the same time, memory cells with low fitness scores are pruned to improve the classifier. The classifier was compared with other classifiers on six UCI datasets. The results show that the accuracies reached by FPAIRS are higher than or comparable to those of the other classifiers, and that FPAIRS uses fewer memory cells than AIRS, indicating that the algorithm is a high-performance classifier.
This study describes the development of a simple biometric facial recognition system, BFMT, which is designed for identifying individuals within a given population. The system is based on digital signatures derived from facial images of human subjects. The results of the study demonstrate that a particular set of facial features from a simple two-dimensional image can yield a unique digital signature that can be used to identify a subject from a limited population within a controlled environment. The simplicity of the model on which the system is based could lead to commercial facial recognition systems that are more cost-effective to develop than those currently on the market.
Designing accurate and time-efficient real-time traffic sign recognition systems is a crucial part of developing the intelligent vehicle, which is the main agent in the intelligent transportation system. Traffic sign recognition systems consist of an initial detection phase, in which images and colors are segmented, followed by a recognition phase. In terms of time consumption, the detection phase is the most demanding process in such systems. Previous studies, which proposed different methods for detecting traffic signs, trade off accuracy against computation time. Therefore, this paper presents a novel, accurate, and time-efficient color segmentation approach based on logistic regression. We used the RGB color space as the domain from which to extract the features of our hypothesis; this boosts the speed of our approach since no color conversion is needed. Our trained segmentation classifier was tested on 1000 traffic sign images taken in different lighting conditions. The results show that our approach segmented 974 of these images correctly, in less than one-fifth of the time needed by any other robust segmentation method.
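Pixel-level color segmentation with logistic regression on raw RGB values can be sketched as below. This is an illustrative batch gradient-descent classifier on a handful of made-up pixels, not the authors' trained model or feature design; all names, hyperparameters, and pixel values are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def scale(pixel):
    """Map 8-bit RGB values to [0, 1]; no color-space conversion is performed."""
    return [c / 255.0 for c in pixel]


def train(pixels, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights and bias by batch gradient descent."""
    w, b = [0.0, 0.0, 0.0], 0.0
    xs = [scale(p) for p in pixels]
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def is_sign_color(pixel, w, b):
    """True if the pixel is classified as sign-colored (probability >= 0.5)."""
    x = scale(pixel)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5


# Made-up training pixels: label 1 = red sign paint, label 0 = background.
reds = [(200, 20, 30), (220, 40, 35), (180, 10, 20), (240, 60, 50)]
others = [(30, 200, 40), (40, 40, 220), (120, 120, 120), (230, 230, 230)]
w, b = train(reds + others, [1] * len(reds) + [0] * len(others))
mask = [is_sign_color(p, w, b) for p in [(210, 30, 40), (50, 180, 60)]]
```

Once trained, classifying a pixel costs three multiplications, an addition, and a threshold, which is consistent with the speed argument made for working directly in RGB.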
In forest variety registration, visual traits of plant appearance are widely used to distinguish tree species. A new leaf-image recognition system based on a neural network is established to manage a hierarchical list of leaf images; edge detection is performed to identify the individual tokens of each image, and the outline of the leaf is obtained to differentiate tree species. An approach based on a back-propagation neural network is proposed, and its implementation in Java is also given. Numerical simulation results show that the proposed leaf recognition strategy is effective and feasible.
Funding: The Major Key Project of PCL, Grant/Award Number: PCL2022A03; National Natural Science Foundation of China, Grant/Award Numbers: 61976064, 62372137; Zhejiang Provincial Natural Science Foundation of China, Grant/Award Number: LZ22F020007.
Funding: This work was funded by the Deanship of Scientific Research at Jouf University under Grant Number (DSR2022-RG-0114).
Abstract: A challenge that visually impaired persons face in their day-to-day lives is interpreting text in documents. To help them, this work develops an efficient text recognition system that isolates, extracts, and recognizes text in documents with a textured background, degraded colors, or poor quality, and synthesizes it into speech. The system consists of three algorithms: a text localization and detection algorithm based on the mathematical morphology method (MMM); a text extraction algorithm based on the gamma correction method (GCM); and an optical character recognition (OCR) algorithm for text recognition. A detailed complexity study of the different blocks of the system was carried out, and following this study an accelerated version of the GCM algorithm (AGCM) is proposed. The AGCM algorithm reduces the complexity of the text recognition system by 70% while keeping the same recognition quality as the original method. To assist visually impaired persons, a graphical interface for the entire text recognition chain has been developed, allowing the capture of images from a camera, rapid and intuitive visualization of the text recognized in the image, and text-to-speech synthesis. Our text recognition system improves the recognition rate by 6.8% and the F-measure by 7.6% relative to the GCM and AGCM algorithms.
Abstract: The Electron Cyclotron Resonance (ECR) ion source is a critical device for producing highly charged ion beams in various applications. Analyzing the charge-state distribution of the ion beams is essential, but manual analysis is labor-intensive and prone to inaccuracies caused by impurity ions. An automatic spectrum recognition system based on intelligent algorithms is proposed for rapid and accurate charge-state analysis of ECR ion sources. The system employs an adaptive window-length Savitzky-Golay (SG) filtering algorithm, an improved automatic multiscale peak detection (AMPD) algorithm, and a greedy matching algorithm based on relative distance to match the peaks in the spectra with the corresponding charge-state ion species. Additionally, a user-friendly operator interface was developed for ease of use. Extensive testing on the online ECR ion source platform demonstrates that the system achieves high accuracy, with an average root mean square error of less than 0.1 A for identifying charge-state spectra of ECR ion sources. Moreover, the system minimizes the standard deviation of the first-order derivative of the smoothed signal to 81.1846 A. These results indicate that the system can identify ion beam spectra with mass numbers up to and including that of Xe. The proposed automatic spectrum recognition system represents a significant advancement in ECR ion source analysis, offering a rapid and accurate approach to charge-state analysis while enhancing supply efficiency. Its performance and successful implementation on multiple ECR ion source platforms at IMPCAS highlight its potential for widespread adoption in ECR ion source research and applications.
Funding: This work was supported by the Researchers Supporting Project (No. RSP-2021/395), King Saud University, Riyadh, Saudi Arabia.
Abstract: A deep fusion model is proposed for a facial expression-based human-computer interaction system. First, image preprocessing extracts the facial region from the input image. More discriminative and distinctive deep learning features are then extracted from the facial regions. To prevent overfitting, in-depth features of the facial images are extracted and assigned to the proposed convolutional neural network (CNN) models, and various CNN models are trained. Finally, the outputs of the CNN models are fused to obtain the final decision over the seven basic classes of facial expression: fear, disgust, anger, surprise, sadness, happiness, and neutral. Three benchmark datasets, SFEW, CK+, and KDEF, are used for the experiments. The performance of the proposed system is compared with state-of-the-art methods on each dataset. Extensive performance analysis reveals that the proposed system outperforms the competing methods on various performance metrics. Finally, the proposed deep fusion model is used to control a music player based on the recognized emotions of the users.
Abstract: Speech signals play an essential role in communication and provide an efficient way to exchange information between humans and machines. Speech Emotion Recognition (SER) is one of the critical sources for human evaluation and is applicable in many real-world settings such as healthcare, call centers, robotics, safety, and virtual reality. This work develops a novel TCN-based emotion recognition system that uses a spatial-temporal convolution network on speech signals to recognize the speaker's emotional state. The authors design a Temporal Convolutional Network (TCN) core block to capture long-term dependencies in speech signals and then feed these temporal cues to a dense network that fuses the spatial features and recognizes global information for final classification. The proposed network automatically extracts valid sequential cues from speech signals and performs better than state-of-the-art (SOTA) and traditional machine learning algorithms. Results show a high recognition rate compared with SOTA methods. The final unweighted accuracies of 80.84% and 92.31% on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset and the Berlin emotional dataset (EMO-DB), respectively, indicate the robustness and efficiency of the designed model.
Funding: This research work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2022R1A2C1004657).
Abstract: Hand Gesture Recognition (HGR) is a promising research area with an extensive range of applications, such as surgery, video game techniques, and sign language translation, where sign language is a complicated structured form of hand gestures. The fundamental building blocks of structured expressions in sign language are the arrangement of the fingers, the orientation of the hand, and the hand's position relative to the body. The importance of HGR has increased with the growing number of touchless applications and the rapid growth of the hearing-impaired population. Real-time HGR is therefore one of the most effective interaction methods between computers and humans, and developing a user-free interface with good recognition performance should be the goal of real-time HGR systems. Convolutional Neural Networks (CNNs) now achieve high recognition rates on many image-level classification tasks. However, it is challenging to train deep CNNs such as VGG-16, VGG-19, Inception-v3, and EfficientNet-B0 from scratch, because few large labeled image datasets are available for static hand gesture images. To address this, an efficient and robust sign language hand gesture recognition system employing fine-tuned Inception-v3 and EfficientNet-B0 networks is proposed to identify hand gestures using a comparatively small HGR dataset. Experiments show that Inception-v3 achieved 90% accuracy with a precision, recall, and F1-score of 0.93, 0.91, and 0.90, respectively, while EfficientNet-B0 achieved 99% accuracy with a precision, recall, and F1-score of 0.98, 0.97, and 0.98, respectively.
Abstract: Recognizing the activity of indoor occupants through indirect sensing, with less privacy violation, is an active research topic. This paper proposes a CO<sub>2</sub> sensor-based indoor occupant activity monitoring system. Using an IoT sensor node that contains CO<sub>2</sub> sensors, the CO<sub>2</sub> concentrations measured in three locations (laboratory, office, and bedroom) were stored in a cloud server for up to 35 days starting July 1, 2023. The CO<sub>2</sub> measurements, stored at 30-second intervals, were statistically processed to produce a heat-mapped display of the hourly average or maximum CO<sub>2</sub> concentration. From the heatmap visualizations of CO<sub>2</sub> concentration, the proposed system recognized occupant activities such as meetings, heating water on a portable stove, and sleep.
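The hourly aggregation step described in the abstract above can be sketched as follows. This is a minimal sketch assuming the readings arrive as `(timestamp, ppm)` pairs; the function name `hourly_stats`, the data layout, and the synthetic readings are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def hourly_stats(
    samples: Iterable[Tuple[datetime, float]],
) -> Dict[Tuple[date, int], Tuple[float, float]]:
    """Group CO2 readings by (calendar day, hour of day) and return the
    (average, maximum) concentration for each bucket. Each bucket maps
    to one cell of a day-by-hour heatmap."""
    buckets: Dict[Tuple[date, int], List[float]] = defaultdict(list)
    for ts, ppm in samples:
        buckets[(ts.date(), ts.hour)].append(ppm)
    return {key: (sum(v) / len(v), max(v)) for key, v in buckets.items()}


# Synthetic example: two hours of readings at 30-second intervals.
start = datetime(2023, 7, 1, 0, 0, 0)
readings = [
    (start + timedelta(seconds=30 * i), 400.0 if i < 120 else 800.0)
    for i in range(240)
]
stats = hourly_stats(readings)
```

Choosing between the average and the maximum changes what the heatmap emphasises: the average reflects sustained occupancy, while the maximum highlights short events.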
Abstract: Speech recognition is a hot topic in the field of artificial intelligence. Generally, speech recognition models can run only on large servers or dedicated chips. This paper presents a keyword speech recognition system based on a neural network and a conventional STM32 chip. To address the limited Flash and ROM resources on the STM32 MCU, the deployment of the speech recognition model is optimized to meet the requirements of keyword recognition. First, MFCC (Mel Frequency Cepstral Coefficient) features are extracted from the audio obtained through sensors, and the extracted MFCC features are input into a CNN (Convolutional Neural Network) for deep feature extraction. The features are then passed to a fully connected layer, and finally the speech keyword is classified and predicted. Deployed on the STM32F429, the prediction model achieves an accuracy of 90.58%, a decrease of less than 1% compared with the accuracy of 91.49% obtained on a computer.
Abstract: To address the weaknesses of threshold-value recognition systems for short-range targets, a double-channel AND-gate recognition system is put forward, based on the correlation of target signals and the randomness of noise signals. Through state analysis and inference of the state transition probabilities, both the reliability and the early-burst probability of the system are obtained in theory.
Funding: Project (61072087) supported by the National Natural Science Foundation of China; Project (20093048) supported by the Shanxi Provincial Graduate Innovation Fund of China.
Abstract: Perceptual auditory filter banks such as the Bark-scale filter bank are widely used as front-end processing in speech recognition systems. However, the design of optimized filter banks that provide higher accuracy in recognition tasks is still an open problem. Building on spectral analysis in feature extraction, an adaptive bands filter bank (ABFB) is presented. The design adopts flexible bandwidths and center frequencies for the frequency responses of the filters and uses a genetic algorithm (GA) to optimize the design parameters. The optimization is realized by combining the front-end filter bank with the back-end recognition network in the performance evaluation loop. Deploying the ABFB together with the zero-crossing peak amplitude (ZCPA) feature as the front end of a radial basis function (RBF) system shows a significant improvement in robustness compared with the Bark-scale filter bank. In the ABFB, several sub-bands are still concentrated toward lower frequencies, but their exact locations are determined by performance rather than by perceptual criteria. For ease of optimization, only symmetrical bands are considered here, which still provide satisfactory results.
Funding: Project supported by the National High Technology Research and Development Program of China (Grant No. 2013AA030901) and the Fundamental Research Funds for the Central Universities, China (Grant No. FRF-TP-14-120A2).
Abstract: This paper presents a new pattern recognition system for Chinese spirit identification using an e-nose based on polymer quartz piezoelectric crystal sensors. The sensors are designed on the quartz crystal microbalance (QCM) principle and capture different vibration frequency signal values for Chinese spirit identification. For each sensor in an 8-channel sensor array, seven characteristic values of the original vibration frequency signal are first extracted: average value (A), root-mean-square value (RMS), shape factor (S_f), crest factor (C_f), impulse factor (I_f), clearance factor (CL_f), and kurtosis factor (K_v). The dimension of the characteristic values is then reduced by principal component analysis (PCA). Finally, a back propagation (BP) neural network is used to recognize the Chinese spirits. The experimental results show that the recognition rate for six kinds of Chinese spirits is 93.33% and that the proposed pattern recognition system can identify Chinese spirits effectively.
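The seven characteristic values named in the abstract above can be computed per sensor channel roughly as follows. This is a minimal sketch that assumes the definitions commonly used in vibration signal analysis; the abstract does not give the formulas, so the authors' exact definitions may differ, and the function name `qcm_features` is hypothetical. The PCA and BP network stages are not shown.

```python
import math
from typing import Dict, Sequence


def qcm_features(signal: Sequence[float]) -> Dict[str, float]:
    """Compute seven time-domain features of one channel's frequency signal.
    Assumes a non-constant, non-zero signal (otherwise a divisor is zero)."""
    n = len(signal)
    mean = sum(signal) / n
    abs_mean = sum(abs(v) for v in signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    sqrt_mean = sum(math.sqrt(abs(v)) for v in signal) / n
    variance = sum((v - mean) ** 2 for v in signal) / n
    fourth_moment = sum((v - mean) ** 4 for v in signal) / n
    return {
        "A": mean,                          # average value
        "RMS": rms,                         # root-mean-square value
        "S_f": rms / abs_mean,              # shape factor
        "C_f": peak / rms,                  # crest factor
        "I_f": peak / abs_mean,             # impulse factor
        "CL_f": peak / sqrt_mean ** 2,      # clearance factor
        "K_v": fourth_moment / variance ** 2,  # kurtosis factor
    }


features = qcm_features([1.0, -1.0, 1.0, -1.0])
```

With 8 channels and seven values each, every measurement would yield a 56-dimensional feature vector before dimensionality reduction.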
Funding: Project supported by the National High Technology Research and Development Program of China (Grant No. 2013AA030901).
Abstract: We present a new pattern recognition system based on the moving average and linear discriminant analysis (LDA), which can be used to process the original signal of the new polymer quartz piezoelectric crystal air-sensitive sensor system we designed, called the new e-nose. Using the new e-nose, we obtain the template data of Chinese spirits via the new pattern recognition system. To verify its effectiveness, we test three kinds of Chinese spirits; our results confirm that the new pattern recognition system can identify and distinguish between these Chinese spirits perfectly.
Abstract: The COVID-19 pandemic poses an additional serious public health threat because of little or no pre-existing human immunity, and a system that identifies COVID-19 in its early stages will save millions of lives. This study applied support vector machine (SVM), k-nearest neighbor (K-NN), and deep learning convolutional neural network (CNN) algorithms to classify and detect COVID-19 using chest X-ray radiographs. To test the proposed system, chest X-ray radiographs and CT images were collected from different standard databases, comprising 95 normal images, 140 COVID-19 images, and 10 SARS images. Two scenarios were considered in developing a system for predicting COVID-19. In the first scenario, a Gaussian filter was applied to remove noise from the chest X-ray radiographs, and the adaptive region growing technique was then used to segment the region of interest. After segmentation, hybrid feature extraction combining the 2D-DWT and the gray level co-occurrence matrix was used to extract the features significant for detecting COVID-19. These features were processed using SVM and K-NN. In the second scenario, a CNN transfer model (ResNet 50) was used to detect COVID-19. The system was examined and evaluated through multiclass statistical analysis, and the empirical results showed values of 97.14%, 99.34%, 99.26%, 99.26%, and 99.40% for accuracy, specificity, sensitivity, recall, and AUC, respectively. Thus, the CNN model showed significant success; it achieved optimal accuracy, effectiveness, and robustness for detecting COVID-19.
Funding: This work is supported by the National Natural Science Foundation of China under Grant Nos. U1636215 and 61902082, the Guangdong Key R&D Program of China (2019B010136003), and the National Key R&D Program of China (2019YFB1706003).
Abstract: The license plate recognition system (LPRS) has been widely adopted in daily life because of its efficiency and high accuracy. Deep neural networks are commonly used in the LPRS to improve recognition accuracy. However, researchers have found that deep neural networks have their own security problems that may lead to unexpected results. Specifically, they can be easily attacked by adversarial examples, which are generated by adding small perturbations to the original images and result in incorrect license plate recognition. There are some classic methods for generating adversarial examples, but they cannot be applied to the LPRS directly. In this paper, we modify some classic methods to generate adversarial examples that can mislead the LPRS. We conduct extensive evaluations on the HyperLPR system, and the results show that the system can be easily attacked by such adversarial examples. In addition, we show that the generated images can also attack black-box systems, giving examples in which the Baidu LPR system makes incorrect recognitions. We hope this paper helps improve the LPRS by raising awareness of such adversarial attacks.
Funding: Project (61170199) supported by the National Natural Science Foundation of China; Project (11A004) supported by the Scientific Research Fund of the Education Department of Hunan Province, China.
Abstract: To improve the resource allocation mechanism of the artificial immune recognition system (AIRS) and reduce the number of memory cells, an AIRS based on fuzzy logic resource allocation and memory cell pruning (FPAIRS) is proposed. In FPAIRS, the fuzzy logic is determined by a parameter, so the optimal fuzzy logic for a given problem can be found by changing the parameter value. At the same time, memory cells with low fitness scores are pruned to improve the classifier. The classifier was compared with other classifiers in terms of classification performance on six UCI datasets. The results show that the accuracies reached by FPAIRS are higher than or comparable to those of the other classifiers, and that the number of memory cells decreases compared with AIRS. The results show that the algorithm is a high-performance classifier.
Abstract: This study describes the development of a simple biometric facial recognition system, BFMT, which is designed for identifying individuals within a given population. The system is based on digital signatures derived from facial images of human subjects. The results of the study demonstrate that a particular set of facial features from a simple two-dimensional image can yield a unique digital signature, which can be used to identify a subject from a limited population within a controlled environment. The simplicity of the model on which the system is based could lead to commercial facial recognition systems that are more cost-effective to develop than those currently on the market.
Abstract: Designing accurate and time-efficient real-time traffic sign recognition systems is a crucial part of developing the intelligent vehicle, which is the main agent in the intelligent transportation system. Traffic sign recognition systems consist of an initial detection phase, in which images and colors are segmented, followed by a recognition phase that receives the segmented output. In terms of time consumption, the detection phase is the most challenging process in such systems. Previous studies, which proposed different methods for detecting traffic signs, trade off accuracy against computation time. Therefore, this paper presents a novel, accurate, and time-efficient color segmentation approach based on logistic regression. We used the RGB color space as the domain from which to extract the features of our hypothesis; this boosted the speed of our approach since no color conversion is needed. Our trained segmentation classifier was tested on 1000 traffic sign images taken in different lighting conditions. The results show that our approach segmented 974 of these images correctly, in less than one-fifth of the time needed by any other robust segmentation method.
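To illustrate the idea in the abstract above, a per-pixel logistic regression classifier on raw RGB values can be sketched as follows. This is a minimal sketch under stated assumptions: the tiny hand-made training set, the red-sign target color, the function names, and the hyperparameters are all illustrative and are not taken from the paper, whose features and training data are not described in the abstract.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(
    pixels: Sequence[Tuple[int, int, int]],
    labels: Sequence[int],
    lr: float = 1.0,
    epochs: int = 20000,
) -> Tuple[List[float], float]:
    """Fit weights w and bias b by batch gradient descent on the log loss.
    RGB values are only rescaled to [0, 1]; no color-space conversion."""
    xs = [[c / 255.0 for c in p] for p in pixels]
    w, b, n = [0.0, 0.0, 0.0], 0.0, len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0, 0.0], 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(3):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def is_sign_pixel(pixel: Tuple[int, int, int], w: List[float], b: float) -> bool:
    """Classify one RGB pixel as sign-colored (True) or background (False)."""
    z = sum(wi * c / 255.0 for wi, c in zip(w, pixel)) + b
    return sigmoid(z) >= 0.5


# Hypothetical training pixels: label 1 = red sign color, 0 = background.
PIXELS = [(200, 20, 20), (230, 40, 30), (180, 10, 40), (250, 60, 50),
          (30, 30, 200), (40, 180, 40), (120, 120, 120), (20, 20, 20),
          (240, 240, 240)]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0, 0]

weights, bias = train(PIXELS, LABELS)
```

Once trained, classifying a pixel costs three multiplications and a comparison, which is consistent with the abstract's point that skipping color conversion keeps segmentation fast.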
Funding: This paper was supported by the National Natural Science Foundation of China (No. 30371126).
Abstract: In forest variety registration, visual traits of a plant's appearance are widely used to discern different tree species. A new neural network-based leaf image recognition system is established to administer a hierarchical list of leaf images; edge detection can be performed to identify the individual tokens of every image, and the outline of the leaf can be obtained to differentiate the tree species. An approach based on a back-propagation neural network is proposed, and an implementation in the Java programming language is also given. The numerical simulation results show that the proposed leaf recognition strategy is effective and feasible.