An Approach for Human Posture Recognition Based on the Fusion PSE-CNN-BiGRU Model
1
Authors: Xianghong Cao, Xinyu Wang, Xin Geng, Donghui Wu, Houru An. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 7, pp. 385-408, 24 pages.
This study proposes a pose estimation-convolutional neural network-bidirectional gated recurrent unit (PSE-CNN-BiGRU) fusion model for human posture recognition to address low accuracy issues in abnormal posture recognition due to the loss of some feature information and the deterioration of comprehensive performance in model detection in complex home environments. Firstly, the deep convolutional network is integrated with the Mediapipe framework to extract high-precision, multi-dimensional information from the key points of the human skeleton, thereby obtaining a human posture feature set. Thereafter, a double-layer BiGRU algorithm is utilized to extract multi-layer, bidirectional temporal features from the human posture feature set, and a CNN with an exponential linear unit (ELU) activation function is adopted to perform deep convolution of the feature map to extract the spatial feature of the human posture. Furthermore, a squeeze and excitation networks (SENet) module is introduced to adaptively learn the importance weights of each channel, enhancing the network’s focus on important features. Finally, comparative experiments are performed on available datasets, including the public human activity recognition using smartphone dataset (UCIHAR), the public human activity recognition 70 plus dataset (HAR70PLUS), and the independently developed home abnormal behavior recognition dataset (HABRD) created by the authors’ team. The results show that the average accuracy of the proposed PSE-CNN-BiGRU fusion model for human posture recognition is 99.56%, 89.42%, and 98.90%, respectively, which are 5.24%, 5.83%, and 3.19% higher than the average accuracy of the five models proposed in the comparative literature, including CNN, GRU, and others. The F1-score for abnormal posture recognition reaches 98.84% (heartache), 97.18% (fall), 99.6% (bellyache), and 98.27% (climbing) on the self-built HABRD dataset, thus verifying the effectiveness, generalization, and robustness of the proposed model in enhancing human posture recognition.
Keywords: posture recognition; Mediapipe; BiGRU; CNN; ELU; attention
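The channel reweighting that the abstract above attributes to the SENet module can be illustrated with a minimal sketch. It follows the generic squeeze-and-excitation recipe (global average pooling, a small bottleneck with ReLU, a sigmoid gate per channel), not the authors' actual network. The function name `squeeze_excite`, the plain-list data layout, and the weight matrices `w1` and `w2` are assumptions made for illustration only.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Reweight the channels of a feature map (hypothetical helper).

    feature_map: list of channels, each a list of floats
    w1: bottleneck weights, shape [hidden][channels]  (assumed, untrained)
    w2: expansion weights, shape [channels][hidden]   (assumed, untrained)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation, step 1: bottleneck layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: one gate per channel, squashed into (0, 1) by a sigmoid.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_map)]
    return scaled, gates


# Toy input: two channels, one active and one silent.
toy_map = [[1.0, 3.0], [0.0, 0.0]]
toy_scaled, toy_gates = squeeze_excite(toy_map, [[1.0, 0.0]], [[1.0], [-1.0]])
```

In a trained network `w1` and `w2` are learned, so the gates reflect which channels help the task; here they are fixed by hand only to show the data flow.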
Cybernet Model:A New Deep Learning Model for Cyber DDoS Attacks Detection and Recognition
2
Authors: Azar Abid Salih, Maiwan Bahjat Abdulrazaq. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 1275-1295, 21 pages.
Cyberspace is extremely dynamic, with new attacks arising daily. Protecting cybersecurity controls is vital for network security. Deep Learning (DL) models find widespread use across various fields, with cybersecurity being one of the most crucial due to their rapid cyberattack detection capabilities on networks and hosts. The capabilities of DL in feature learning and analyzing extensive data volumes lead to the recognition of network traffic patterns. This study presents novel lightweight DL models, known as Cybernet models, for the detection and recognition of various cyber Distributed Denial of Service (DDoS) attacks. These models were constructed to have a reasonable number of learnable parameters, i.e., fewer than 225,000, hence the name “lightweight.” This not only helps reduce the number of computations required but also results in faster training and inference times. Additionally, these models were designed to extract features in parallel from 1D Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), which makes them unique compared to earlier existing architectures and results in better performance measures. To validate their robustness and effectiveness, they were tested on the CIC-DDoS2019 dataset, which is an imbalanced and large dataset that contains different types of DDoS attacks. Experimental results revealed that both models yielded promising results, with 99.99% for the detection model and 99.76% for the recognition model in terms of accuracy, precision, recall, and F1 score. Furthermore, they outperformed the existing state-of-the-art models proposed for the same task. Thus, the proposed models can be used in cyber security research domains to successfully identify different types of attacks with a high detection and recognition rate.
Keywords: deep learning; CNN; LSTM; Cybernet model; DDoS recognition
RoBGP:A Chinese Nested Biomedical Named Entity Recognition Model Based on RoBERTa and Global Pointer
3
Authors: Xiaohui Cui, Chao Song, Dongmei Li, Xiaolong Qu, Jiao Long, Yu Yang, Hanchao Zhang. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3603-3618, 16 pages.
Named Entity Recognition (NER) stands as a fundamental task within the field of biomedical text mining, aiming to extract specific types of entities such as genes, proteins, and diseases from complex biomedical texts and categorize them into predefined entity types. This process can provide basic support for the automatic construction of knowledge bases. In contrast to general texts, biomedical texts frequently contain numerous nested entities and local dependencies among these entities, presenting significant challenges to prevailing NER models. To address these issues, we propose a novel Chinese nested biomedical NER model based on RoBERTa and Global Pointer (RoBGP). Our model initially utilizes the RoBERTa-wwm-ext-large pretrained language model to dynamically generate word-level initial vectors. It then incorporates a Bidirectional Long Short-Term Memory network for capturing bidirectional semantic information, effectively addressing the issue of long-distance dependencies. Furthermore, the Global Pointer model is employed to comprehensively recognize all nested entities in the text. We conduct extensive experiments on the Chinese medical dataset CMeEE, and the results demonstrate the superior performance of RoBGP over several baseline models. This research confirms the effectiveness of RoBGP in Chinese biomedical NER, providing reliable technical support for biomedical information extraction and knowledge base construction.
Keywords: biomedicine; knowledge base; named entity recognition; pretrained language model; global pointer
3D Road Network Modeling and Road Structure Recognition in Internet of Vehicles
4
Authors: Dun Cao, Jia Ru, Jian Qin, Amr Tolba, Jin Wang, Min Zhu. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 2, pp. 1365-1384, 20 pages.
Internet of Vehicles (IoV) is a new system that enables individual vehicles to connect with nearby vehicles, people, transportation infrastructure, and networks, thereby realizing a more intelligent and efficient transportation system. The movement of vehicles and the three-dimensional (3D) nature of the road network cause the topological structure of IoV to have high space and time complexity. Network modeling and structure recognition for 3D roads can benefit the description of topological changes for IoV. This paper proposes a 3D general road model based on discrete points of roads obtained from GIS. First, the constraints imposed by 3D roads on moving vehicles are analyzed. Then the effects of road curvature radius (Ra), longitudinal slope (Slo), and length (Len) on speed and acceleration are studied. Finally, a general 3D road network model based on road section features is established. This paper also presents intersection and road section recognition methods based on the structural features of the 3D road network model and the road features. Real GIS data from a specific region of Beijing is adopted to create the simulation scenario, and the simulation results validate the general 3D road network model and the recognition method. Therefore, this work makes contributions to the field of intelligent transportation by providing a comprehensive approach to modeling the 3D road network and its topological changes in achieving efficient traffic flow and improved road safety.
Keywords: Internet of vehicles; road networks; 3D road model; structure recognition; GIS
Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition
5
Authors: Fatma Harby, Mansor Alohali, Adel Thaljaoui, Amira Samy Talaat. Computers, Materials & Continua (SCIE, EI), 2024, No. 2, pp. 2689-2719, 31 pages.
Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker’s emotional state. The examination of the emotional states of speakers holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior assessment. Accurately identifying emotions in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio signals effectively. Although these traits may improve their ability to perceive and interpret emotional depictions appropriately, MFCCs have some limitations. This study therefore aims to tackle the aforementioned issue by systematically picking multiple audio cues, enhancing the classifier model’s efficacy in accurately discerning human emotions. The utilized dataset is taken from the EMO-DB database. Preprocessing of the input speech is done using a 2D Convolutional Neural Network (CNN), which involves applying convolutional operations to spectrograms, as they afford a visual representation of the way the audio signal frequency content changes over time. The next step is the spectrogram data normalization, which is crucial for Neural Network (NN) training as it aids in faster convergence. Then the five auditory features MFCCs, Chroma, Mel-Spectrogram, Contrast, and Tonnetz are extracted from the spectrogram sequentially. The purpose of feature selection is to retain only dominant features by excluding the irrelevant ones. In this paper, the Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS) techniques were employed for selecting features from the multiple audio cues. Finally, the feature sets composed from the hybrid feature extraction methods are fed into the deep Bidirectional Long Short Term Memory (Bi-LSTM) network to discern emotions. Since the deep Bi-LSTM can hierarchically learn complex features and increases model capacity by achieving more robust temporal modeling, it is more effective than a shallow Bi-LSTM in capturing the intricate tones of emotional content existent in speech signals. The effectiveness and resilience of the proposed SER model were evaluated by experiments, comparing it to state-of-the-art SER techniques. The results indicated that the model achieved accuracy rates of 90.92%, 93%, and 92% over the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Berlin Database of Emotional Speech (EMO-DB), and The Interactive Emotional Dyadic Motion Capture (IEMOCAP) datasets, respectively. These findings signify a prominent enhancement in the ability to identify emotional depictions in speech, showcasing the potential of the proposed model in advancing the SER field.
Keywords: artificial intelligence application; multi features; sequential selection; speech emotion recognition; deep Bi-LSTM
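Sequential Forward Selection, named in the abstract above, is a greedy procedure: start with no features and repeatedly add the one that most improves a scoring function. The sketch below shows that loop only. The abstract does not state the selection criterion, so the `toy_score` function and its usefulness values are entirely made up for illustration; in practice the score would typically be a validation metric of the classifier.

```python
def sequential_forward_selection(features, score, k):
    """Greedily pick k features that maximise score(subset)."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Try each remaining feature and keep the one giving the best subset score.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical usefulness values; NOT taken from the paper.
TOY_GAIN = {"mfcc": 0.5, "chroma": 0.2, "mel": 0.45, "contrast": 0.3, "tonnetz": 0.1}


def toy_score(subset):
    """Stand-in for a validation metric: sum of gains minus a redundancy penalty."""
    total = sum(TOY_GAIN[f] for f in subset)
    if "mfcc" in subset and "mel" in subset:
        total -= 0.4  # pretend MFCC and Mel-spectrogram carry overlapping information
    return total


chosen = sequential_forward_selection(list(TOY_GAIN), toy_score, k=3)
```

Sequential Backward Selection is the mirror image: start from the full set and repeatedly drop the feature whose removal hurts the score least.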
Design and Implementation of Hand Gesture Detection System Using HM Model for Sign Language Recognition Development
6
Authors: Sharmin Akter Milu, Azmath Fathima, Tanmay Talukder, Inzamamul Islam, Md. Ismail Siddiqi Emon. Journal of Data Analysis and Information Processing, 2024, No. 2, pp. 139-150, 12 pages.
Gesture detection is the primary and most significant step for sign language detection, and sign language is the communication medium for people with speaking and hearing disabilities. This paper presents a novel method for dynamic hand gesture detection using Hidden Markov Models (HMMs), where we detect different English alphabet letters by tracing hand movements. The process involves skin color-based segmentation for hand isolation in video frames, followed by morphological operations to enhance image trajectories. Our system employs hand tracking and trajectory smoothing techniques, such as the Kalman filter, to monitor hand movements and refine gesture paths. Quantized sequences are then analyzed using the Baum-Welch Re-estimation Algorithm, an HMM-based approach. A maximum likelihood classifier is used to identify the most probable letter from the test sequences. Our method demonstrates significant improvements over traditional recognition techniques in real-time, automatic hand gesture recognition, particularly in its ability to distinguish complex gestures. The experimental results confirm the effectiveness of our approach in enhancing gesture-based sign language detection to alleviate the barrier between the deaf and hard-of-hearing community and the general public.
Keywords: hand gesture recognition system
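The maximum likelihood classification step mentioned in the abstract above can be sketched as follows: each letter has its own HMM, the forward algorithm computes the probability of a quantized observation sequence under each model, and the letter with the highest probability wins. This sketch assumes the HMM parameters are already trained (for example by Baum-Welch, which is not shown). The two toy models, their labels, and the two-symbol alphabet are illustrative assumptions.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | HMM) using the forward algorithm.

    start[i]: initial probability of state i
    trans[i][j]: probability of moving from state i to state j
    emit[i][o]: probability that state i emits symbol o
    """
    n = len(start)
    # Initialise with the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    # Propagate through the remaining observations.
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Pick the label whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))


# Two hand-made left-to-right models over symbols {0, 1}; values are illustrative.
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
TOY_MODELS = {
    "A": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),  # mostly 0s, then 1s
    "B": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),  # mostly 1s, then 0s
}

predicted = classify([0, 0, 1, 1], TOY_MODELS)
```

For long sequences the raw probabilities underflow, so practical implementations work with log-probabilities or rescale `alpha` at each step.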
Analysis of the Design and Implementation of a GIS System Incorporating Intelligent Recognition Models
7
Author: Baoshan Zeng. Journal of Electronic Research and Application, 2024, No. 2, pp. 62-67, 6 pages.
Rapid economic growth, urbanization, and industrialization have led to a scarcity of land resources in coastal areas, exacerbating the conflict between humans and the environment. In order to promote economic development, attention has turned to the sea, and various coastal engineering projects have been undertaken, sparking a wave of land reclamation. However, while these efforts bring economic and social benefits, they also have implications for ecological relationships. To respond to and plan for changes in the coastline and land cover in a timely manner, this paper proposes and constructs a GIS system that integrates remote sensing image recognition models. The system combines geographic information system development technology with image recognition technology, streamlining the processing and identification of image data. This approach is particularly advantageous for marine management departments in their long-term monitoring and dynamic management of coastal lines, ensuring a more effective and efficient response.
Keywords: GIS; image recognition; image data; system construction
Diffraction deep neural network based orbital angular momentum mode recognition scheme in oceanic turbulence
8
Authors: 詹海潮, 陈兵, 彭怡翔, 王乐, 王文鼐, 赵生妹. Chinese Physics B (SCIE, EI, CAS, CSCD), 2023, No. 4, pp. 364-369, 6 pages.
Orbital angular momentum (OAM) has the characteristics of mutual orthogonality between modes, and has been applied to underwater wireless optical communication (UWOC) systems to increase the channel capacity. In this work, we propose a diffractive deep neural network (DDNN) based OAM mode recognition scheme, where the DDNN is trained to capture the features of the intensity distribution of the OAM modes and output the corresponding azimuthal indices and radial indices. The results show that the proposed scheme can recognize the azimuthal indices and radial indices of the OAM modes accurately and quickly. In addition, the proposed scheme can resist weak oceanic turbulence (OT), and exhibits excellent ability to recognize OAM modes in a strong OT environment. The DDNN-based OAM mode recognition scheme has potential applications in UWOC systems.
Keywords: orbital angular momentum; diffractive deep neural network; mode recognition; oceanic turbulence
A Federated Named Entity Recognition Model with Explicit Relation for Power Grid (cited by: 1)
9
Authors: Jingtang Luo, Shiying Yao, Changming Zhao, Jie Xu, Jim Feng. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 4207-4216, 10 pages.
The power grid operation process is complex, and many operation process data involve national security, business secrets, and user privacy. Meanwhile, labeled datasets may exist in many different operation platforms, but they cannot be directly shared since power grid data is highly privacy-sensitive. How to use these multi-source heterogeneous data as much as possible to build a power grid knowledge map under the premise of protecting privacy security has become an urgent problem in developing the smart grid. Therefore, this paper proposes a federated learning named entity recognition method for the power grid field, aiming to solve the problem of building a named entity recognition model that covers the entire power grid process and is trained on data with different security requirements. We decompose the named entity recognition (NER) model FLAT (Chinese NER Using Flat-Lattice Transformer) in each platform into a global part and a local part. The local part is used to capture the characteristics of the local data in each platform and is updated using locally labeled data. The global part is learned across different operation platforms to capture the shared NER knowledge. Its local gradients from different platforms are aggregated to update the global model, which is further delivered to each platform to update their global part. Experiments on two publicly available Chinese datasets and one power grid dataset validate the effectiveness of our method.
Keywords: power grid; named entity recognition; federated learning
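The aggregation step described in the abstract above (gradients of the shared global part are collected from each platform, combined, and used to update the global model) can be sketched in a few lines. The abstract does not say how the gradients are combined, so the weighted average below is an assumption in the style of common federated learning setups; the function names, the learning rate, and the flat-list parameter layout are also illustrative.

```python
def aggregate_gradients(platform_grads, weights=None):
    """Combine per-platform gradients of the global part by weighted averaging.

    platform_grads: one gradient vector (list of floats) per platform
    weights: optional per-platform weights, e.g. local dataset sizes (assumed)
    """
    if weights is None:
        weights = [1.0] * len(platform_grads)
    total = sum(weights)
    dim = len(platform_grads[0])
    return [
        sum(w * g[i] for w, g in zip(weights, platform_grads)) / total
        for i in range(dim)
    ]


def update_global(global_params, platform_grads, lr=0.1, weights=None):
    """Apply one gradient-descent step to the shared parameters."""
    avg = aggregate_gradients(platform_grads, weights)
    return [p - lr * g for p, g in zip(global_params, avg)]


# Two platforms report gradients for a two-parameter global part.
new_global = update_global([1.0, 1.0], [[1.0, 2.0], [3.0, 4.0]], lr=0.5)
```

Only gradients leave each platform in this scheme; the raw labeled text and the local part of the model stay where they are, which is the privacy argument the abstract makes.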
The Efficacy of Deep Learning-Based Mixed Model for Speech Emotion Recognition (cited by: 1)
10
Authors: Mohammad Amaz Uddin, Mohammad Salah Uddin Chowdury, Mayeen Uddin Khandaker, Nissren Tamam, Abdelmoneim Sulieman. Computers, Materials & Continua (SCIE, EI), 2023, No. 1, pp. 1709-1722, 14 pages.
Human speech indirectly represents the mental state or emotion of others. The use of Artificial Intelligence (AI)-based techniques may bring revolution in this modern era by recognizing emotion from speech. In this study, we introduced a robust method for emotion recognition from human speech using a well-performed preprocessing technique together with the deep learning-based mixed model consisting of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). About 2800 audio files were extracted from the Toronto emotional speech set (TESS) database for this study. A high-pass filter and a Savitzky-Golay filter have been used to obtain noise-free as well as smooth audio data. A total of seven types of emotions were used in this study: Angry, Disgust, Fear, Happy, Neutral, Pleasant-surprise, and Sad. Energy, Fundamental frequency, and Mel Frequency Cepstral Coefficient (MFCC) have been used to extract the emotion features, and these features resulted in 97.5% accuracy in the mixed LSTM+CNN model. This mixed model is found to perform better than the usual state-of-the-art models in emotion recognition from speech. It also indicates that this mixed model could be effectively utilized in advanced research dealing with sound processing.
Keywords: emotion recognition; Savitzky Golay; fundamental frequency; MFCC; neural networks
Speech Recognition via CTC-CNN Model
11
Authors: Wen-Tsai Sung, Hao-Wei Kang, Sung-Jung Hsiao. Computers, Materials & Continua (SCIE, EI), 2023, No. 9, pp. 3833-3858, 26 pages.
In the speech recognition system, the acoustic model is an important underlying model, and its accuracy directly affects the performance of the entire system. This paper introduces the construction and training process of the acoustic model in detail and studies the connectionist temporal classification (CTC) algorithm, which plays an important role in the end-to-end framework. It establishes a convolutional neural network (CNN) acoustic model combined with connectionist temporal classification to improve the accuracy of speech recognition. This study uses a sound sensor, ReSpeaker Mic Array v2.0.1, to convert the collected speech signals into text or corresponding speech signals to improve communication and reduce noise and hardware interference. The baseline acoustic model in this study faces challenges such as long training time, high error rate, and a certain degree of overfitting. The model is trained through continuous design and improvement of the relevant parameters of the acoustic model, and finally a model with excellent performance is selected according to the evaluation index, which reduces the error rate to about 18%, thus improving the accuracy rate. Finally, comparative verification was carried out from the selection of acoustic feature parameters, the selection of modeling units, and the speaker’s speech rate, which further verified the excellent performance of the CTC-CNN_5+BN+Residual model structure. In terms of experiments, to train and verify the CTC-CNN baseline acoustic model, this study uses the THCHS-30 and ST-CMDS speech data sets as training data sets. After 54 epochs of training, the word error rate of the acoustic model on the training set is 31%, and the word error rate on the test set is stable at about 43%. This experiment also considers the surrounding environmental noise. Under the noise level of 80-90 dB, the accuracy rate is 88.18%, which is the worst performance among all levels. In contrast, at 40-60 dB, the accuracy was as high as 97.33% due to less noise pollution.
Keywords: artificial intelligence; speech recognition; speech to text; convolutional neural network; automatic speech recognition
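Two ideas from the abstract above lend themselves to a short sketch: how a CTC output is turned into a label sequence, and how an error rate is computed. The decoder below uses the standard greedy CTC rule (take the best label per frame, merge consecutive repeats, drop blanks); the error rate is edit distance divided by reference length. Both are generic illustrations, not the paper's implementation, and the choice of index 0 as the blank label is an assumption.

```python
def ctc_greedy_decode(frame_probs, blank=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, remove blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded = []
    prev = None
    for idx in best:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded


def error_rate(reference, hypothesis):
    """Edit distance between two sequences divided by the reference length."""
    prev_row = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, 1):
        row = [i]
        for j, h in enumerate(hypothesis, 1):
            row.append(min(prev_row[j] + 1,               # deletion
                           row[j - 1] + 1,                # insertion
                           prev_row[j - 1] + (r != h)))   # substitution or match
        prev_row = row
    return prev_row[-1] / len(reference)


# Seven frames over three classes (0 = blank); frame-wise best labels are 1 1 0 1 2 2 0.
toy_frames = [
    [0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.9, 0.05, 0.05], [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8], [0.2, 0.1, 0.7], [0.8, 0.1, 0.1],
]
toy_labels = ctc_greedy_decode(toy_frames)
```

The blank between the two runs of label 1 is what lets CTC output the same label twice in a row; without it the repeats would be merged into one.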
Modified Wild Horse Optimization with Deep Learning Enabled Symmetric Human Activity Recognition Model
12
Authors: Bareen Shamsaldeen Tahir, Zainab Salih Ageed, Sheren Sadiq Hasan, Subhi R. M. Zeebaree. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 4009-4024, 16 pages.
Traditional indoor human activity recognition (HAR) is a time-series data classification problem and needs feature extraction. Presently, considerable attention has been given to the domain of HAR due to its numerous uses in real-time applications, namely surveillance by authorities, biometric user identification, and health monitoring of older people. The extensive usage of the Internet of Things (IoT) and wearable sensor devices has made the topic of HAR a vital subject in ubiquitous and mobile computing. Deep learning (DL) has recently become the most commonly utilized inference and problem-solving technique in HAR systems. The study develops a Modified Wild Horse Optimization with DL Aided Symmetric Human Activity Recognition (MWHODL-SHAR) model. The major intention of the MWHODL-SHAR model lies in the recognition of symmetric activities, namely jogging, walking, standing, sitting, etc. In the presented MWHODL-SHAR technique, the human activities data is pre-processed in various stages to make it compatible for further processing. A convolution neural network with an attention-based long short-term memory (CNN-ALSTM) model is applied for activity recognition. The MWHO algorithm is utilized as a hyperparameter tuning strategy to improve the detection rate of the CNN-ALSTM algorithm. The experimental validation of the MWHODL-SHAR technique is simulated using a benchmark dataset. An extensive comparison study revealed the betterment of the MWHODL-SHAR technique over other recent approaches.
Keywords: human activity recognition; symmetry; deep learning; machine learning; pattern recognition; time series classification
Human-Computer Interaction Using Deep Fusion Model-Based Facial Expression Recognition System
13
Authors: Saiyed Umer, Ranjeet Kumar Rout, Shailendra Tiwari, Ahmad Ali AlZubi, Jazem Mutared Alanazi, Kulakov Yurii. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 5, pp. 1165-1185, 21 pages.
A deep fusion model is proposed for a facial expression-based human-computer interaction system. Initially, image preprocessing, i.e., the extraction of the facial region from the input image, is utilized. Thereafter, the extraction of more discriminative and distinctive deep learning features is achieved using extracted facial regions. To prevent overfitting, in-depth features of facial images are extracted and assigned to the proposed convolutional neural network (CNN) models. Various CNN models are then trained. Finally, the performance of each CNN model is fused to obtain the final decision for the seven basic classes of facial expressions, i.e., fear, disgust, anger, surprise, sadness, happiness, neutral. For experimental purposes, three benchmark datasets, i.e., SFEW, CK+, and KDEF, are utilized. The performance of the proposed system is compared with some state-of-the-art methods concerning each dataset. Extensive performance analysis reveals that the proposed system outperforms the competitive methods in terms of various performance metrics. Finally, the proposed deep fusion model is being utilized to control a music player using the recognized emotions of the users.
Keywords: deep learning; facial expression; emotions; recognition; CNN
Crop Disease Recognition Based on Improved Model-Agnostic Meta-Learning
14
Authors: Xiuli Si, Biao Hong, Yuanhui Hu, Lidong Chu. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 6101-6118, 18 pages.
Currently, one of the most severe problems in the agricultural industry is the effect of diseases and pests on global crop production and economic development. Therefore, further research in the field of crop disease and pest detection is necessary to address the mentioned problem. Aiming to identify the diseased crops and insect pests timely and accurately and perform appropriate prevention measures to reduce the associated losses, this article proposes a Model-Agnostic Meta-Learning (MAML) attention model based on the meta-learning paradigm. The proposed model combines meta-learning with basic learning and adopts an Efficient Channel Attention (ECA) module. The module follows the local cross-channel interactive strategy of non-dimensional reduction to strengthen the weight parameters corresponding to certain disease characteristics. The proposed meta-learning-based algorithm has the advantage of strong generalization capability and, by integrating the ECA module in the original model, can achieve more efficient detection in new tasks. The proposed model is verified by experiments, and the experimental results show that, compared with the original MAML model, the proposed improved MAML-Attention model performs better by 1.8-9.31 percentage points in different classification tasks; the maximum accuracy is increased by 1.15-8.2 percentage points. The experimental results verify the strong generalization ability and good robustness of the proposed MAML-Attention model. Compared to the other few-shot methods, the proposed MAML-Attention performs better.
Keywords: meta-learning; disease image recognition; deep learning; attention mechanism
ASL Recognition by the Layered Learning Model Using Clustered Groups
15
Authors: Jungsoo Shin, Jaehee Jung. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 4, pp. 51-68, 18 pages.
American Sign Language (ASL) images can be used as a communication tool by determining numbers and letters using the shape of the fingers. Particularly, ASL can have a key role in communication for hearing-impaired persons and conveying information to other persons, because sign language is their only channel of expression. Representative ASL recognition methods primarily adopt images, sensors, and pose-based recognition techniques, and employ various gestures together with hand-shapes. This study briefly reviews these attempts at ASL recognition and provides an improved ASL classification model that attempts to develop a deep learning method with meta-layers. In the proposed model, the collected ASL images were clustered based on similarities in shape, and clustered group classification was first performed, followed by reclassification within the group. The experiments were conducted with various groups using different learning layers to improve the accuracy of individual image recognition. After selecting the optimized group, we proposed a meta-layered learning model with the highest recognition rate using a deep learning method of image processing. The proposed model exhibited an improved performance compared with the general classification model.
Keywords: American sign language; deep learning; recognition; CNN; ResNet; clustered group
Recognition model and algorithm of projectiles by combining particle swarm optimization support vector and spatial-temporal constrain
16
作者 Han-shan Li 《Defence Technology(防务技术)》 SCIE EI CAS CSCD 2023年第9期273-283,共11页
In order to improve the recognition rate and accuracy rate of projectiles in six sky-screens intersection test system,this work proposes a new recognition method of projectiles by combining particle swarm optimization... In order to improve the recognition rate and accuracy rate of projectiles in six sky-screens intersection test system,this work proposes a new recognition method of projectiles by combining particle swarm optimization support vector and spatial-temporal constrain of six sky-screens detection sensor.Based on the measurement principle of the six sky-screens intersection test system and the characteristics of the output signal of the sky-screen,we analyze the existing problems regarding the recognition of projectiles.In order to optimize the projectile recognition effect,we use the support vector machine and basic particle swarm algorithm to form a new recognition algorithm.We set up the particle swarm algorithm optimization support vector projectile information recognition model that conforms to the six sky-screens intersection test system.We also construct a spatial-temporal constrain matching model based on the spatial geometric relationship of six sky-screen intersection,and form a new projectile signal recognition algorithm with six sky-screens spatial-temporal information constraints under the signal classification mechanism of particle swarm optimization algorithm support vector machine.Based on experiments,we obtain the optimal penalty and kernel function radius parameters in the PSO-SVM algorithm;we adjust the parameters of the support vector machine model,train the test signal data of every sky-screen,and gain the projectile signal classification results.Afterwards,according to the signal classification results,we calculate the coordinate parameters of the real projectile by using the spatial-temporal constrain of six sky-screens detection sensor,which verifies the feasibility of the proposed algorithm. 展开更多
Keywords: Six sky-screens intersection test system; Pattern recognition; Particle swarm optimization; Support vector machine; Projectile
Performance Analysis of a Chunk-Based Speech Emotion Recognition Model Using RNN
17
Authors: Hyun-Sam Shin, Jun-Ki Hong. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 4, pp. 235-248 (14 pages)
Recently, artificial-intelligence-based automatic customer response systems have been widely used in place of customer service representatives. It is therefore important for automatic customer service to promptly recognize emotions in a customer’s voice so that it can provide the appropriate service. We analyzed emotion recognition (ER) accuracy as a function of simulation time using the proposed chunk-based speech ER (CSER) model. The proposed CSER model divides voice signals into 3-s long chunks to efficiently recognize the emotions inherent in the customer’s voice. We evaluated the ER performance on voice signal chunks by individually applying four RNN techniques to the proposed CSER model, namely long short-term memory (LSTM), bidirectional LSTM, gated recurrent units (GRU), and bidirectional GRU, to assess its ER accuracy and time efficiency. The results reveal that GRU shows the best time efficiency in recognizing emotions from speech signals in terms of accuracy as a function of simulation time.
Keywords: RNN; speech emotion recognition; attention mechanism; time efficiency
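The chunking step described in this abstract (dividing a voice signal into 3-s chunks) can be illustrated with a minimal sketch. The function name `split_into_chunks`, the non-overlapping layout, and the default of dropping a short trailing chunk are assumptions made for illustration; the abstract does not say how the authors handle overlap or remainders.

```python
from typing import List, Sequence


def split_into_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_seconds: float = 3.0,
    keep_remainder: bool = False,
) -> List[Sequence[float]]:
    """Split a 1-D signal into consecutive, non-overlapping fixed-length chunks."""
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")

    chunks = [
        samples[start:start + chunk_len]
        for start in range(0, len(samples), chunk_len)
    ]

    # Assumption: a trailing chunk shorter than chunk_len is dropped unless
    # the caller asks to keep it (the source does not specify either choice).
    if chunks and len(chunks[-1]) < chunk_len and not keep_remainder:
        chunks.pop()
    return chunks
```

Each chunk would then be converted to features and passed to the recurrent classifier (LSTM, GRU, or a bidirectional variant) on its own, so that a prediction is available after every 3 s of audio.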
TC-Net: A Modest & Lightweight Emotion Recognition System Using Temporal Convolution Network
18
Authors: Muhammad Ishaq, Mustaqeem Khan, Soonil Kwon. Computer Systems Science & Engineering, SCIE EI, 2023, Issue 9, pp. 3355-3369 (15 pages)
Speech signals play an essential role in communication and provide an efficient way to exchange information between humans and machines. Speech Emotion Recognition (SER) is one of the critical sources for human evaluation and is applicable in many real-world settings such as healthcare, call centers, robotics, safety, and virtual reality. This work developed a novel TCN-based emotion recognition system that uses speech signals and a spatial-temporal convolution network to recognize the speaker’s emotional state. The authors designed a Temporal Convolutional Network (TCN) core block to recognize long-term dependencies in speech signals and then feed these temporal cues to a dense network that fuses the spatial features and recognizes global information for final classification. The proposed network extracts valid sequential cues automatically from speech signals and performs better than state-of-the-art (SOTA) and traditional machine learning algorithms. Results of the proposed method show a high recognition rate compared with SOTA methods. The final unweighted accuracies of 80.84% and 92.31% on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) and Berlin emotional dataset (EMO-DB), respectively, indicate the robustness and efficiency of the designed model.
Keywords: Affective computing; deep learning; emotion recognition; speech signal; temporal convolutional network
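The abstract credits the TCN core block with capturing long-term dependencies. In the commonly described TCN formulation this comes from stacking causal, dilated 1-D convolutions; the sketch below shows that generic operation in plain Python. It is not the authors' block: the abstract does not give kernel sizes, dilation rates, or layer counts, and the helper names `causal_dilated_conv1d` and `receptive_field` are hypothetical.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Positions before the start of the signal are treated as zero, so each
    output depends only on the current and earlier inputs (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input steps seen by one output after stacking one
    causal convolution per dilation rate (illustrative assumption)."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With a kernel size of 2 and dilations 1, 2, 4, 8, one output step already covers 16 input steps, which is why doubling the dilation per layer is a common way to reach long contexts with few layers.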
Earthworm Optimization with Improved SqueezeNet Enabled Facial Expression Recognition Model
19
Authors: N. Sharmili, Saud Yonbawi, Sultan Alahmari, E. Laxmi Lydia, Mohamad Khairi Ishak, Hend Khalid Alkahtani, Ayman Aljarbouh, Samih M. Mostafa. Computer Systems Science & Engineering, SCIE EI, 2023, Issue 8, pp. 2247-2262 (16 pages)
Facial expression recognition (FER) remains an active research area in computer vision and is still challenging because of high intraclass variation. Conventional techniques for this problem depend on hand-crafted features, namely LBP, SIFT, and HOG, together with a classifier trained on a database of videos or images. Many perform well on image datasets captured under controlled conditions but not on more challenging datasets that contain partial faces and image variation. Recently, many studies have presented end-to-end structures for facial expression recognition based on DL methods. Therefore, this study develops an earthworm optimization with an improved SqueezeNet-based FER (EWOISN-FER) model. The presented EWOISN-FER model first applies the contrast-limited adaptive histogram equalization (CLAHE) technique as a pre-processing step. In addition, the improved SqueezeNet model is used to derive an optimal set of feature vectors, and the hyperparameter tuning process is performed by the stochastic gradient boosting (SGB) model. Finally, EWO with a sparse autoencoder (SAE) is employed for the FER process, with the EWO algorithm choosing the SAE parameters. A wide-ranging experimental analysis is carried out to examine the performance of the proposed model. The experimental outcomes indicate the superiority of the presented EWOISN-FER technique.
Keywords: Facial expression recognition; deep learning; computer vision; earthworm optimization; hyperparameter optimization
A Robust Conformer-Based Speech Recognition Model for Mandarin Air Traffic Control
20
Authors: Peiyuan Jiang, Weijun Pan, Jian Zhang, Teng Wang, Junxiang Huang. Computers, Materials & Continua, SCIE EI, 2023, Issue 10, pp. 911-940 (30 pages)
This study aims to address the deviation in downstream tasks caused by inaccurate recognition results when applying Automatic Speech Recognition (ASR) technology in the Air Traffic Control (ATC) field. This paper presents a novel cascaded model architecture, namely Conformer-CTC/Attention-T5 (CCAT), to build a highly accurate and robust ATC speech recognition model. To tackle the challenges posed by noise and fast speech rate in ATC, the Conformer model is employed to extract robust and discriminative speech representations from raw waveforms. On the decoding side, the Attention mechanism is integrated to facilitate precise alignment between input features and output characters. The Text-To-Text Transfer Transformer (T5) language model is also introduced to handle particular pronunciations and code-mixing issues, providing more accurate and concise textual output for downstream tasks. To enhance the model’s robustness, transfer learning and data augmentation techniques are utilized in the training strategy. The model’s performance is optimized through hyperparameter tuning, such as adjusting the number of attention heads, the number of encoder layers, and the weights of the loss function. The experimental results demonstrate the significant contributions of data augmentation, hyperparameter tuning, and error correction models to the overall model performance. On the Our ATC Corpus dataset, the proposed model achieves a Character Error Rate (CER) of 3.44%, representing a 3.64% improvement compared to the baseline model. Moreover, the effectiveness of the proposed model is validated on two publicly available datasets. On the AISHELL-1 dataset, the CCAT model achieves a CER of 3.42%, showcasing a 1.23% improvement over the baseline model. Similarly, on the LibriSpeech dataset, the CCAT model achieves a Word Error Rate (WER) of 5.27%, demonstrating a performance improvement of 7.67% compared to the baseline model. Additionally, this paper proposes an evaluation criterion for assessing the robustness of ATC speech recognition systems. In robustness evaluation experiments based on this criterion, the proposed model demonstrates a performance improvement of 22% compared to the baseline model.
Keywords: Air traffic control; automatic speech recognition; Conformer; robustness evaluation; T5 error correction model
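The abstract reports results as Character Error Rate (CER) and Word Error Rate (WER). Both are conventionally defined as the Levenshtein edit distance between reference and hypothesis divided by the reference length, counted in characters or words. The sketch below follows that standard definition; the function names are illustrative, and any text normalization the authors apply before scoring is not described in the abstract and is not modeled here.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: tokens are individual characters."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())
```

CER is the usual choice for Mandarin corpora such as AISHELL-1, where word boundaries are not marked by spaces, while WER is standard for English corpora such as LibriSpeech.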