This research aims to enhance Clinical Decision Support Systems (CDSS) within Wireless Body Area Networks (WBANs) by leveraging advanced machine learning techniques. Specifically, we target the challenges of accurate diagnosis in medical imaging and sequential data analysis using Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) layers and echo state cells. These models are tailored to improve diagnostic precision, particularly for conditions like rotator cuff tears in osteoporosis patients and gastrointestinal diseases. Traditional diagnostic methods and existing CDSS frameworks often fall short in managing complex, sequential medical data, struggling with long-term dependencies and data imbalances, resulting in suboptimal accuracy and delayed decisions. Our goal is to develop Artificial Intelligence (AI) models that address these shortcomings, offering robust, real-time diagnostic support. We propose a hybrid RNN model that integrates SimpleRNN, LSTM layers, and echo state cells to manage long-term dependencies effectively. Additionally, we introduce CG-Net, a novel Convolutional Neural Network (CNN) framework for gastrointestinal disease classification, which outperforms traditional CNN models. We further enhance model performance through data augmentation and transfer learning, improving generalization and robustness against data scarcity and imbalance. Comprehensive validation, including 5-fold cross-validation and metrics such as accuracy, precision, recall, F1-score, and Area Under the Curve (AUC), confirms the models’ reliability. Moreover, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) are employed to improve model interpretability. Our findings show that the proposed models significantly enhance diagnostic accuracy and efficiency, offering substantial advancements in WBANs and CDSS.
Non-blind audio bandwidth extension is a standard technique within contemporary audio codecs to efficiently code audio signals at low bitrates. In most existing methods, the high-frequency signal is generated from a duplicate of the corresponding low frequencies together with some high-frequency parameters. However, the perceptual quality of the coding degrades significantly if the correlation between high and low frequencies becomes weak. In this paper, we quantitatively analyse the correlation by computing the mutual information. The analysis results show that the correlation also exists in the low-frequency signal of the context-dependent frames, not only in the current frame. To improve the perceptual quality of the coding, we propose a novel method of coarse high-frequency spectrum generation that improves on the conventional replication method. In the proposed method, the coarse high-frequency spectra are generated by a nonlinear mapping model using a deep recurrent neural network. The experiments confirm that the proposed method performs better than the reference methods.
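The mutual-information analysis mentioned in this abstract can be illustrated with a plug-in estimator over paired discrete symbols. This is only a minimal sketch: the abstract does not say how the spectra were discretized or which estimator was used, so the function name `mutual_information` and the idea of feeding it quantized low-band and high-band values are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    xs, ys: equal-length sequences of hashable symbols (hypothetically,
    quantized low-band and high-band spectral values per frame).
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    count_x = Counter(xs)          # marginal counts of x
    count_y = Counter(ys)          # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Fully dependent symbols share one bit; independent ones share none.
dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near zero would correspond to the weak-correlation case in which replication-based bandwidth extension is expected to degrade.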
Accurately predicting fluid forces acting on the surface of a structure is crucial in engineering design. However, this task becomes particularly challenging in turbulent flow, due to the complex and irregular changes in the flow field. In this study, we propose a novel deep learning method, named mapping network-coordinated stacked gated recurrent units (MSU), for predicting pressure on a circular cylinder from velocity data. Specifically, our coordinated learning strategy is designed to extract the most critical velocity point for prediction, a process that has not been explored before. In our experiments, MSU extracts one point from a velocity field containing 121 points and utilizes this point to accurately predict 100 pressure points on the cylinder. This method significantly reduces the workload of data measurement in practical engineering applications. Our experimental results demonstrate that MSU predictions are highly similar to the real turbulent data in both spatio-temporal and individual aspects. Furthermore, the comparison results show that MSU predicts more precise results, even outperforming models that use all velocity field points. Compared with state-of-the-art methods, MSU has an average improvement of more than 45% in various indicators such as root mean square error (RMSE). Through comprehensive and authoritative physical verification, we established that MSU’s prediction results closely align with pressure field data obtained in real turbulence fields. This confirmation underscores the considerable potential of MSU for practical applications in real engineering scenarios. The code is available at https://github.com/zhangzm0128/MSU.
The regulatory role of micro-RNAs (miRNAs) in messenger RNA (mRNA) gene expression has been well understood by biologists for some decades, although specific aspects are still being investigated. Clustering is a cornerstone of bioinformatics research, offering a potent computational tool for analyzing the diverse types of data encountered in genomics and related fields. MiRNA clustering plays a pivotal role in deciphering the intricate regulatory roles of miRNAs in biological systems. It uncovers novel biomarkers for disease diagnosis and prognosis and advances our understanding of gene regulatory networks and pathways implicated in health and disease, as well as drug discovery. Specifically, we have implemented a clustering procedure to find interrelations among miRNAs within clusters, and their relations to diseases. Deep clustering (DC) algorithms signify a departure from traditional clustering methods towards more sophisticated techniques that can uncover intricate patterns and relationships within gene expression data. Deep learning (DL) models have shown remarkable success in various domains, and their application in genomics, especially for tasks like clustering, holds immense promise. The deep convolutional clustering procedure used differs from traditional methods and demonstrates unbiased clustering results. In the paper, we apply the procedure to a Multiple Myeloma miRNA dataset publicly available on the GEO platform, as a template for the analysis of a cancer instance, and venture some biological considerations.
The remaining useful life (RUL) of a system is generally predicted by utilising the data collected from the sensors that continuously monitor different indicators. Recently, different deep learning (DL) techniques have been used for RUL prediction and achieved great success. Because the data is often time-sequential, the recurrent neural network (RNN) has attracted significant interest due to its efficiency in dealing with such data. This paper systematically reviews the RNN and its variants for RUL prediction, with a specific focus on understanding how different components (e.g., types of optimisers and activation functions) or parameters (e.g., sequence length, neuron quantities) affect their performance. After that, a case study using the well-studied NASA C-MAPSS dataset is presented to quantitatively evaluate the influence of various state-of-the-art RNN structures on the RUL prediction performance. The results suggest that the variant methods usually perform better than the original RNN, and among them, Bi-directional Long Short-Term Memory generally has the best performance in terms of stability, precision and accuracy. Certain model structures may fail to produce valid RUL predictions due to the vanishing or exploding gradient problem if the parameters are not chosen appropriately. It is concluded that parameter tuning is a crucial step to achieve optimal prediction performance.
Drug-target interaction prediction (DTIP) remains an important requirement in the field of drug discovery and human medicine. Identifying the interaction between a drug compound and a target protein plays an essential role in the drug discovery process. Predicting the drug-target interaction (DTI) with experimental approaches is a lengthy and complex process. To resolve these issues, computational intelligence based DTIP techniques were developed to offer an efficient predictive model at low cost. Recently developed deep learning (DL) models can be employed to design effective predictive approaches for DTIP. With this motivation, this paper presents a new drug-target interaction prediction using optimal recurrent neural network (DTIP-ORNN) technique. The goal of the DTIP-ORNN technique is to predict DTIs in a semi-supervised way, i.e., including both labelled and unlabelled instances. Initially, the DTIP-ORNN technique performs a data preparation process that also includes class labelling, where the target interactions from the database are used to determine the final label of the unlabelled instances. In addition, drug-to-drug (D-D) and target-to-target (T-T) interactions are used for the weight initialization of the RNN-based bidirectional long short-term memory (BiLSTM) model, which is then used to predict DTIs. Since hyperparameters significantly affect the prediction performance of the BiLSTM technique, the Adam optimizer is used, which mainly helps to improve the DTI prediction outcomes. To verify the enhanced predictive outcomes of the DTIP-ORNN technique, a series of simulations is carried out on four benchmark datasets. The comparative result analysis shows the promising performance of the DTIP-ORNN method over recent approaches.
Communication is a significant part of being human and living in the world. There are diverse kinds of languages and many variations of each; thus, a person who speaks a language may still be unable to communicate effectively with someone who speaks that language in a different accent. Numerous application fields such as education, mobility, smart systems, security, and health care systems make abundant use of speech or voice recognition models. However, many studies focus on Arabic, Asian, and English languages while ignoring other significant languages like Marathi, which motivates broader research in regional languages. It is necessary to understand the speech recognition field, in which the major stages of interest are feature extraction and classification. This paper focuses on developing a speech recognition model for the Marathi language by optimizing a Recurrent Neural Network (RNN). Here, the input signal is preprocessed by smoothing and median filtering. After preprocessing, feature extraction is carried out using MFCC and spectral features to obtain precise features from the input Marathi speech corpus. The optimized RNN classifier is used for speech recognition after the feature extraction task is complete, where the optimization of hidden neurons in the RNN is performed by the Grasshopper Optimization Algorithm (GOA). Finally, the comparison with conventional techniques shows that the proposed model outperforms most competing models on a benchmark dataset.
In recent years, wearable device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, leading to the extraction of basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impact data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. Firstly, the required signals and sensor data are accumulated from the standard databases. From these signals, the wave features are retrieved. Then the extracted wave features and sensor data are given as the input to recognize the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by incorporating a “1D Convolutional Neural Network (1DCNN)” with a “Gated Recurrent Unit (GRU)” for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is suggested to fine-tune the network parameters for enhancing the recognition process. An experimental evaluation is performed on various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36, a recall of 95.25, a specificity of 95.48, and a precision of 95.47. The results show that the developed model recognizes human actions effectively while taking less time. Additionally, it reduces the computational complexity and the overfitting issue by using an optimization approach.
For training present-day Neural Network (NN) models, the standard technique is to utilize decaying Learning Rates (LR). The majority of these techniques commence with a large LR and decay it multiple times over time. Decaying has been shown to enhance generalization as well as optimization. Other parameters, such as the network’s size, the number of hidden layers, dropout to avoid overfitting, batch size, and so on, are set solely on the basis of heuristics. This work proposes the Adaptive Teaching Learning Based (ATLB) heuristic to identify the optimal hyperparameters for diverse networks. Here we consider three Deep Neural Network architectures for classification: Recurrent Neural Networks (RNN), Long Short Term Memory (LSTM), and Bidirectional Long Short Term Memory (BiLSTM). The proposed ATLB is evaluated with several learning rate schedulers: Cyclical Learning Rate (CLR), Hyperbolic Tangent Decay (HTD), and Toggle between Hyperbolic Tangent Decay and Triangular mode with Restarts (T-HTR). Experimental results show a performance improvement on the 20Newsgroup, Reuters Newswire and IMDB datasets.
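Two of the schedulers named in this abstract have simple closed forms, sketched below as they are commonly defined in the literature. The abstract gives no formulas or settings, so the parameter names (`half_cycle`, `lower`, `upper`) and the default bounds of -6 and 3 are assumptions for illustration; T-HTR is not sketched because the abstract does not describe its switching rule.

```python
import math


def triangular_clr(step, base_lr, max_lr, half_cycle):
    """Triangular cyclical learning rate.

    The rate rises linearly from base_lr to max_lr over `half_cycle`
    steps, then falls back to base_lr over the next `half_cycle` steps.
    """
    cycle = math.floor(1 + step / (2 * half_cycle))
    x = abs(step / half_cycle - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


def hyperbolic_tangent_decay(step, total_steps, initial_lr,
                             lower=-6.0, upper=3.0):
    """Hyperbolic tangent decay.

    The tanh argument moves linearly from `lower` to `upper` over
    training, so the rate stays near initial_lr early on and then
    decays smoothly towards a small value. The bounds are assumed.
    """
    progress = step / total_steps
    return initial_lr / 2.0 * (1.0 - math.tanh(lower + (upper - lower) * progress))


# One full triangular cycle of 200 steps between 0.001 and 0.006.
clr_rates = [triangular_clr(s, 0.001, 0.006, 100) for s in (0, 50, 100, 200)]
# HTD over 100 steps starting from 0.1.
htd_rates = [hyperbolic_tangent_decay(s, 100, 0.1) for s in (0, 50, 100)]
```

The cyclical schedule revisits large rates periodically, whereas the tanh schedule decreases monotonically.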
Currently, Bitcoin is the world’s most popular cryptocurrency. The price of Bitcoin is extremely volatile, which can be described as high-benefit and high-risk. To minimize the risk involved, a means of more accurately predicting the Bitcoin price is required. Most of the existing studies of Bitcoin prediction are based on historical (i.e., benchmark) data, without considering the real-time (i.e., live) data. To mitigate the issue of price volatility and achieve more precise outcomes, this study suggests using historical and real-time data to predict the Bitcoin candlestick (open, high, low, and close; OHLC) prices. Seeking a better prediction model, the present study proposes time series-based deep learning models. In particular, two deep learning algorithms were applied, namely, long short-term memory (LSTM) and gated recurrent unit (GRU). Using real-time data, the Bitcoin candlesticks were predicted for three intervals: the next 4 h, the next 12 h, and the next 24 h. The results showed that the best-performing model was the LSTM-based model with the 4-h interval. In particular, this model achieved a stellar performance with a mean absolute percentage error (MAPE) of 0.63, a root mean square error (RMSE) of 0.0009, a mean square error (MSE) of 9e-07, a mean absolute error (MAE) of 0.0005, and an R-squared coefficient (R2) of 0.994. With these results, the proposed prediction model has demonstrated its efficiency over the models proposed in previous studies. The findings of this study have considerable implications in the business field, as the proposed model can assist investors and traders in precisely identifying Bitcoin sales and buying opportunities.
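The five error measures reported in this abstract follow standard definitions, which the sketch below computes for a pair of sequences. The function name `regression_metrics` and the choice to express MAPE in percent are assumptions; the abstract does not say how its prices were scaled or whether its MAPE of 0.63 is a percentage. Since RMSE is the square root of MSE, the reported 0.0009 and 9e-07 are consistent with each other up to rounding.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, MAPE (in percent) and R^2 for two sequences.

    Assumes y_true contains no zeros (needed for MAPE) and is not
    constant (needed for R^2).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape, "r2": r2}


# Toy values chosen for easy checking, not taken from the study.
metrics = regression_metrics([100.0, 200.0, 300.0, 400.0],
                             [110.0, 190.0, 310.0, 390.0])
```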
Currently, mobile communication is one of the most widely used means of communication. Nevertheless, it is quite challenging for a telecommunication company to attract new customers. The recent concept of mobile number portability has also aggravated the problem of customer churn. Companies need to identify in advance the customers who could potentially churn to competitors. In the telecommunication industry, such identification can be done based on call detail records. This research presents an extensive experimental study based on various deep learning models, such as the 1D convolutional neural network (CNN) model along with the recurrent neural network (RNN) and deep neural network (DNN), for churn prediction. We use the mobile telephony churn prediction dataset obtained from customers-dna.com, containing the data for around 100,000 individuals, of which 86,000 are non-churners and 14,000 are churned customers. The imbalanced data are handled using undersampling and oversampling. The accuracy for CNN, RNN, and DNN is 91%, 93%, and 96%, respectively. Furthermore, the DNN achieved 99% for ROC.
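The class balancing described in this abstract can be sketched with plain random resampling. The abstract does not state which undersampling or oversampling technique was used, so random sampling is an assumption here, and the function names and toy class sizes are hypothetical.

```python
import random


def random_undersample(majority, minority, seed=0):
    """Shrink the majority class to the minority size, sampling without replacement."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)


def random_oversample(majority, minority, seed=0):
    """Grow the minority class to the majority size by drawing duplicates."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


# Toy data mirroring the 86:14 ratio of non-churners to churners.
non_churners = [("non_churner", i) for i in range(86)]
churners = [("churner", i) for i in range(14)]

under_major, under_minor = random_undersample(non_churners, churners)
over_major, over_minor = random_oversample(non_churners, churners)
```

Resampling is normally applied to the training split only, so that the test set keeps the original class ratio.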
A recent trend in machine learning is to use deep architectures to discover multiple levels of features from data, which has achieved impressive results on various natural language processing (NLP) tasks. We propose a deep neural network-based solution to Chinese semantic role labeling (SRL) with its application on message analysis. The solution adopts a six-step strategy: text normalization, named entity recognition (NER), Chinese word segmentation and part-of-speech (POS) tagging, theme classification, SRL, and slot filling. For each step, a novel deep neural network-based model is designed and optimized, particularly for smart phone applications. Experiment results on all the NLP sub-tasks of the solution show that the proposed neural networks achieve state-of-the-art performance with the minimal computational cost. The speed advantage of deep neural networks makes them more competitive for large-scale applications or applications requiring real-time response, highlighting the potential of the proposed solution for practical NLP systems.
In recent years, machine learning algorithms, and deep learning in particular, have shown promising results in the legal domain. The legal field is strongly affected by the problem of information overload, due to the large amount of legal material stored in textual form. Legal text processing is essential in the legal domain to analyze the texts of court events and automatically predict smart decisions. With an increasing number of digitally available documents, legal text processing is essential to analyze documents, which helps to automate various legal domain tasks. Legal document classification is a valuable tool in legal services for enhancing the quality and efficiency of legal document review. In this paper, we propose the Sammon Keyword Mapping-based Quadratic Discriminant Recurrent Multilayer Perceptive Deep Neural Classifier (SKM-QDRMPDNC), a system that applies deep neural methods to the problem of legal document classification. The SKM-QDRMPDNC technique consists of many layers that perform keyword extraction and classification. First, the set of legal documents is collected from the dataset. Then keyword extraction is performed using the Sammon Mapping technique based on the distance measure. With the extracted features, Quadratic Discriminant analysis is applied to perform the document classification based on the likelihood ratio test. Finally, the classified legal documents are obtained at the output layer. This process is repeated until the minimum error is attained. The experimental assessment is carried out using various performance metrics such as accuracy, precision, recall, F-measure, and computational time, based on several legal documents collected from the dataset. The observed results validate that the proposed SKM-QDRMPDNC technique provides improved performance, achieving higher accuracy, precision, recall, and F-measure with minimum computation time when compared to existing methods.
In this paper we present a CNN-based approach for real-time 3D hand pose estimation from a depth sequence. Prior discriminative approaches have achieved remarkable success but face two main challenges. Firstly, the methods are fully supervised and hence require large amounts of annotated training data to extract the dynamic information from a hand representation. Secondly, they rely on hand detectors that are either based on strong assumptions or weak, and that often fail in situations such as complex environments and multiple hands. In contrast to these methods, this paper presents an approach that can be considered semi-supervised: it performs predictive coding of image sequences of hand poses in order to capture latent features underlying a given image without supervision. The hand is modelled using a novel latent tree dependency model (LDTM) which transforms internal joint locations to an explicit representation. Then the modelled hand topology is integrated with the pose estimator using a data-dependent method to jointly learn latent variables of the posterior pose appearance and the pose configuration, respectively. Finally, an unsupervised error term which is part of the recurrent architecture ensures smooth estimations of the final pose. Experiments on three challenging public datasets, ICVL, MSRA, and NYU, demonstrate that the performance of the proposed method is comparable to or better than state-of-the-art approaches.
There are many techniques using sensors and wearable devices for detecting and monitoring patients with Parkinson’s disease (PD). A recent development is the utilization of human interaction with computer keyboards for analyzing and identifying motor signs in the early stages of the disease. Current designs for classification of time series of computer-key hold durations recorded from healthy control and PD subjects require considerably long time series. In an attempt to avoid discomfort to participants in performing long physical tasks for data recording, this paper introduces the use of fuzzy recurrence plots of very short time series as input data for machine training and classification with long short-term memory (LSTM) neural networks. As an original approach that both significantly increases the feature dimensions and provides the property of deterministic dynamical systems of very short time series for information processing carried out by an LSTM layer architecture, fuzzy recurrence plots provide promising results and outperform the direct input of the time series for the classification of healthy control and early PD subjects.
Considering the recent developments in deep learning, it has become increasingly important to verify which methods are valid for the prediction of multivariate time-series data. In this study, we propose a novel method of time-series prediction employing multiple deep learners combined with a Bayesian network, where training data is divided into clusters using K-means clustering. We chose the best number of clusters for K-means using the Bayesian information criterion. A separate deep learner is trained for each cluster. We used three types of deep learners: deep neural network (DNN), recurrent neural network (RNN), and long short-term memory (LSTM). A naive Bayes classifier is used to determine which deep learner is in charge of predicting a particular time series. Our proposed method is applied to a set of financial time-series data, the Nikkei Average Stock price, to assess the accuracy of the predictions made. Compared with the conventional method of employing a single deep learner trained on all the data, our proposed method improves the F-value and accuracy.
Weather radar echo extrapolation plays a crucial role in weather forecasting. However, traditional weather radar echo extrapolation methods are not very accurate and do not make full use of historical data. Deep learning algorithms based on Recurrent Neural Networks also have the problem of accumulating errors. Moreover, it is difficult to obtain higher accuracy by relying on a single historical radar echo observation. Therefore, in this study, we constructed the Fusion GRU module, which leverages a cascade structure to effectively combine radar echo data and mean wind data. We also designed the Top Connection so that the model can capture the global spatial relationship to construct constraints on the predictions. Based on the Jiangsu Province dataset, we compared some models. The results show that our proposed model, Cascade Fusion Spatiotemporal Network (CFSN), improved the critical success index (CSI) by 10.7% over the baseline at the threshold of 30 dBZ. Ablation experiments further validated the effectiveness of our model. Similarly, the CSI of the complete CFSN was 0.004 higher than the suboptimal solution without the cross-attention module at the threshold of 30 dBZ.
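The critical success index quoted in this abstract is conventionally defined as hits / (hits + misses + false alarms) after thresholding the observed and predicted reflectivity. The sketch below applies that definition to flat lists of dBZ values; treating a value equal to the threshold as an event (`>=`) and returning NaN when no event occurs are assumptions, as is the function name.

```python
def critical_success_index(observed, predicted, threshold=30.0):
    """CSI for reflectivity values in dBZ at a given threshold."""
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_event = obs >= threshold
        pred_event = pred >= threshold
        if obs_event and pred_event:
            hits += 1            # event observed and forecast
        elif obs_event:
            misses += 1          # event observed but not forecast
        elif pred_event:
            false_alarms += 1    # event forecast but not observed
    denominator = hits + misses + false_alarms
    return hits / denominator if denominator else float("nan")


# Toy pixels: two hits, one miss, one false alarm, one correct negative.
csi = critical_success_index([35.0, 10.0, 32.0, 28.0, 40.0],
                             [33.0, 31.0, 20.0, 25.0, 45.0])
```

Correct negatives do not enter the score, which is why CSI suits rare events such as echoes above 30 dBZ.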
Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, grappling with resource-poor languages such as Urdu remains a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. Subsequent to this data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models’ efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, achieving a significantly higher accuracy rate of 91%. This result accentuates the exceptional performance of RNN, solidifying its status as a compelling option for conducting sentiment analysis tasks in the Urdu language.
The battery thermal management of electric vehicles can be improved using neural networks predicting quantile sequences of the battery temperature. This work extends a method for the development of Quantile Convolutional and Quantile Recurrent Neural Networks (namely Q*NN). Fleet data of 225629 drives are clustered and balanced, and simulation data from 971 simulations are augmented, before they are combined for training and testing. The Q*NN hyperparameters are optimized using an efficient Bayesian optimization, before the Q*NN models are compared with regression and quantile regression models for four horizons. The analysis of point-forecast and quantile-related metrics shows the superior performance of the novel Q*NN models. The median predictions of the best performing model achieve an average RMSE of 0.66 °C and R^2 of 0.84. The predicted 0.99 quantile covers 98.87% of the true values in the test data. In conclusion, this work proposes an extended development and comparison of Q*NN models for accurate battery temperature prediction.
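Quantile forecasts of the kind described in this abstract are commonly trained with the pinball (quantile) loss and checked by empirical coverage, i.e. the share of true values at or below the predicted quantile. The abstract does not name its loss function, so the pinball loss is an assumption here, and the function names and toy temperatures are hypothetical.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball loss of predictions for a single quantile level in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


def empirical_coverage(y_true, y_pred):
    """Fraction of true values that lie at or below the predicted quantile."""
    covered = sum(1 for t, p in zip(y_true, y_pred) if t <= p)
    return covered / len(y_true)


# Toy battery temperatures in degrees Celsius, not taken from the study.
true_temps = [30.0, 32.0, 35.0, 40.0]
predicted_q90 = [31.0, 33.0, 34.0, 41.0]

loss = pinball_loss(true_temps, predicted_q90, 0.9)
coverage = empirical_coverage(true_temps, predicted_q90)
```

For a well-calibrated model the coverage of the q-quantile is close to q, which is how the reported 98.87% for the 0.99 quantile can be read.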
This article presents an exhaustive comparative investigation into the accuracy of gender identification across diverse geographical regions,employing a deep learning classification algorithm for speech signal analysi...This article presents an exhaustive comparative investigation into the accuracy of gender identification across diverse geographical regions,employing a deep learning classification algorithm for speech signal analysis.In this study,speech samples are categorized for both training and testing purposes based on their geographical origin.Category 1 comprises speech samples from speakers outside of India,whereas Category 2 comprises live-recorded speech samples from Indian speakers.Testing speech samples are likewise classified into four distinct sets,taking into consideration both geographical origin and the language spoken by the speakers.Significantly,the results indicate a noticeable difference in gender identification accuracy among speakers from different geographical areas.Indian speakers,utilizing 52 Hindi and 26 English phonemes in their speech,demonstrate a notably higher gender identification accuracy of 85.75%compared to those speakers who predominantly use 26 English phonemes in their conversations when the system is trained using speech samples from Indian speakers.The gender identification accuracy of the proposed model reaches 83.20%when the system is trained using speech samples from speakers outside of India.In the analysis of speech signals,Mel Frequency Cepstral Coefficients(MFCCs)serve as relevant features for the speech data.The deep learning classification algorithm utilized in this research is based on a Bidirectional Long Short-Term Memory(BiLSTM)architecture within a Recurrent Neural Network(RNN)model.展开更多
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry, and Energy, Korea (No. 20204010600090).
Abstract: This research aims to enhance Clinical Decision Support Systems (CDSS) within Wireless Body Area Networks (WBANs) by leveraging advanced machine learning techniques. Specifically, we target the challenges of accurate diagnosis in medical imaging and sequential data analysis using Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) layers and echo state cells. These models are tailored to improve diagnostic precision, particularly for conditions like rotator cuff tears in osteoporosis patients and gastrointestinal diseases. Traditional diagnostic methods and existing CDSS frameworks often fall short in managing complex, sequential medical data: they struggle with long-term dependencies and data imbalances, resulting in suboptimal accuracy and delayed decisions. Our goal is to develop Artificial Intelligence (AI) models that address these shortcomings and offer robust, real-time diagnostic support. We propose a hybrid RNN model that integrates SimpleRNN, LSTM layers, and echo state cells to manage long-term dependencies effectively. Additionally, we introduce CG-Net, a novel Convolutional Neural Network (CNN) framework for gastrointestinal disease classification, which outperforms traditional CNN models. We further enhance model performance through data augmentation and transfer learning, improving generalization and robustness against data scarcity and imbalance. Comprehensive validation, including 5-fold cross-validation and metrics such as accuracy, precision, recall, F1-score, and Area Under the Curve (AUC), confirms the models' reliability. Moreover, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) are employed to improve model interpretability. Our findings show that the proposed models significantly enhance diagnostic accuracy and efficiency, offering substantial advancements in WBANs and CDSS.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61762005, 61231015, 61671335, 61702472, 61701194, 61761044, and 61471271; the National High Technology Research and Development Program of China (863 Program) under Grant No. 2015AA016306; the Hubei Province Technological Innovation Major Project under Grant No. 2016AAA015; the Science Project of the Education Department of Jiangxi Province under No. GJJ150585; and the Opening Project of the Collaborative Innovation Center for Economics Crime Investigation and Prevention Technology, Jiangxi Province, under Grant No. JXJZXTCX-025.
Abstract: Non-blind audio bandwidth extension is a standard technique within contemporary audio codecs for efficiently coding audio signals at low bitrates. In most existing methods, the high-frequency signal is generated by duplicating the corresponding low frequencies and applying some high-frequency parameters. However, the perceptual quality of the coded signal degrades significantly when the correlation between high and low frequencies becomes weak. In this paper, we quantitatively analyse this correlation by computing mutual information. The analysis shows that the correlation exists not only with the low-frequency signal of the current frame but also with that of the context-dependent frames. To improve the perceptual quality of coding, we propose a novel method of coarse high-frequency spectrum generation that improves on the conventional replication method. In the proposed method, the coarse high-frequency spectra are generated by a nonlinear mapping model based on a deep recurrent neural network. Experiments confirm that the proposed method performs better than the reference methods.
Funding: Supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI (JP22H03643); the Japan Science and Technology Agency (JST) Support for Pioneering Research Initiated by the Next Generation (SPRING) (JPMJSP2145); JST through the Establishment of University Fellowships Towards the Creation of Science Technology Innovation (JPMJFS2115); the National Natural Science Foundation of China (52078382); and the State Key Laboratory of Disaster Reduction in Civil Engineering (CE19-A-01).
Abstract: Accurately predicting fluid forces acting on the surface of a structure is crucial in engineering design. However, this task becomes particularly challenging in turbulent flow because of the complex and irregular changes in the flow field. In this study, we propose a novel deep learning method, named mapping network-coordinated stacked gated recurrent units (MSU), for predicting pressure on a circular cylinder from velocity data. Specifically, our coordinated learning strategy is designed to extract the most critical velocity point for prediction, a process that has not been explored before. In our experiments, MSU extracts one point from a velocity field containing 121 points and uses this point to accurately predict 100 pressure points on the cylinder. This method significantly reduces the data measurement workload in practical engineering applications. Our experimental results demonstrate that MSU predictions are highly similar to the real turbulent data in both spatio-temporal and individual aspects. Furthermore, the comparison results show that MSU predicts more precisely, even outperforming models that use all velocity field points. Compared with state-of-the-art methods, MSU achieves an average improvement of more than 45% in various indicators such as root mean square error (RMSE). Through comprehensive physical verification, we established that MSU's predictions closely align with pressure field data obtained in real turbulence fields. This confirmation underscores the considerable potential of MSU for practical applications in real engineering scenarios. The code is available at https://github.com/zhangzm0128/MSU.
Abstract: The regulatory role of microRNAs (miRNAs) in messenger RNA (mRNA) gene expression has been well understood by biologists for some decades, although specific aspects are still being investigated. Clustering is a cornerstone of bioinformatics research, offering a potent computational tool for analyzing the diverse types of data encountered in genomics and related fields. miRNA clustering plays a pivotal role in deciphering the intricate regulatory roles of miRNAs in biological systems. It uncovers novel biomarkers for disease diagnosis and prognosis and advances our understanding of gene regulatory networks and pathways implicated in health and disease, as well as drug discovery. Specifically, we implemented a clustering procedure to find interrelations among miRNAs within clusters and their relations to diseases. Deep clustering (DC) algorithms signify a departure from traditional clustering methods towards more sophisticated techniques that can uncover intricate patterns and relationships within gene expression data. Deep learning (DL) models have shown remarkable success in various domains, and their application in genomics, especially for tasks like clustering, holds immense promise. The deep convolutional clustering procedure used here differs from traditional methods and demonstrates unbiased clustering results. In the paper, we apply the procedure to a Multiple Myeloma miRNA dataset publicly available on the GEO platform, as a template for the analysis of a cancer instance, and raise some biological questions.
Funding: Supported by U.K. EPSRC Platform Grant (Grant No. EP/P027121/1).
Abstract: The remaining useful life (RUL) of a system is generally predicted using data collected from sensors that continuously monitor different indicators. Recently, different deep learning (DL) techniques have been used for RUL prediction and have achieved great success. Because the data is often time-sequential, the recurrent neural network (RNN) has attracted significant interest owing to its efficiency in dealing with such data. This paper systematically reviews the RNN and its variants for RUL prediction, with a specific focus on understanding how different components (e.g., types of optimisers and activation functions) or parameters (e.g., sequence length, neuron quantities) affect their performance. After that, a case study using NASA's well-studied C-MAPSS dataset is presented to quantitatively evaluate the influence of various state-of-the-art RNN structures on RUL prediction performance. The results suggest that the variant methods usually perform better than the original RNN and that, among them, Bi-directional Long Short-Term Memory generally has the best performance in terms of stability, precision and accuracy. Certain model structures may fail to produce valid RUL predictions because of the vanishing or exploding gradient problem if the parameters are not chosen appropriately. It is concluded that parameter tuning is a crucial step in achieving optimal prediction performance.
Abstract: Drug-target interaction prediction (DTIP) remains an important requirement in the field of drug discovery and human medicine. Identifying the interaction between a drug compound and a target protein is an essential step in the drug discovery process. Predicting drug-target interactions (DTIs) with experimental approaches is lengthy and complex. To resolve these issues, computational intelligence based DTIP techniques were developed to offer an efficient predictive model at low cost. Recently developed deep learning (DL) models can be employed to design effective predictive approaches for DTIP. With this motivation, this paper presents a new drug-target interaction prediction using optimal recurrent neural network (DTIP-ORNN) technique. The goal of the DTIP-ORNN technique is to predict DTIs in a semi-supervised way, i.e., using both labelled and unlabelled instances. Initially, the DTIP-ORNN technique performs data preparation and class labelling, where the target interactions from the database are used to determine the final label of the unlabelled instances. In addition, drug-to-drug (D-D) and target-to-target (T-T) interactions are used for weight initialization of the RNN-based bidirectional long short-term memory (BiLSTM) model, which is then used to predict DTIs. Since hyperparameters significantly affect the prediction performance of the BiLSTM technique, the Adam optimizer is used, which mainly helps to improve the DTI prediction outcomes. To verify the enhanced predictive outcomes of the DTIP-ORNN technique, a series of simulations is carried out on four benchmark datasets. The comparative analysis shows the promising performance of the DTIP-ORNN method over recent approaches.
Funding: Taif University Researchers Supporting Project number (TURSP-2020/349), Taif University, Taif, Saudi Arabia.
Abstract: Communication is a significant part of being human and living in the world. Many languages and language variations exist; thus, a person who speaks a language may still be unable to communicate effectively with someone who speaks it with a different accent. Numerous application fields, such as education, mobility, smart systems, security, and health care systems, make abundant use of speech or voice recognition models. However, many studies focus on Arabic, Asian, and English languages while ignoring other significant languages such as Marathi, which motivates broader research on regional languages. It is necessary to understand the speech recognition field, in which the main stages are feature extraction and classification. This paper focuses on developing a speech recognition model for the Marathi language by optimizing a Recurrent Neural Network (RNN). Here, the input signal is preprocessed by smoothing and median filtering. After preprocessing, feature extraction is carried out using MFCC and spectral features to obtain precise features from the input Marathi speech corpus. The optimized RNN classifier is used for speech recognition after feature extraction, where the number of hidden neurons in the RNN is optimized by the Grasshopper Optimization Algorithm (GOA). Finally, a comparison with conventional techniques shows that the proposed model outperforms most competing models on a benchmark dataset.
Abstract: In recent years, wearable device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, which leads to the extraction of only basic features. The images captured by wearable sensors contain advanced features that deep learning algorithms can analyze to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can impair data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. First, the required signals and sensor data are gathered from standard databases. From these signals, the wave features are retrieved. Then the extracted wave features and sensor data are given as input to recognize the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by combining a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is proposed to fine-tune the network parameters and enhance the recognition process. An experimental evaluation against various deep learning networks and heuristic algorithms is performed to confirm the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36, a recall of 95.25, a specificity of 95.48, and a precision of 95.47. The results show that the developed model recognizes human actions effectively while taking less time. Additionally, it reduces computational complexity and overfitting through the use of an optimization approach.
Abstract: The standard technique for training present-day Neural Network (NN) models is to use decaying Learning Rates (LR). Most of these techniques start with a large LR and decay it multiple times over the course of training. Decaying has been shown to enhance both generalization and optimization. Other hyperparameters, such as the network's size, the number of hidden layers, dropout to avoid overfitting, batch size, and so on, are chosen solely on the basis of heuristics. This work proposes an Adaptive Teaching Learning Based (ATLB) heuristic to identify the optimal hyperparameters for diverse networks. Here we consider three Deep Neural Network architectures for classification: Recurrent Neural Networks (RNN), Long Short Term Memory (LSTM), and Bidirectional Long Short Term Memory (BiLSTM). The proposed ATLB is evaluated with several learning rate schedulers: Cyclical Learning Rate (CLR), Hyperbolic Tangent Decay (HTD), and Toggle between Hyperbolic Tangent Decay and Triangular mode with Restarts (T-HTR). Experimental results show performance improvements on the 20Newsgroup, Reuters Newswire and IMDB datasets.
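For readers unfamiliar with cyclical learning rates, the following is a minimal sketch of the commonly used triangular policy. The abstract does not specify which CLR variant or which bounds were used, so the function name `triangular_clr`, the bounds, and the step size are illustrative assumptions, not the authors' configuration.

```python
import math


def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Triangular cyclical learning rate (hypothetical helper, illustrative only).

    The rate rises linearly from base_lr to max_lr over step_size iterations,
    then falls back to base_lr over the next step_size iterations, and repeats.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    # x is the normalized distance from the peak of the current cycle (0 at the peak).
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Assumed bounds and step size, chosen only to make one full cycle visible.
schedule = [
    triangular_clr(i, base_lr=0.001, max_lr=0.006, step_size=4)
    for i in range(9)
]
```

With these assumed values the schedule starts at the lower bound, peaks at iteration 4, and returns to the lower bound at iteration 8.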
Abstract: Currently, Bitcoin is the world's most popular cryptocurrency. The price of Bitcoin is extremely volatile, which can be described as high-benefit and high-risk. To minimize the risk involved, a means of predicting the Bitcoin price more accurately is required. Most existing studies of Bitcoin prediction are based on historical (i.e., benchmark) data and do not consider real-time (i.e., live) data. To mitigate the issue of price volatility and achieve more precise outcomes, this study suggests using historical and real-time data to predict the Bitcoin candlestick prices, i.e., the open, high, low, and close (OHLC) prices. Seeking a better prediction model, the present study proposes time series-based deep learning models. In particular, two deep learning algorithms were applied, namely, long short-term memory (LSTM) and gated recurrent unit (GRU). Using real-time data, the Bitcoin candlesticks were predicted for three intervals: the next 4 h, the next 12 h, and the next 24 h. The results showed that the best-performing model was the LSTM-based model with the 4-h interval. In particular, this model achieved a mean absolute percentage error (MAPE) of 0.63, a root mean square error (RMSE) of 0.0009, a mean square error (MSE) of 9e-07, a mean absolute error (MAE) of 0.0005, and an R-squared coefficient (R2) of 0.994. With these results, the proposed prediction model has demonstrated its efficiency over the models proposed in previous studies. The findings of this study have considerable implications in the business field, as the proposed model can assist investors and traders in precisely identifying Bitcoin selling and buying opportunities.
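The five reported metrics can be computed as in the sketch below, which uses their standard textbook definitions. The abstract does not say whether MAPE is expressed as a percentage or how the prices were scaled, so the percentage convention and the toy values here are assumptions. Note that RMSE is the square root of MSE, so the reported RMSE of 0.0009 and MSE of 9e-07 are consistent up to rounding.

```python
import math


def regression_metrics(y_true, y_pred):
    """Standard point-forecast metrics (illustrative helper, not the authors' code)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE as a percentage; assumes no true value is zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape, "r2": r2}


# Toy values for illustration only; these are not data from the study.
metrics = regression_metrics([1.0, 2.0, 4.0], [1.0, 2.5, 3.5])
```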
Abstract: Currently, mobile communication is one of the most widely used means of communication. Nevertheless, it is quite challenging for a telecommunication company to attract new customers. The recent concept of mobile number portability has also aggravated the problem of customer churn. Companies need to identify in advance the customers who could potentially churn to competitors. In the telecommunication industry, such identification can be done based on call detail records. This research presents an extensive experimental study based on various deep learning models, namely the 1D convolutional neural network (CNN), the recurrent neural network (RNN), and the deep neural network (DNN), for churn prediction. We use the mobile telephony churn prediction dataset obtained from customers-dna.com, containing data for around 100,000 individuals, of whom 86,000 are non-churners and 14,000 are churned customers. The imbalanced data are handled using undersampling and oversampling. The accuracy for CNN, RNN, and DNN is 91%, 93%, and 96%, respectively. Furthermore, DNN achieved 99% for ROC.
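The abstract does not state which undersampling and oversampling techniques were applied. As a minimal sketch, assuming plain random resampling, the two ideas look like this; the helper names and the scaled-down 86/14 toy dataset are hypothetical.

```python
import random


def random_oversample(majority, minority, seed=0):
    """Duplicate randomly chosen minority samples until both classes are equal in size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra


def random_undersample(majority, minority, seed=0):
    """Keep a random subset of the majority class the same size as the minority class."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), minority


# Toy dataset mirroring the 86% / 14% class split at 1/1000 scale (illustrative only).
non_churners = [("customer", i, 0) for i in range(86)]
churners = [("customer", i, 1) for i in range(86, 100)]

over_major, over_minor = random_oversample(non_churners, churners)
under_major, under_minor = random_undersample(non_churners, churners)
```

Oversampling keeps every majority record but repeats minority records, while undersampling discards majority records; either resampling should be applied to the training split only so that evaluation data keeps the original class ratio.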
Abstract: A recent trend in machine learning is to use deep architectures to discover multiple levels of features from data, which has achieved impressive results on various natural language processing (NLP) tasks. We propose a deep neural network-based solution to Chinese semantic role labeling (SRL) and apply it to message analysis. The solution adopts a six-step strategy: text normalization, named entity recognition (NER), Chinese word segmentation and part-of-speech (POS) tagging, theme classification, SRL, and slot filling. For each step, a novel deep neural network-based model is designed and optimized, particularly for smartphone applications. Experimental results on all the NLP sub-tasks of the solution show that the proposed neural networks achieve state-of-the-art performance at minimal computational cost. The speed advantage of deep neural networks makes them more competitive for large-scale applications or applications requiring real-time response, highlighting the potential of the proposed solution for practical NLP systems.
Abstract: In recent years, machine learning algorithms, and deep learning in particular, have shown promising results in the legal domain. The legal field is strongly affected by the problem of information overload because of the large amount of legal material stored in textual form. Legal text processing is essential in the legal domain for analyzing the texts of court events and automatically predicting smart decisions. With an increasing number of digitally available documents, legal text processing is also essential for analyzing documents, which helps to automate various legal domain tasks. Legal document classification is a valuable tool in legal services for enhancing the quality and efficiency of legal document review. In this paper, we propose the Sammon Keyword Mapping-based Quadratic Discriminant Recurrent Multilayer Perceptive Deep Neural Classifier (SKM-QDRMPDNC), a system that applies deep neural methods to the problem of legal document classification. The SKM-QDRMPDNC technique consists of several layers that perform keyword extraction and classification. First, the set of legal documents is collected from the dataset. Then keyword extraction is performed using the Sammon Mapping technique based on a distance measure. With the extracted features, Quadratic Discriminant analysis is applied to perform document classification based on the likelihood ratio test. Finally, the classified legal documents are obtained at the output layer. This process is repeated until minimum error is attained. The experimental assessment is carried out using performance metrics such as accuracy, precision, recall, F-measure, and computational time on several legal documents collected from the dataset. The observed results show that the proposed SKM-QDRMPDNC technique achieves higher accuracy, precision, recall, and F-measure with lower computation time than existing methods.
Funding: Supported in part by the Fundamental Research Funds for the Central Universities (WK2350000002).
Abstract: In this paper we present a CNN-based approach for real-time 3D hand pose estimation from a depth sequence. Prior discriminative approaches have achieved remarkable success but face two main challenges. First, the methods are fully supervised and hence require large amounts of annotated training data to extract the dynamic information from a hand representation. Second, they rely on unreliable hand detectors based on strong assumptions, or on weak detectors that often fail in situations such as complex environments and multiple hands. In contrast to these methods, this paper presents an approach that can be considered semi-supervised: it performs predictive coding of image sequences of hand poses in order to capture the latent features underlying a given image without supervision. The hand is modelled using a novel latent tree dependency model (LDTM), which transforms internal joint locations into an explicit representation. The modelled hand topology is then integrated with the pose estimator using a data-dependent method to jointly learn latent variables of the posterior pose appearance and the pose configuration, respectively. Finally, an unsupervised error term that is part of the recurrent architecture ensures smooth estimates of the final pose. Experiments on three challenging public datasets, ICVL, MSRA, and NYU, demonstrate that the proposed method performs comparably to or better than state-of-the-art approaches.
Abstract: There are many techniques that use sensors and wearable devices for detecting and monitoring patients with Parkinson's disease (PD). A recent development is the use of human interaction with computer keyboards for analyzing and identifying motor signs in the early stages of the disease. Current designs for classifying time series of computer-key hold durations recorded from healthy control and PD subjects require considerably long time series. To avoid the discomfort that long physical tasks for data recording cause participants, this paper introduces the use of fuzzy recurrence plots of very short time series as input data for training and classification with long short-term memory (LSTM) neural networks. As an original approach that both significantly increases the feature dimensions and captures the deterministic dynamical properties of very short time series for processing by an LSTM layer architecture, fuzzy recurrence plots provide promising results and outperform the direct input of the time series for the classification of healthy control and early PD subjects.
Abstract: Considering the recent developments in deep learning, it has become increasingly important to verify which methods are valid for the prediction of multivariate time-series data. In this study, we propose a novel method of time-series prediction that employs multiple deep learners combined with a Bayesian network, where the training data is divided into clusters using K-means clustering. We determined the best number of clusters for K-means using the Bayesian information criterion. A separate deep learner is trained for each cluster. We used three types of deep learners: deep neural network (DNN), recurrent neural network (RNN), and long short-term memory (LSTM). A naive Bayes classifier is used to determine which deep learner is in charge of predicting a particular time series. Our proposed method is applied to a set of financial time-series data, the Nikkei Average Stock price, to assess the accuracy of the predictions. Compared with the conventional method of employing a single deep learner trained on all the data, our proposed method improves the F-value and accuracy.
Funding: National Natural Science Foundation of China (42375145); the Open Grants of China Meteorological Administration Radar Meteorology Key Laboratory (2023LRM-A02).
Abstract: Weather radar echo extrapolation plays a crucial role in weather forecasting. However, traditional weather radar echo extrapolation methods are not very accurate and do not make full use of historical data. Deep learning algorithms based on Recurrent Neural Networks also suffer from accumulating errors. Moreover, it is difficult to obtain higher accuracy by relying on a single historical radar echo observation. Therefore, in this study, we constructed the Fusion GRU module, which leverages a cascade structure to effectively combine radar echo data and mean wind data. We also designed the Top Connection so that the model can capture the global spatial relationship and use it to constrain the predictions. We compared several models on the Jiangsu Province dataset. The results show that our proposed model, the Cascade Fusion Spatiotemporal Network (CFSN), improved the critical success index (CSI) by 10.7% over the baseline at the threshold of 30 dBZ. Ablation experiments further validated the effectiveness of our model. Similarly, the CSI of the complete CFSN was 0.004 higher than that of the suboptimal solution without the cross-attention module at the threshold of 30 dBZ.
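The critical success index is conventionally defined as hits / (hits + misses + false alarms) after thresholding both the observed and the predicted reflectivity. The sketch below illustrates that standard definition on flattened pixel lists; whether the study counts a pixel at exactly 30 dBZ as an event is not stated, so the `>=` comparison is an assumption, and the reflectivity values are made up for illustration.

```python
def critical_success_index(observed, predicted, threshold=30.0):
    """CSI = hits / (hits + misses + false alarms) at a reflectivity threshold in dBZ."""
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_event = obs >= threshold    # assumed convention: the threshold itself counts
        pred_event = pred >= threshold
        if obs_event and pred_event:
            hits += 1
        elif obs_event:
            misses += 1
        elif pred_event:
            false_alarms += 1
    denominator = hits + misses + false_alarms
    # CSI is undefined when no event is observed or predicted.
    return hits / denominator if denominator else float("nan")


# Illustrative reflectivity values in dBZ; these are not data from the study.
observed = [35.0, 10.0, 42.0, 28.0, 31.0]
predicted = [33.0, 32.0, 25.0, 20.0, 36.0]
csi = critical_success_index(observed, predicted)
```

Correct negatives (pixels below the threshold in both fields) do not enter the score, which is why CSI is preferred over plain accuracy for rare events such as strong echoes.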
Abstract: Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, Roman Arabic, and more, grappling with resource-poor languages such as Urdu remains a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, Saraiki, and more. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle owing to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves creating a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. After data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models' efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, achieving a significantly higher accuracy of 91%. This result highlights the strong performance of RNN and makes it a compelling option for sentiment analysis tasks in the Urdu language.
Funding: Supported by the KIT-Publication Fund of the Karlsruhe Institute of Technology.
Abstract: The battery thermal management of electric vehicles can be improved using neural networks that predict quantile sequences of the battery temperature. This work extends a method for the development of Quantile Convolutional and Quantile Recurrent Neural Networks (namely Q*NN). Fleet data from 225,629 drives are clustered and balanced, and simulation data from 971 simulations are augmented, before the two are combined for training and testing. The Q*NN hyperparameters are optimized using an efficient Bayesian optimization before the Q*NN models are compared with regression and quantile regression models for four horizons. The analysis of point-forecast and quantile-related metrics shows the superior performance of the novel Q*NN models. The median predictions of the best-performing model achieve an average RMSE of 0.66 °C and an R² of 0.84. The predicted 0.99 quantile covers 98.87% of the true values in the test data. In conclusion, this work proposes an extended development and comparison of Q*NN models for accurate battery temperature prediction.
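The abstract does not state which loss the Q*NN models are trained with. A common choice for quantile prediction is the pinball (quantile) loss, sketched below as an assumption together with a coverage check of the kind behind the "0.99 quantile covers 98.87%" figure. Coverage is read here as the fraction of true values at or below the predicted quantile, which is also an assumption; the temperatures are made up for illustration.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau (standard definition, assumed here)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def coverage(y_true, y_quantile):
    """Fraction of true values that lie at or below the predicted quantile."""
    covered = sum(1 for t, q in zip(y_true, y_quantile) if t <= q)
    return covered / len(y_true)


# Illustrative battery temperatures in degrees Celsius; not data from the study.
temps = [30.0, 31.0, 33.0, 35.0]
q99 = [31.0, 32.0, 32.5, 36.0]
loss = pinball_loss(temps, q99, tau=0.99)
cov = coverage(temps, q99)
```

For a well-calibrated 0.99 quantile the coverage on held-out data should be close to 0.99, which is how the reported 98.87% can be interpreted.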
Abstract: This article presents an exhaustive comparative investigation into the accuracy of gender identification across diverse geographical regions, employing a deep learning classification algorithm for speech signal analysis. In this study, speech samples are categorized for both training and testing purposes based on their geographical origin. Category 1 comprises speech samples from speakers outside of India, whereas Category 2 comprises live-recorded speech samples from Indian speakers. Testing speech samples are likewise classified into four distinct sets, taking into consideration both geographical origin and the language spoken by the speakers. The results indicate a noticeable difference in gender identification accuracy among speakers from different geographical areas. When the system is trained using speech samples from Indian speakers, Indian speakers, who use 52 Hindi and 26 English phonemes in their speech, show a notably higher gender identification accuracy of 85.75% than speakers who predominantly use 26 English phonemes in their conversations. The gender identification accuracy of the proposed model reaches 83.20% when the system is trained using speech samples from speakers outside of India. In the analysis of speech signals, Mel Frequency Cepstral Coefficients (MFCCs) serve as relevant features for the speech data. The deep learning classification algorithm used in this research is based on a Bidirectional Long Short-Term Memory (BiLSTM) architecture within a Recurrent Neural Network (RNN) model.