When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and thus lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, which is based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated using the Mahalanobis distance between samples. Based on this value, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It simultaneously considers the influence of both the imbalance in sample numbers and the class distribution on classification, and it finely tunes the class cost weights with the RIME algorithm, an efficient optimization algorithm based on the physical phenomenon of rime-ice, using cross-validation accuracy as the fitness function to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
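The replace-one-for-one resampling step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which two points are interpolated, so the sketch assumes each new positive sample lies on the segment between a pseudo-negative sample and its nearest positive sample, and it takes the indices of the pseudo-negative samples as given rather than computing Silhouette values. The function names are hypothetical.

```python
import math
import random

def nearest(point, candidates):
    """Return the candidate closest to `point` (Euclidean distance)."""
    return min(candidates, key=lambda c: math.dist(point, c))

def replace_pseudo_negatives(positives, negatives, pseudo_indices, seed=0):
    """Swap each pseudo-negative sample for one synthetic positive sample.

    Assumption (not from the abstract): the synthetic sample is a random
    point on the segment between the nearest positive sample and the
    pseudo-negative sample being replaced.
    """
    rng = random.Random(seed)
    pseudo = set(pseudo_indices)
    new_positives = []
    for i in sorted(pseudo):
        anchor = nearest(negatives[i], positives)
        gap = rng.random()  # interpolation fraction in [0, 1)
        new_positives.append(
            tuple(a + gap * (n - a) for a, n in zip(anchor, negatives[i]))
        )
    # Under-sampling step: drop the pseudo-negative samples themselves.
    kept_negatives = [n for i, n in enumerate(negatives) if i not in pseudo]
    return positives + new_positives, kept_negatives
```

Because one sample is added for every sample removed, the total number of samples is unchanged, which matches the abstract's remark that the overall scale of the dataset is preserved.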
With the exponential development of wireless networking and the inexpensive Internet of Things (IoT), a wide range of applications has been designed to deliver enhanced services. Because IoT devices have limited energy capacity, energy-aware clustering techniques are highly preferable. At the same time, artificial intelligence (AI) techniques can be applied to perform appropriate disease diagnosis. With this motivation, this study designs a novel squirrel search algorithm-based energy-aware clustering with medical data classification (SSAC-MDC) model for the IoT environment. The goal of the SSAC-MDC technique is to attain maximum energy efficiency together with disease diagnosis in the IoT environment. The proposed SSAC-MDC technique involves the design of a squirrel search algorithm-based clustering (SSAC) technique to choose the proper set of cluster heads (CHs) and construct clusters. In addition, the medical data classification process involves three subprocesses, namely pre-processing, autoencoder (AE) based classification, and improved beetle antenna search (IBAS) based parameter tuning. The design of the SSAC technique and the IBAS-based parameter optimization constitute the novelty of the work. To showcase the improved performance of the SSAC-MDC technique, a series of experiments was performed, and the comparative results highlighted the superiority of the SSAC-MDC technique over recent methods.
Data mining plays a crucial role in extracting meaningful knowledge from large-scale data repositories, such as data warehouses and databases. Association rule mining, a fundamental process in data mining, involves discovering correlations, patterns, and causal structures within datasets. In the healthcare domain, association rules offer valuable opportunities for building knowledge bases, enabling intelligent diagnoses, and extracting invaluable information rapidly. This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System (MLARMC-HDMS). The MLARMC-HDMS technique integrates classification and association rule mining (ARM) processes. Initially, the chimp optimization algorithm-based feature selection (COAFS) technique is employed within MLARMC-HDMS to select relevant attributes. Inspired by the foraging behavior of chimpanzees, the COA algorithm mimics their search strategy for food. Subsequently, the classification process utilizes a stochastic gradient descent with multilayer perceptron (SGD-MLP) model, while the Apriori algorithm determines attribute relationships. We propose a COA-based feature selection approach for medical data classification using machine learning techniques. This approach involves selecting pertinent features from medical datasets through COA and training machine learning models on the reduced feature set. We evaluate the performance of our approach on various medical datasets with diverse machine learning classifiers. Experimental results demonstrate that the proposed approach surpasses alternative feature selection methods, achieving higher accuracy and precision in medical data classification tasks. The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features, thereby enhancing the diagnosis and treatment of various diseases. To provide further validation, we conduct detailed experiments on a benchmark medical dataset, revealing the superiority of the MLARMC-HDMS model over other methods, with a maximum accuracy of 99.75%. This research therefore contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis. The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in healthcare data mining and machine learning.
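To make the Apriori step mentioned in this abstract more concrete, the following is a minimal, standard-library sketch of level-wise frequent-itemset mining with support and confidence. It is a textbook simplification, not the MLARMC-HDMS implementation: the subset-pruning optimization is omitted, and the function names and the toy symptom records are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: grow candidates only from frequent itemsets."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    while current:
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets that differ in exactly one item.
        keys = list(level)
        current = list(
            {a | b for a, b in combinations(keys, 2) if len(a | b) == len(a) + 1}
        )
    return frequent

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    return support(antecedent | consequent, transactions) / support(
        antecedent, transactions
    )

# Hypothetical toy records, one set of observed attributes per patient.
records = [
    {"fever", "cough"},
    {"fever", "cough", "fatigue"},
    {"fever"},
    {"cough"},
]
rules_base = frequent_itemsets(records, min_support=0.5)
```

With these toy records, the itemset {fever, cough} appears in half of the records, and the rule fever -> cough holds in two of the three records that contain fever.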
Recently, a massive quantity of data is being produced from a large number of sources, and the volume of data created daily on the Internet has exceeded two exabytes. At the same time, clustering is one of the efficient techniques for mining big data to extract the useful and hidden patterns that exist in it. Density-based clustering techniques have gained significant attention because they help to effectively recognize complex patterns in spatial datasets. Big data clustering is a non-trivial process owing to the increasing quantity of data, a problem that can be addressed with the MapReduce tool. With this motivation, this paper presents an efficient MapReduce-based hybrid density-based clustering and classification algorithm for big data analytics (MR-HDBCC). The proposed MR-HDBCC technique is executed on the MapReduce tool to handle big data. In addition, the MR-HDBCC technique involves three distinct processes, namely pre-processing, clustering, and classification. The proposed model utilizes the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique, which is capable of detecting arbitrarily shaped, diverse clusters in noisy data. To improve the performance of the DBSCAN technique, a hybrid model using the cockroach swarm optimization (CSO) algorithm is developed to explore the search space and determine the optimal parameters for density-based clustering. Finally, a bidirectional gated recurrent neural network (BGRNN) is employed for the classification of big data. The experimental validation of the proposed MR-HDBCC technique is carried out on a benchmark dataset, and the simulation outcomes demonstrate the promising performance of the proposed model in terms of different measures.
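For readers unfamiliar with DBSCAN, the clustering technique named in this abstract, a compact single-machine sketch is given below. It is the plain textbook algorithm with a brute-force neighbor search, not the MapReduce-based or CSO-tuned variant the paper proposes; the parameter names `eps` and `min_pts` are the conventional ones and are exactly the kind of parameters such an optimizer would search over.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # Brute-force range query; includes the point itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is itself a core point
                queue.extend(reachable)
    return labels
```

For example, two tight groups of three points plus one far-away point, clustered with `eps=1.5` and `min_pts=3`, yield two clusters and one noise label.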
The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function in rough sets can be used in data classification, so we put forward a method for data classification. First, we use the discernibility matrix and discernibility function to delete superfluous attributes in the information system and obtain a necessary attribute set. Second, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
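The first step above, building a discernibility matrix and reading off indispensable attributes, can be sketched as follows. This is a minimal illustration under assumptions: the decision table, attribute names, and function names are hypothetical, and only the core (attributes that appear alone in some matrix entry) is computed rather than a full reduct via the discernibility function.

```python
from itertools import combinations

def discernibility_matrix(rows, condition_attrs, decision_attr):
    """For each pair of objects with different decisions, record the
    condition attributes on which the two objects differ."""
    matrix = {}
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if a[decision_attr] != b[decision_attr]:
            matrix[(i, j)] = frozenset(c for c in condition_attrs if a[c] != b[c])
    return matrix

def core_attributes(matrix):
    """Attributes that form a singleton entry cannot be removed."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}

# Hypothetical decision table: two condition attributes, one decision.
table = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no", "flu": "yes"},
    {"fever": "no", "cough": "yes", "flu": "no"},
    {"fever": "no", "cough": "no", "flu": "no"},
]
matrix = discernibility_matrix(table, ["fever", "cough"], "flu")
core = core_attributes(matrix)
```

In this toy table the decision depends only on `fever`, so `cough` is superfluous and the single rule "fever = yes implies flu = yes" already classifies every row.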
Medical data classification (MDC) refers to the application of classification methods to medical datasets. This work focuses on applying a classification task to medical datasets related to specific diseases in order to predict the associated diagnosis or prognosis. To gain experts' trust, the prediction and the reasoning behind it are equally important. Accordingly, we confine our research to learning rule-based models because they are transparent and comprehensible. One approach to MDC involves the use of metaheuristic (MH) algorithms. Here we report on the development and testing of a novel MH algorithm: IWD-Miner. This algorithm can be viewed as a fusion of Intelligent Water Drops (IWDs) and AntMiner+. It was subjected to a four-stage sensitivity analysis to optimize its performance. For this purpose, 21 publicly available medical datasets from the Machine Learning Repository at the University of California Irvine were used. Interestingly, there were only limited differences in performance between IWD-Miner variants, which is suggestive of its robustness. Finally, using the same 21 datasets, we compared the performance of the optimized IWD-Miner against two extant algorithms, AntMiner+ and J48. The experiments showed that both rival algorithms are comparable in effectiveness to IWD-Miner, as confirmed by the Wilcoxon nonparametric statistical test. Results suggest that IWD-Miner is more efficient than AntMiner+ as measured by the average number of fitness evaluations to a solution (1,386,621.30 vs. 2,827,283.88 fitness evaluations, respectively). J48 exhibited higher accuracy on average than IWD-Miner (79.58 vs. 73.65, respectively) but produced larger models (32.82 leaves vs. 8.38 terms, respectively).
The biomedical data classification process has received significant attention in recent times due to a massive increase in the generation of healthcare data from various sources. Developments in artificial intelligence (AI) and machine learning (ML) models assist in the effective design of medical data classification models. Therefore, this article concentrates on the development of an optimal Stacked Long Short Term Memory Sequence-to-Sequence Autoencoder (OSAE-LSTM) model for biomedical data classification. The presented OSAE-LSTM model intends to classify biomedical data for the existence of diseases. First, the OSAE-LSTM model applies min-max normalization based pre-processing to scale the data into a uniform format. Next, the SAE-LSTM model is utilized for the detection and classification of diseases in biomedical data. Finally, the manta ray foraging optimization (MRFO) algorithm is employed for hyperparameter optimization. The MRFO algorithm assists in the optimal selection of the hyperparameters involved in the SAE-LSTM model. The OSAE-LSTM model was tested in simulation on a set of benchmark medical datasets, and the results showed improvements of the OSAE-LSTM model over the other approaches under several dimensions.
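Min-max normalization, the pre-processing step named in this abstract (and in several others in this listing), rescales each feature to the range [0, 1]. A minimal sketch follows; the function name is hypothetical, and mapping a constant column to zeros is one common convention rather than something the abstract specifies.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]
```

For instance, the readings 10, 20, and 30 map to 0.0, 0.5, and 1.0. In practice the minimum and maximum would be computed on the training split only and then reused for test data.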
Medical data classification has become a hot research topic in the healthcare sector as an aid to physicians' decision making. In addition, advances in machine learning (ML) techniques help to perform the classification task effectively. With this motivation, this paper presents a Fuzzy Clustering Approach Based on Breadth-first Search Algorithm (FCA-BFS) with an optimal support vector machine (OSVM) model, named FCABFS-OSVM, for medical data classification. The proposed FCABFS-OSVM technique intends to classify healthcare data by the use of clustering and classification models. The proposed FCABFS-OSVM technique involves the design of the FCABFS technique to cluster the medical data, which helps to boost the classification performance. Moreover, the OSVM model examines the clustered medical data to perform the classification process. Furthermore, the Archimedes optimization algorithm (AOA) is utilized to tune the SVM parameters and boost the medical data classification results. A wide range of simulations is carried out to highlight the promising performance of the FCABFS-OSVM technique. Extensive comparison studies reported the enhanced outcomes of the FCABFS-OSVM technique over recent state-of-the-art approaches.
The significance of the preprocessing stage in any data mining task is well known. Before attempting medical data classification, characteristics of medical datasets, including noise, incompleteness, and the existence of multiple and possibly irrelevant features, need to be addressed. In this paper, we show that selecting the right combination of preprocessing methods has a considerable impact on the classification potential of a dataset. The preprocessing operations considered include the discretization of numeric attributes, the selection of attribute subset(s), and the handling of missing values. The classification is performed by an ant colony optimization algorithm as a case study. Experimental results on 25 real-world medical datasets show that a significant relative improvement in predictive accuracy, exceeding 60% in some cases, is obtained.
Purpose: The purpose of this paper is to provide a fault diagnosis method for rolling bearings. Rolling bearings are widely used in industrial appliances, and their fault diagnosis is of great importance and has drawn increasing attention. Based on the common failure mechanisms of the failure modes of rolling bearings, this paper proposes a novel compound data classification method based on the discrete wavelet transform and the support vector machine (SVM) and applies it to the fault diagnosis of rolling bearings. Design/methodology/approach: The vibration signal contains a large quantity of information about bearing status, and this paper uses various types of wavelet base functions to perform a discrete wavelet transform of the vibration signal and denoise it. Feature vectors are constructed from several time-domain indices of the denoised signal. An SVM is then used to perform classification and fault diagnosis. The optimal wavelet base function is then determined based on the diagnosis accuracy. Findings: Fault diagnosis experiments on rolling bearings were carried out, and wavelet functions from several wavelet families were tested. The results show that the SVM classifier with the db4 wavelet base function in the db wavelet family has the best fault diagnosis accuracy. Originality/value: This method provides a practical candidate for the fault diagnosis of rolling bearings in industrial applications.
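The denoise-then-extract-features pipeline described in this abstract can be illustrated with a small standard-library sketch. Note the substitutions: it uses a one-level Haar wavelet with hard thresholding instead of the db4 wavelet the paper found best, the abstract does not list which time-domain indices were used, so RMS, peak, crest factor, and kurtosis are assumed here as common choices, and the SVM stage is omitted. All function names are hypothetical.

```python
import math

_S = 1 / math.sqrt(2)

def haar_dwt(signal):
    """One-level Haar transform of an even-length signal."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) * _S for a, b in pairs]
    detail = [(a - b) * _S for a, b in pairs]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of `haar_dwt`."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out

def denoise(signal, threshold):
    """Hard-threshold the detail coefficients, then reconstruct."""
    approx, detail = haar_dwt(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_idwt(approx, detail)

def time_domain_features(signal):
    """A few time-domain indices often used for vibration signals."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    var = sum((v - mean) ** 2 for v in signal) / n
    kurtosis = sum((v - mean) ** 4 for v in signal) / (n * var ** 2) if var else 0.0
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms else 0.0,
        "kurtosis": kurtosis,
    }
```

The feature dictionary produced for each denoised signal would then form one row of the training matrix passed to a classifier such as an SVM.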
Phishing is a type of cybercrime in which cyber-attackers pose as authorized persons or entities and steal the victims' sensitive data. E-mails, instant messages, and phone calls are some of the common modes used in cyberattacks. Though security models are continuously upgraded to prevent cyberattacks, hackers find innovative ways to target victims. Against this background, a drastic increase has been observed in the number of phishing emails sent to potential targets. This scenario underlines the importance of designing an effective classification model. Although numerous conventional models are available in the literature for classifying phishing emails, Machine Learning (ML) techniques and Deep Learning (DL) models have also been employed. The current study presents an Intelligent Cuckoo Search (CS) Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification (ICSOA-DLPEC) model. The aim of the proposed ICSOA-DLPEC model is to effectively distinguish emails as either legitimate or phishing. At the initial stage, pre-processing is performed in three stages: email cleaning, tokenization, and stop-word elimination. Then, the N-gram approach is applied to extract useful feature vectors. Moreover, the CS algorithm is employed with the Gated Recurrent Unit (GRU) model to detect and classify phishing emails, where the CS algorithm is used to fine-tune the parameters involved in the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several dimensions. Extensive comparative studies were conducted, and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches. The proposed model achieved a maximum accuracy of 99.72%.
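The three pre-processing stages and the N-gram step named in this abstract can be sketched in a few lines. This is an illustration under assumptions: the cleaning rules (stripping HTML tags, replacing links with a placeholder token), the tiny stop-word list, and the function names are all hypothetical choices, since the abstract does not specify them.

```python
import re

# Tiny illustrative stop-word list; real systems use a much longer one.
STOP_WORDS = {"the", "a", "an", "to", "your", "is", "and", "of"}

def preprocess(email_text):
    """Email cleaning, tokenization, and stop-word elimination."""
    text = re.sub(r"<[^>]+>", " ", email_text)      # cleaning: drop HTML tags
    text = re.sub(r"https?://\S+", " URL ", text)   # cleaning: mask links
    tokens = re.findall(r"[a-z]+", text.lower())    # tokenization
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """All contiguous runs of `n` tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = preprocess("<b>Verify</b> your account at http://x.example/login")
bigrams = ngrams(tokens, 2)
```

Counting how often each N-gram occurs per email gives a feature vector that a downstream classifier, such as the GRU described above, could consume.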
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimization of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Exhaustive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased; and (3) statistical analysis reveals that OPCE yields better HDIC performance on the selected data sets in comparison with eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm for dealing with HDIC problems.
Recently, medical data classification has become a hot research topic among healthcare professionals and research communities, as it assists in disease diagnosis and decision making. The latest developments in artificial intelligence (AI) approaches pave the way for the design of effective medical data classification models. At the same time, the existence of numerous features in medical datasets poses a curse-of-dimensionality problem. To resolve these issues, this article introduces a novel feature subset selection with artificial intelligence based classification model for biomedical data (FSS-AICBD) technique. The FSS-AICBD technique intends to derive a useful set of features and thereby improve the classifier results. First, the FSS-AICBD technique applies min-max normalization to prevent data complexity. In addition, the information gain (IG) approach is applied for the optimal selection of feature subsets. Furthermore, a group search optimizer (GSO) with deep belief network (DBN) model is utilized for biomedical data classification, where the hyperparameters of the DBN model are optimally tuned by the GSO algorithm. The choice of the IG and GSO approaches leads to promising medical data classification results. The experimental analysis of the FSS-AICBD technique is carried out on different benchmark healthcare datasets. The simulation results reported the enhanced outcomes of the FSS-AICBD technique in terms of several measures.
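The information gain (IG) criterion used for feature selection in this abstract measures how much knowing a feature reduces the entropy of the class label. The sketch below is the standard definition for discrete features; the function names are hypothetical, and how the paper discretizes continuous features or thresholds the scores is not stated in the abstract.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """H(labels) minus the expected entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

A feature that separates the classes perfectly has a gain equal to the label entropy, while a feature independent of the label has a gain of zero; ranking features by this score and keeping the top ones yields the reduced subset.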
In bioinformatics applications, the examination of microarray data for diagnosing diseases has received significant interest. Microarray gene expression data are characterized by a massive search space, which poses a primary challenge for the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs data normalization to bring the data to a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The use of the IFFO and SOS algorithms paves the way for achieving maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is carried out on benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Due to the open nature of wireless data transmission, routing and data security pose an important research challenge in Internet of Things (IoT)-enabled networks. In addition, characteristic features such as constrained resources, heterogeneity, uncontrolled environments, and scalability requirements make the security issues even more challenging. Hence, an effective and secure routing protocol named modified Energy Harvesting Trust-aware Routing Algorithm (mod-EHTARA) is proposed to increase the energy efficiency and the lifespan of the nodes. The proposed mod-EHTARA is designed by incorporating the Link Lifetime (LLT) model into the traditional EHTARA. The optimal secure routing path is selected by the proposed mod-EHTARA using a cost metric that considers factors such as delay, LLT, energy, and trust. The big data classification process is carried out at the Base Station (BS) using the MapReduce framework. The big data classification is performed using a stacked autoencoder, which is trained by the Adaptive E-Bat algorithm. The Adaptive E-Bat algorithm is developed by integrating the adaptive concept with the Bat Algorithm (BA) and the Exponential Weighted Moving Average (EWMA). The proposed mod-EHTARA showed better performance by obtaining a maximal energy of 0.9855.
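The Exponential Weighted Moving Average (EWMA) mentioned in this abstract is a simple recursive smoother. The sketch below shows only the standard EWMA recurrence; how the paper combines it with the Bat Algorithm is not described in the abstract, and the function name, the smoothing factor `alpha`, and the choice to seed the average with the first observation are assumptions.

```python
def ewma(values, alpha):
    """Return the running average s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    Assumption: the series is seeded with the first observation (s_0 = x_0).
    """
    smoothed = []
    for v in values:
        prev = smoothed[-1] if smoothed else v
        smoothed.append(alpha * v + (1 - alpha) * prev)
    return smoothed
```

A larger `alpha` makes the average follow recent observations more closely, while a smaller `alpha` gives more weight to history.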
Autism spectrum disorder (ASD) is regarded as a neurological disorder defined by a specific set of problems associated with social skills, repetitive conduct, and communication. Identifying ASD as soon as possible is favourable because early identification permits prompt interventions in children with ASD. Recognition of ASD through objective pathogenic mutation screening is the initial step toward early intervention and efficient treatment of affected children. Nowadays, the healthcare and machine learning (ML) industries are combined to determine the existence of various diseases. This article devises a Jellyfish Search Optimization with Deep Learning Driven ASD Detection and Classification (JSODL-ASDDC) model. The goal of the JSODL-ASDDC algorithm is to identify the different stages of ASD with the help of biomedical data. The proposed JSODL-ASDDC model initially applies min-max data normalization to scale the data into a uniform range. In addition, the JSODL-ASDDC model involves a JSO based feature selection (JFSO-FS) process to choose optimal feature subsets. Moreover, a Gated Recurrent Unit (GRU) based classification model is utilized for the recognition and classification of ASD. Furthermore, a Bacterial Foraging Optimization (BFO) assisted parameter tuning process is executed to enhance the efficacy of the GRU system. The JSODL-ASDDC model is experimentally assessed on distinct datasets. The experimental outcomes highlighted the enhanced performance of the JSODL-ASDDC algorithm over recent approaches.
With new developments in Internet of Things (IoT), wearable, and sensing technology, the value of healthcare services has been enhanced. This evolution has brought a significant shift from conventional medicine-based healthcare to real-time observation-based healthcare. Biomedical Electrocardiogram (ECG) signals are generally utilized in the examination and diagnosis of Cardiovascular Diseases (CVDs) since they are quick to obtain and non-invasive in nature. Due to the increasing number of patients in recent years, classifier efficiency is reduced by the high variance observed in ECG signal patterns obtained from patients. In such a scenario, computer-assisted automated diagnostic tools are important for the classification of ECG signals. The current study devises an Improved Bat Algorithm with Deep Learning Based Biomedical ECG Signal Classification (IBADL-BECGC) approach. To accomplish this, the proposed IBADL-BECGC model initially pre-processes the input signals. In addition, the IBADL-BECGC model applies the NasNet model to derive features from the test ECG signals. The Improved Bat Algorithm (IBA) is then employed to optimally fine-tune the hyperparameters of the NasNet approach. Finally, the Extreme Learning Machine (ELM) classification algorithm is executed to perform ECG classification. The presented IBADL-BECGC model was experimentally validated using a benchmark dataset. The comparison study established the improved performance of the IBADL-BECGC model over other existing methodologies, with a maximum accuracy of 97.49%.
Fake news can affect diverse aspects of diverse entities, ranging from a city's lifestyle to a country's global relations, and various methods are available to collect and identify fake news. Recently developed machine learning (ML) models can be employed for the detection and classification of fake news. This study designs a novel Chaotic Ant Swarm with Weighted Extreme Learning Machine (CAS-WELM) for Cybersecurity Fake News Detection and Classification. The goal of the CAS-WELM technique is to discriminate news into fake and real. The CAS-WELM technique initially pre-processes the input data, and the GloVe technique is used for the word embedding process. Then, an N-gram based feature extraction technique is used to generate feature vectors. Lastly, the WELM model is applied for the detection and classification of fake news, in which the weight value of the WELM model is optimally adjusted by the CAS algorithm. The performance validation of the CAS-WELM technique is carried out using a benchmark dataset, and the results are inspected under several dimensions. The experimental results reported the enhanced outcomes of the CAS-WELM technique over recent approaches.
Recently, developments in Internet and cloud technologies have resulted in a considerable rise in the use of online media in day-to-day life. This has led to illegal access to users' private data and its compromise. Phishing is a popular attack that tricks users into accessing malicious content in order to obtain their data. Proper identification of phishing emails is an essential process in the domain of cybersecurity. This article focuses on the design of a bio-geography based optimization with deep learning for Phishing Email detection and classification (BBODL-PEDC) model. The major intention of the BBODL-PEDC model is to distinguish between legitimate and phishing emails. The BBODL-PEDC model initially performs data pre-processing in three levels, namely email cleaning, tokenization, and stop-word elimination. In addition, the TF-IDF model is applied for the extraction of useful feature vectors. Moreover, an optimal deep belief network (DBN) model is used for email classification, and its efficacy is boosted by a BBO-based hyperparameter tuning process. The performance of the BBODL-PEDC model is validated using a benchmark dataset, and the results are assessed under several dimensions. Extensive comparative studies reported the superior outcomes of the BBODL-PEDC model over recent approaches.
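The TF-IDF feature extraction step named in this abstract weights a term by how often it occurs in one email and how rare it is across all emails. The sketch below uses the plain, unsmoothed textbook formula (term frequency times the natural log of N over document frequency); library implementations often add smoothing and normalization, and the abstract does not say which variant the paper uses. The function name and the toy token lists are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Map each tokenized document to a {term: weight} dictionary."""
    n = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return vectors

# Hypothetical, already pre-processed emails.
emails = [["verify", "account", "now"], ["meeting", "notes", "now"]]
weights = tf_idf(emails)
```

In this toy example the token "now" appears in every email and therefore receives zero weight, while tokens unique to one email receive positive weight.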
Sentiment Analysis(SA)is one of the subfields in Natural Language Processing(NLP)which focuses on identification and extraction of opinions that exist in the text provided across reviews,social media,blogs,news,and so...Sentiment Analysis(SA)is one of the subfields in Natural Language Processing(NLP)which focuses on identification and extraction of opinions that exist in the text provided across reviews,social media,blogs,news,and so on.SA has the ability to handle the drastically-increasing unstructured text by transform-ing them into structured data with the help of NLP and open source tools.The current research work designs a novel Modified Red Deer Algorithm(MRDA)Extreme Learning Machine Sparse Autoencoder(ELMSAE)model for SA and classification.The proposed MRDA-ELMSAE technique initially performs pre-processing to transform the data into a compatible format.Moreover,TF-IDF vec-torizer is employed in the extraction of features while ELMSAE model is applied in the classification of sentiments.Furthermore,optimal parameter tuning is done for ELMSAE model using MRDA technique.A wide range of simulation analyses was carried out and results from comparative analysis establish the enhanced effi-ciency of MRDA-ELMSAE technique against other recent techniques.展开更多
Funding: Supported by the Yunnan Major Scientific and Technological Projects (Grant No. 202302AD080001) and the National Natural Science Foundation of China (No. 52065033).
Abstract: When building a classification model, the scenario in which one class has significantly more samples than the other is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm accuracy on the minority class (usually defined as the positive class) and lead to poor overall model performance. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, which is based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated from the Mahalanobis distance between samples. Based on this value, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification simultaneously, and it finely tunes the class cost weights using the RIME optimization algorithm, which is based on the physical phenomenon of rime ice, with cross-validation accuracy as the fitness function, in order to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
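The one-for-one replacement step described above can be illustrated with a minimal sketch. It assumes the pseudo-negative indices have already been identified (the Silhouette/Mahalanobis screening is not reproduced here), and it pairs each pseudo-negative with its nearest positive sample by Euclidean distance, which is an assumption of this sketch rather than a detail stated in the abstract. The function names are hypothetical.

```python
import random

def interpolate(pseudo_negative, positive, rng):
    """Return a random point on the segment between the two samples."""
    lam = rng.random()  # interpolation coefficient in [0, 1)
    return [p + lam * (q - p) for p, q in zip(pseudo_negative, positive)]

def replace_pseudo_negatives(samples, labels, pseudo_idx, seed=0):
    """Swap each pseudo-negative for one synthetic positive (label 1).

    The dataset size is unchanged: one sample is removed and one is added
    at the same position. Pairing with the nearest positive is an assumption.
    """
    rng = random.Random(seed)
    positives = [s for s, y in zip(samples, labels) if y == 1]
    new_samples, new_labels = list(samples), list(labels)
    for i in pseudo_idx:
        nearest = min(
            positives,
            key=lambda q: sum((a - b) ** 2 for a, b in zip(samples[i], q)),
        )
        new_samples[i] = interpolate(samples[i], nearest, rng)
        new_labels[i] = 1
    return new_samples, new_labels

samples = [[0.0, 0.0], [1.0, 1.0], [0.9, 0.9], [5.0, 5.0]]
labels = [0, 1, 0, 0]
new_samples, new_labels = replace_pseudo_negatives(samples, labels, [2])
print(len(new_samples), new_labels)
```

Because each pseudo-negative is overwritten in place, the total number of samples stays constant while the class ratio shifts toward the positive class.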
Abstract: With the rapid development of wireless networking and inexpensive Internet of Things (IoT) devices, a wide range of applications has been designed to deliver enhanced services. Because IoT devices have limited energy capacity, energy-aware clustering techniques are highly preferable. At the same time, artificial intelligence (AI) techniques can be applied to perform appropriate disease diagnosis. With this motivation, this study designs a novel squirrel search algorithm-based energy-aware clustering with medical data classification (SSAC-MDC) model for an IoT environment. The goal of the SSAC-MDC technique is to attain maximum energy efficiency and disease diagnosis performance in the IoT environment. The proposed SSAC-MDC technique involves the design of a squirrel search algorithm-based clustering (SSAC) technique to choose the proper set of cluster heads (CHs) and construct clusters. In addition, the medical data classification process involves three subprocesses: pre-processing, autoencoder (AE) based classification, and improved beetle antenna search (IBAS) based parameter tuning. The design of the SSAC technique and the IBAS-based parameter optimization process constitute the novelty of the work. To showcase the improved performance of the SSAC-MDC technique, a series of experiments was performed, and the comparative results highlighted the superiority of the SSAC-MDC technique over recent methods.
Funding: The Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, funded this research work through Project Number RI-44-0444.
Abstract: Data mining plays a crucial role in extracting meaningful knowledge from large-scale data repositories, such as data warehouses and databases. Association rule mining, a fundamental process in data mining, involves discovering correlations, patterns, and causal structures within datasets. In the healthcare domain, association rules offer valuable opportunities for building knowledge bases, enabling intelligent diagnoses, and extracting invaluable information rapidly. This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System (MLARMC-HDMS). The MLARMC-HDMS technique integrates classification and association rule mining (ARM) processes. Initially, the chimp optimization algorithm-based feature selection (COAFS) technique is employed within MLARMC-HDMS to select relevant attributes. Inspired by the foraging behavior of chimpanzees, the COA algorithm mimics their search strategy for food. Subsequently, the classification process utilizes a stochastic gradient descent with multilayer perceptron (SGD-MLP) model, while the Apriori algorithm determines attribute relationships. We propose a COA-based feature selection approach for medical data classification using machine learning techniques. This approach involves selecting pertinent features from medical datasets through COA and training machine learning models on the reduced feature set. We evaluate the performance of our approach on various medical datasets using diverse machine learning classifiers. Experimental results demonstrate that the proposed approach surpasses alternative feature selection methods, achieving higher accuracy and precision in medical data classification tasks. The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features, thereby supporting the diagnosis and treatment of various diseases. To provide further validation, we conduct detailed experiments on a benchmark medical dataset, revealing the superiority of the MLARMC-HDMS model over other methods, with a maximum accuracy of 99.75%. This research therefore contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis. The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in healthcare data mining and machine learning.
Funding: Supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant Number: HI21C1831), and by the Soonchunhyang University Research Fund.
Abstract: Recently, a massive quantity of data has been produced from a large number of sources, and the volume of data created daily on the Internet has exceeded two exabytes. At the same time, clustering is one of the efficient techniques for mining big data to extract the useful and hidden patterns that exist in it. Density-based clustering techniques have gained significant attention because they help to effectively recognize complex patterns in spatial datasets. Big data clustering is a non-trivial process owing to the increasing quantity of data, a difficulty that can be addressed with the MapReduce tool. With this motivation, this paper presents an efficient MapReduce based hybrid density based clustering and classification algorithm for big data analytics (MR-HDBCC). The proposed MR-HDBCC technique is executed on the MapReduce tool to handle big data. In addition, the MR-HDBCC technique involves three distinct processes: pre-processing, clustering, and classification. The proposed model utilizes the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique, which is capable of detecting arbitrarily shaped and diverse clusters in noisy data. To improve the performance of the DBSCAN technique, a hybrid model using the cockroach swarm optimization (CSO) algorithm is developed to explore the search space and determine the optimal parameters for density-based clustering. Finally, a bidirectional gated recurrent neural network (BGRNN) is employed for the classification of big data. The proposed MR-HDBCC technique is experimentally validated on a benchmark dataset, and the simulation outcomes demonstrate the promising performance of the proposed model in terms of different measures.
Funding: Supported by the National Natural Science Foundation of China (60474022).
Abstract: The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function in rough sets can be used in data classification, so we put forward a method for data classification. First, we use the discernibility matrix and discernibility function to delete superfluous attributes in an information system and obtain a necessary attribute set. Second, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
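The attribute-reduction step can be sketched on a toy decision table. The code below builds the discernibility matrix entries (the attributes that distinguish each pair of objects with different decisions) and then finds the smallest attribute subsets that intersect every entry by brute force. The table, the attribute names, and the brute-force search are assumptions for illustration; the abstract does not specify how the discernibility function is simplified.

```python
from itertools import combinations

def discernibility_entries(rows, decisions):
    """Collect, for each pair of objects with different decisions,
    the set of condition attributes on which the two objects differ."""
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j]:
            diff = frozenset(a for a in rows[i] if rows[i][a] != rows[j][a])
            if diff:
                entries.append(diff)
    return entries

def minimal_reducts(attributes, entries):
    """Smallest attribute subsets that hit every entry (brute-force search)."""
    for size in range(1, len(attributes) + 1):
        found = [
            set(c)
            for c in combinations(attributes, size)
            if all(e & set(c) for e in entries)
        ]
        if found:
            return found
    return []

# Hypothetical decision table with condition attributes a, b, c.
rows = [
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
]
decisions = ["yes", "no", "no", "yes"]

entries = discernibility_entries(rows, decisions)
print(minimal_reducts(["a", "b", "c"], entries))
```

In this toy table, attribute b is needed in every reduct, and either a or c can be dropped as superfluous.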
Funding: Supported by a grant from the “Research Center of the Female Scientific and Medical Colleges”, Deanship of Scientific Research, King Saud University.
Abstract: Medical data classification (MDC) refers to the application of classification methods to medical datasets. This work focuses on applying a classification task to medical datasets related to specific diseases in order to predict the associated diagnosis or prognosis. To gain experts' trust, the prediction and the reasoning behind it are equally important. Accordingly, we confine our research to learning rule-based models because they are transparent and comprehensible. One approach to MDC involves the use of metaheuristic (MH) algorithms. Here we report on the development and testing of a novel MH algorithm: IWD-Miner. This algorithm can be viewed as a fusion of Intelligent Water Drops (IWDs) and AntMiner+. It was subjected to a four-stage sensitivity analysis to optimize its performance. For this purpose, 21 publicly available medical datasets from the Machine Learning Repository at the University of California Irvine were used. Interestingly, there were only limited differences in performance between IWD-Miner variants, which is suggestive of its robustness. Finally, using the same 21 datasets, we compared the performance of the optimized IWD-Miner against two extant algorithms, AntMiner+ and J48. The experiments showed that both rival algorithms are comparable in effectiveness to IWD-Miner, as confirmed by the Wilcoxon nonparametric statistical test. Results suggest that IWD-Miner is more efficient than AntMiner+ as measured by the average number of fitness evaluations to a solution (1,386,621.30 vs. 2,827,283.88 fitness evaluations, respectively). J48 exhibited higher accuracy on average than IWD-Miner (79.58 vs. 73.65, respectively) but produced larger models (32.82 leaves vs. 8.38 terms, respectively).
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/158/43), and to the Princess Nourah bint Abdulrahman University Researchers Supporting Project, number (PNURSP2022R235), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4340237DSR06).
Abstract: Biomedical data classification has received significant attention in recent times due to a massive increase in the generation of healthcare data from various sources. Developments in artificial intelligence (AI) and machine learning (ML) models assist in the effective design of medical data classification models. This article therefore concentrates on the development of an optimal Stacked Long Short Term Memory Sequence-to-Sequence Autoencoder (OSAE-LSTM) model for biomedical data classification. The presented OSAE-LSTM model classifies biomedical data for the existence of diseases. First, the OSAE-LSTM model applies min-max normalization based pre-processing to scale the data into a uniform format. Next, the SAE-LSTM model is utilized for the detection and classification of diseases in biomedical data. Finally, the manta ray foraging optimization (MRFO) algorithm is employed for hyperparameter optimization. The MRFO algorithm assists in the optimal selection of the hyperparameters involved in the SAE-LSTM model. The OSAE-LSTM model was tested in simulation using a set of benchmark medical datasets, and the results showed improvements of the OSAE-LSTM model over other approaches under several dimensions.
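The min-max normalization step mentioned above rescales each feature to the range [0, 1] via x' = (x - min) / (max - min). A minimal sketch follows; mapping a constant column to zeros is a convention chosen here, not something stated in the abstract.

```python
def min_max_scale(column):
    """Rescale a list of numbers to [0, 1] using min-max normalization."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant feature: no spread to rescale, so map everything to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def scale_dataset(rows):
    """Apply min-max scaling independently to each feature (column)."""
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

print(scale_dataset([[2, 10], [4, 30], [6, 20]]))
```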
Funding: This project was supported financially by Institution Fund projects under Grant No. (IFPIP-249-145-1442).
Abstract: Medical data classification has become a hot research topic in the healthcare sector as an aid to physicians in decision making. In addition, advances in machine learning (ML) techniques help to perform effective classification. With this motivation, this paper presents a Fuzzy Clustering Approach Based on Breadth-first Search Algorithm (FCA-BFS) with an optimal support vector machine (OSVM) model, named FCABFS-OSVM, for medical data classification. The proposed FCABFS-OSVM technique classifies healthcare data using clustering and classification models. The proposed FCABFS-OSVM technique involves the design of the FCABFS technique to cluster the medical data, which helps to boost classification performance. Moreover, the OSVM model examines the clustered medical data to perform classification. Furthermore, the Archimedes optimization algorithm (AOA) is utilized to tune the SVM parameters and boost the medical data classification results. A wide range of simulations was carried out to highlight the promising performance of the FCABFS-OSVM technique. Extensive comparison studies reported the enhanced outcomes of the FCABFS-OSVM technique over recent state-of-the-art approaches.
Abstract: The significance of the preprocessing stage in any data mining task is well known. Before attempting medical data classification, characteristics of medical datasets, including noise, incompleteness, and the existence of multiple and possibly irrelevant features, need to be addressed. In this paper, we show that selecting the right combination of preprocessing methods has a considerable impact on the classification potential of a dataset. The preprocessing operations considered include the discretization of numeric attributes, the selection of attribute subset(s), and the handling of missing values. The classification is performed by an ant colony optimization algorithm as a case study. Experimental results on 25 real-world medical datasets show that a significant relative improvement in predictive accuracy, exceeding 60% in some cases, is obtained.
Abstract: Purpose - The purpose of this paper is to provide a fault diagnosis method for rolling bearings. Rolling bearings are widely used in industrial appliances, and their fault diagnosis is of great importance and has drawn increasing attention. Based on the common failure mechanisms of rolling bearings, this paper proposes a novel compound data classification method based on the discrete wavelet transform and the support vector machine (SVM) and applies it to the fault diagnosis of rolling bearings. Design/methodology/approach - The vibration signal contains a large quantity of information about bearing status, and this paper uses various types of wavelet base functions to perform the discrete wavelet transform of the vibration signal and denoise it. Feature vectors are constructed from several time-domain indices of the denoised signal. An SVM is then used to perform classification and fault diagnosis, and the optimal wavelet base function is determined based on the diagnosis accuracy. Findings - Fault diagnosis experiments on rolling bearings were carried out, and wavelet functions from several wavelet families were tested. The results show that the SVM classifier with the db4 wavelet base function from the db wavelet family has the best fault diagnosis accuracy. Originality/value - This method provides a practical candidate for the fault diagnosis of rolling bearings in industrial applications.
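The feature-construction step can be sketched as follows. The abstract does not list which time-domain indices are used, so the four below (RMS, peak, crest factor, kurtosis) are assumptions chosen because they are common in vibration analysis. The wavelet denoising and SVM stages are not reproduced here.

```python
import math

def time_domain_features(signal):
    """Compute a few common time-domain indices of a (denoised) signal.

    The choice of indices is an assumption for illustration only.
    """
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    variance = sum((x - mean) ** 2 for x in signal) / n
    fourth_moment = sum((x - mean) ** 4 for x in signal) / n
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else 0.0,
        "kurtosis": fourth_moment / variance ** 2 if variance > 0 else 0.0,
    }

features = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

A dictionary like this, flattened into a fixed-order list, would form one feature vector per signal segment for the classifier.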
Funding: This research was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2021R1A6A1A03039493), and in part by the NRF grant funded by the Korea government (MSIT) (NRF-2022R1A2C1004401).
Abstract: Phishing is a type of cybercrime in which cyber-attackers pose as authorized persons or entities and steal the victims' sensitive data. E-mails, instant messages, and phone calls are some of the common channels used in cyberattacks. Though security models are continuously upgraded to prevent cyberattacks, hackers find innovative ways to target victims. Against this background, a drastic increase has been observed in the number of phishing emails sent to potential targets. This scenario makes it necessary to design an effective classification model. Numerous conventional models are available in the literature for the classification of phishing emails, and Machine Learning (ML) techniques and Deep Learning (DL) models have also been employed. The current study presents an Intelligent Cuckoo Search (CS) Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification (ICSOA-DLPEC) model. The aim of the proposed ICSOA-DLPEC model is to effectively distinguish emails as either legitimate or phishing. At the initial stage, pre-processing is performed in three steps: email cleaning, tokenization, and stop-word elimination. Then, the N-gram approach is applied to extract useful feature vectors. Moreover, the CS algorithm is employed with the Gated Recurrent Unit (GRU) model to detect and classify phishing emails; specifically, the CS algorithm is used to fine-tune the parameters of the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several dimensions. Extensive comparative studies were conducted, and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches. The proposed model achieved a maximum accuracy of 99.72%.
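As a rough illustration of N-gram feature extraction, the sketch below produces word-level n-grams from an already tokenized email and counts them. Whether the study uses word-level or character-level n-grams, and which value of n, is not stated in the abstract, so both are assumptions here.

```python
from collections import Counter

def word_ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_counts(tokens, n):
    """Count n-gram occurrences; the counts can serve as a sparse feature vector."""
    return Counter(word_ngrams(tokens, n))

tokens = ["click", "this", "link", "click", "this"]
print(ngram_counts(tokens, 2))
```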
Funding: National Natural Science Foundation of China, Grant/Award Number: 61972261; Basic Research Foundations of Shenzhen, Grant/Award Numbers: JCYJ20210324093609026, JCYJ20200813091134001.
Abstract: In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimization of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Exhaustive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased; and (3) statistical analysis reveals that OPCE yields better HDIC performance on the selected data sets in comparison with eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm for dealing with HDIC problems.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/180/43), and to the Taif University Researchers Supporting Project, number (TURSP-2020/346), Taif University, Taif, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4340237DSR02.
Abstract: Recently, medical data classification has become a hot research topic among healthcare professionals and research communities, as it assists in disease diagnosis and decision making. The latest developments in artificial intelligence (AI) approaches pave the way for the design of effective medical data classification models. At the same time, the large number of features in medical datasets poses a curse-of-dimensionality problem. To resolve these issues, this article introduces a novel feature subset selection with artificial intelligence based classification model for biomedical data (FSS-AICBD) technique. The FSS-AICBD technique aims to derive a useful set of features and thereby improve the classifier results. First, the FSS-AICBD technique applies min-max normalization to reduce data complexity. In addition, the information gain (IG) approach is applied for the optimal selection of feature subsets. A group search optimizer (GSO) with deep belief network (DBN) model is then utilized for biomedical data classification, where the hyperparameters of the DBN model are optimally tuned by the GSO algorithm. The choice of the IG and GSO approaches leads to promising medical data classification results. The FSS-AICBD technique was experimentally analyzed using different benchmark healthcare datasets. The simulation results reported the enhanced outcomes of the FSS-AICBD technique in terms of several measures.
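The information gain criterion used for feature selection can be sketched for discrete features as IG(Y; X) = H(Y) - H(Y | X), where H is Shannon entropy. The example below is a generic illustration; how the study discretizes continuous features or thresholds the IG scores is not stated in the abstract.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

labels = [0, 0, 1, 1]
print(information_gain([0, 0, 1, 1], labels))  # feature separates the classes
print(information_gain([0, 1, 0, 1], labels))  # feature carries no class information
```

Features would then be ranked by this score and the highest-ranked subset retained.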
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/42/43). This work was supported by the Taif University Researchers Supporting Program (project number: TURSP-2020/200), Taif University, Saudi Arabia.
Abstract: In bioinformatics applications, the examination of microarray data has received significant interest as a means of diagnosing diseases. Microarray gene expression data is characterized by a massive search space, which poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs data normalization to bring the data onto a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The use of the IFFO and SOS algorithms paves the way for achieving the best gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is carried out on benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Abstract: Due to the open nature of wireless data transmission, routing and data security pose an important research challenge in Internet of Things (IoT)-enabled networks. In addition, characteristic features such as constrained resources, heterogeneity, uncontrolled environments, and scalability requirements make the security issues even more challenging. Hence, an effective and secure routing protocol named modified Energy Harvesting Trust-aware Routing Algorithm (mod-EHTARA) is proposed to increase the energy efficiency and the lifespan of the nodes. The proposed mod-EHTARA is designed by combining the Link Lifetime (LLT) model with the traditional EHTARA. The optimal secure routing path is selected by the proposed mod-EHTARA using a cost metric that considers factors such as delay, LLT, energy, and trust. The big data classification process is carried out at the Base Station (BS) using the MapReduce framework. The big data classification is performed using a stacked autoencoder, which is trained by the Adaptive E-Bat algorithm. The Adaptive E-Bat algorithm is developed by integrating the adaptive concept with the Bat Algorithm (BA) and the Exponential Weighted Moving Average (EWMA). The proposed mod-EHTARA showed better performance by obtaining a maximal energy of 0.9855.
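For readers unfamiliar with the EWMA component, the sketch below shows the standard recurrence s_t = α·x_t + (1 − α)·s_{t−1}. It illustrates only the generic smoothing formula; how EWMA is combined with the Bat Algorithm in Adaptive E-Bat is not described in the abstract and is not reproduced here. The smoothing factor value is an arbitrary choice for the example.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average with smoothing factor alpha in (0, 1].

    The first output equals the first input (a common initialization choice).
    """
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

print(ewma([1, 2, 3], 0.5))
```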
Abstract: Autism spectrum disorder (ASD) is regarded as a neurological disorder defined by a specific set of problems associated with social skills, repetitive behavior, and communication. Identifying ASD as early as possible is beneficial because early identification permits prompt interventions in children with ASD. Recognition of ASD through objective pathogenic mutation screening is the initial step toward early intervention and efficient treatment of affected children. Nowadays, the healthcare and machine learning (ML) industries are combined to determine the existence of various diseases. This article devises a Jellyfish Search Optimization with Deep Learning Driven ASD Detection and Classification (JSODL-ASDDC) model. The goal of the JSODL-ASDDC algorithm is to identify the different stages of ASD with the help of biomedical data. The proposed JSODL-ASDDC model initially applies min-max data normalization to scale the data into a uniform range. In addition, the JSODL-ASDDC model involves a JSO based feature selection (JFSO-FS) process to choose optimal feature subsets. Moreover, a Gated Recurrent Unit (GRU) based classification model is utilized for the recognition and classification of ASD. Furthermore, a Bacterial Foraging Optimization (BFO) assisted parameter tuning process is executed to enhance the efficacy of the GRU system. The JSODL-ASDDC model is experimentally assessed on distinct datasets. The experimental outcomes highlighted the enhanced performance of the JSODL-ASDDC algorithm over recent approaches.
Funding: The Deanship of Scientific Research at King Khalid University funded this work through the Large Groups Project under Grant Number (71/43), with support from the Princess Nourah Bint Abdulrahman University Researchers Supporting Project, Number (PNURSP2022R203), Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4310373DSR29).
Abstract: With new developments in Internet of Things (IoT), wearable, and sensing technology, the value of healthcare services has increased. This evolution has brought significant changes, from conventional medicine-based healthcare to real-time observation-based healthcare. Biomedical Electrocardiogram (ECG) signals are generally utilized in the examination and diagnosis of Cardiovascular Diseases (CVDs) since ECG recording is quick and non-invasive. With the increasing number of patients in recent years, classifier efficiency is reduced by the high variance observed in ECG signal patterns obtained from patients. In such a scenario, computer-assisted automated diagnostic tools are important for the classification of ECG signals. The current study devises an Improved Bat Algorithm with Deep Learning Based Biomedical ECG Signal Classification (IBADL-BECGC) approach. To accomplish this, the proposed IBADL-BECGC model initially pre-processes the input signals. The IBADL-BECGC model then applies the NasNet model to derive features from test ECG signals. In addition, the Improved Bat Algorithm (IBA) is employed to optimally fine-tune the hyperparameters of the NasNet approach. Finally, the Extreme Learning Machine (ELM) classification algorithm is executed to perform ECG classification. The presented IBADL-BECGC model was experimentally validated using a benchmark dataset. The comparison study established the improved performance of the IBADL-BECGC model over other existing methodologies, with the former achieving a maximum accuracy of 97.49%.
Funding: This research was supported by the Researchers Supporting Program (TUMA-Project2021-27), Almaarefa University, Riyadh, Saudi Arabia, and by the Taif University Researchers Supporting Project, number (TURSP-2020/161), Taif University, Taif, Saudi Arabia.
Abstract: Fake news can affect diverse aspects of diverse entities, ranging from city lifestyles to a country's global relations, and various methods are available to collect and identify fake news. Recently developed machine learning (ML) models can be employed for the detection and classification of fake news. This study designs a novel Chaotic Ant Swarm with Weighted Extreme Learning Machine (CAS-WELM) for Cybersecurity Fake News Detection and Classification. The goal of the CAS-WELM technique is to classify news as fake or real. The CAS-WELM technique initially pre-processes the input data, and the GloVe technique is used for the word embedding process. Then, an N-gram based feature extraction technique is applied to generate feature vectors. Lastly, the WELM model is applied for the detection and classification of fake news, with the weight values of the WELM model optimally adjusted by the CAS algorithm. The performance of the CAS-WELM technique is validated using a benchmark dataset, and the results are inspected under several dimensions. The experimental results reported the enhanced outcomes of the CAS-WELM technique over recent approaches.
Funding: This research was supported by the Researchers Supporting Program (TUMA-Project2021-27), Almaarefa University, Riyadh, Saudi Arabia.
Abstract: Recently, developments in Internet and cloud technologies have resulted in a considerable rise in the use of online media in day-to-day life. This rise has led to illegal access to users' private data and to its compromise. Phishing is a popular attack that tricks the user into accessing malicious content in order to obtain the user's data. Proper identification of phishing emails is an essential process in the domain of cybersecurity. This article focuses on the design of a bio-geography based optimization with deep learning for Phishing Email detection and classification (BBODL-PEDC) model. The main aim of the BBODL-PEDC model is to distinguish legitimate emails from phishing emails. The BBODL-PEDC model initially performs data pre-processing in three steps: email cleaning, tokenization, and stop word elimination. In addition, the TF-IDF model is applied to extract useful feature vectors. Moreover, an optimal deep belief network (DBN) model is used for email classification, and its efficacy is boosted by a BBO based hyperparameter tuning process. The performance of the BBODL-PEDC model is validated using a benchmark dataset, and the results are assessed under several dimensions. Extensive comparative studies reported the superior outcomes of the BBODL-PEDC model over recent approaches.
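The pre-processing and TF-IDF stages can be sketched with the standard library alone. The stop-word list, the regular-expression tokenizer, and the particular TF-IDF variant (term frequency normalized by document length, idf = ln(N / df) with no smoothing) are assumptions for illustration; the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the study's list).
STOP_WORDS = {"the", "a", "to", "your", "is", "and", "for"}

def preprocess(text):
    """Lowercase, tokenize on letter runs, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors

emails = ["Verify your account now", "Meeting notes for the account team"]
for vector in tf_idf(emails):
    print(vector)
```

Terms that appear in every email (here "account") receive zero weight, while terms specific to one email are weighted up.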
Funding: We acknowledge Taif University for supporting this study through the Taif University Researchers Supporting Project, number (TURSP-2020/173), Taif University, Taif, Saudi Arabia.
Abstract: Sentiment Analysis (SA) is a subfield of Natural Language Processing (NLP) that focuses on the identification and extraction of opinions in text from reviews, social media, blogs, news, and so on. SA can handle the drastically increasing volume of unstructured text by transforming it into structured data with the help of NLP and open source tools. The current research work designs a novel Modified Red Deer Algorithm (MRDA) Extreme Learning Machine Sparse Autoencoder (ELMSAE) model for SA and classification. The proposed MRDA-ELMSAE technique initially performs pre-processing to transform the data into a compatible format. Moreover, a TF-IDF vectorizer is employed for feature extraction, while the ELMSAE model is applied for sentiment classification. Furthermore, optimal parameter tuning of the ELMSAE model is performed using the MRDA technique. A wide range of simulation analyses was carried out, and the results of the comparative analysis establish the enhanced efficiency of the MRDA-ELMSAE technique over other recent techniques.