57 articles found
An Imbalanced Data Classification Method Based on Hybrid Resampling and Fine Cost Sensitive Support Vector Machine (cited by: 1)
1
Authors: Bo Zhu, Xiaona Jing, Lan Qiu, Runbo Li. 《Computers, Materials & Continua》 SCIE EI, 2024, Issue 6, pp. 3977-3999 (23 pages)
When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm accuracy on the minority class (usually defined as the positive class) and lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated using the Mahalanobis distance between samples. On this basis, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with generated new positive samples one by one to clear up the inter-class overlap on the borderline, without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification simultaneously, and finely tunes the class cost weights using the RIME algorithm (an efficient optimization algorithm based on the physical phenomenon of rime-ice), with cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments are carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
Keywords: Imbalanced data classification Silhouette value Mahalanobis distance RIME algorithm CS-SVM
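To make the screening step in the abstract above concrete, here is a minimal sketch of a per-sample Silhouette value and a linear interpolation helper. It is not the authors' implementation: Euclidean distance stands in for the Mahalanobis distance, the "negative Silhouette means pseudo-negative" rule and the choice of interpolation endpoints are assumptions, and all names and data are hypothetical.

```python
import math

def euclidean(p, q):
    """Stand-in distance (assumption); the abstract uses Mahalanobis distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def silhouette(sample, own_class, other_class, dist=euclidean):
    """Silhouette value s = (b - a) / max(a, b) for one sample.

    a: mean distance to the other members of the sample's own class
    b: mean distance to the members of the opposite class
    """
    peers = [p for p in own_class if p is not sample]
    a = sum(dist(sample, p) for p in peers) / len(peers)
    b = sum(dist(sample, q) for q in other_class) / len(other_class)
    return (b - a) / max(a, b)

def interpolate(p, q, t):
    """Point on the segment from p to q, with 0 <= t <= 1."""
    return [a + t * (b - a) for a, b in zip(p, q)]

# Hypothetical toy data: one negative sample sits inside the positive region.
negatives = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [4.0, 4.0]]
positives = [[4.0, 5.0], [5.0, 4.0], [5.0, 5.0]]

scores = [silhouette(n, negatives, positives) for n in negatives]
# Assumed rule: a negative sample with a negative Silhouette is "pseudo-negative".
pseudo = [n for n, s in zip(negatives, scores) if s < 0]
# Assumed replacement: one interpolated positive per removed pseudo-negative,
# so the dataset keeps its overall size.
new_positive = interpolate(positives[0], positives[1], 0.5)
```

Passing a different `dist` function (for example a Mahalanobis distance built from an estimated covariance matrix) would bring the sketch closer to the description in the abstract.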
A novel method for clustering cellular data to improve classification
2
Authors: Diek W. Wheeler, Giorgio A. Ascoli. 《Neural Regeneration Research》 SCIE CAS, 2025, Issue 9, pp. 2697-2705 (9 pages)
Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
Keywords: cellular data clustering dendrogram data classification Levene's one-tailed statistical test unsupervised hierarchical clustering
Chimp Optimization Algorithm Based Feature Selection with Machine Learning for Medical Data Classification
3
Authors: Firas Abedi, Hayder M.A. Ghanimi, Abeer D. Algarni, Naglaa F. Soliman, Walid El-Shafai, Ali Hashim Abbas, Zahraa H. Kareem, Hussein Muhi Hariz, Ahmed Alkhayyat. 《Computer Systems Science & Engineering》 SCIE EI, 2023, Issue 12, pp. 2791-2814 (24 pages)
Data mining plays a crucial role in extracting meaningful knowledge from large-scale data repositories, such as data warehouses and databases. Association rule mining, a fundamental process in data mining, involves discovering correlations, patterns, and causal structures within datasets. In the healthcare domain, association rules offer valuable opportunities for building knowledge bases, enabling intelligent diagnoses, and extracting invaluable information rapidly. This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System (MLARMC-HDMS). The MLARMC-HDMS technique integrates classification and association rule mining (ARM) processes. Initially, the chimp optimization algorithm-based feature selection (COAFS) technique is employed within MLARMC-HDMS to select relevant attributes. Inspired by the foraging behavior of chimpanzees, the COA mimics their search strategy for food. Subsequently, the classification process utilizes stochastic gradient descent with a multilayer perceptron (SGD-MLP) model, while the Apriori algorithm determines attribute relationships. We propose a COA-based feature selection approach for medical data classification using machine learning techniques. This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set. We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers. Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods, achieving higher accuracy and precision rates in medical data classification tasks. The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features, thereby enhancing the diagnosis and treatment of various diseases. To provide further validation, we conduct detailed experiments on a benchmark medical dataset, revealing the superiority of the MLARMC-HDMS model over other methods, with a maximum accuracy of 99.75%. Therefore, this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis. The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.
Keywords: Association rule mining data classification healthcare data machine learning parameter tuning data mining feature selection MLARMC-HDMS COA stochastic gradient descent Apriori algorithm
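The abstract above relies on the Apriori algorithm to determine attribute relationships. The following is a minimal, generic sketch of Apriori-style frequent itemset search with support and confidence, not the MLARMC-HDMS pipeline itself; the transactions, item names, and the 0.5 support threshold are hypothetical.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: (k+1)-item candidates are built only from frequent k-itemsets."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support([i], transactions) >= min_support]
    result = {}
    while level:
        for s in level:
            result[s] = support(s, transactions)
        size = len(level[0]) + 1
        # Join step: unions of two frequent k-itemsets that have exactly k+1 items.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        level = [c for c in candidates if support(c, transactions) >= min_support]
    return result

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Hypothetical symptom records, one set per patient.
transactions = [
    {"fever", "cough"},
    {"fever", "cough", "fatigue"},
    {"fever"},
    {"cough", "fatigue"},
]
freq = frequent_itemsets(transactions, min_support=0.5)
rule_conf = confidence({"fatigue"}, {"cough"}, transactions)
```

In this toy data every record with `fatigue` also contains `cough`, so the rule `fatigue -> cough` has full confidence, while `{fever, fatigue}` falls below the support threshold and is never extended.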
Energy Aware Clustering with Medical Data Classification Model in IoT Environment
4
Authors: R. Bharathi, T. Abirami. 《Computer Systems Science & Engineering》 SCIE EI, 2023, Issue 1, pp. 797-811 (15 pages)
With the exponential development of wireless networking and the inexpensive Internet of Things (IoT), a wide range of applications has been designed to attain enhanced services. Due to the limited energy capacity of IoT devices, energy-aware clustering techniques are highly preferable. At the same time, artificial intelligence (AI) techniques can be applied to perform appropriate disease diagnostic processes. With this motivation, this study designs a novel squirrel search algorithm-based energy-aware clustering with medical data classification (SSAC-MDC) model in an IoT environment. The goal of the SSAC-MDC technique is to attain maximum energy efficiency and disease diagnosis in the IoT environment. The proposed SSAC-MDC technique involves the design of the squirrel search algorithm-based clustering (SSAC) technique to choose the proper set of cluster heads (CHs) and construct clusters. In addition, the medical data classification process involves three subprocesses, namely pre-processing, autoencoder (AE) based classification, and improved beetle antenna search (IBAS) based parameter tuning. The design of the SSAC technique and the IBAS-based parameter optimization process constitute the novelty of the work. To showcase the improved performance of the SSAC-MDC technique, a series of experiments were performed, and the comparative results highlighted the superiority of the SSAC-MDC technique over recent methods.
Keywords: Internet of things healthcare medical data classification energy efficiency clustering autoencoder
Metaheuristic Based Clustering with Deep Learning Model for Big Data Classification
5
Authors: R. Krishnaswamy, Kamalraj Subramaniam, V. Nandini, K. Vijayalakshmi, Seifedine Kadry, Yunyoung Nam. 《Computer Systems Science & Engineering》 SCIE EI, 2023, Issue 1, pp. 391-406 (16 pages)
Recently, a massive quantity of data is being produced from a number of distinct sources, and the amount of data created daily on the Internet has crossed two exabytes. At the same time, clustering is one of the efficient techniques for mining big data to extract the useful and hidden patterns that exist in it. Density-based clustering techniques have gained significant attention because they help to effectively recognize complex patterns in spatial datasets. Big data clustering is a non-trivial process owing to the increasing quantity of data, a problem that can be addressed with the MapReduce tool. With this motivation, this paper presents an efficient MapReduce based hybrid density based clustering and classification algorithm for big data analytics (MR-HDBCC). The proposed MR-HDBCC technique is executed on the MapReduce tool for handling big data. In addition, the MR-HDBCC technique involves three distinct processes, namely pre-processing, clustering, and classification. The proposed model utilizes the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique, which is capable of detecting random shapes and diverse clusters with noisy data. To improve the performance of the DBSCAN technique, a hybrid model using the cockroach swarm optimization (CSO) algorithm is developed to explore the search space and determine the optimal parameters for density based clustering. Finally, a bidirectional gated recurrent neural network (BGRNN) is employed for the classification of big data. The experimental validation of the proposed MR-HDBCC technique uses a benchmark dataset, and the simulation outcomes demonstrate the promising performance of the proposed model in terms of different measures.
Keywords: Big data data classification clustering MapReduce DBSCAN algorithm
Cost-Sensitive Dual-Stream Residual Networks for Imbalanced Classification
6
Authors: Congcong Ma, Jiaqi Mi, Wanlin Gao, Sha Tao. 《Computers, Materials & Continua》 SCIE EI, 2024, Issue 9, pp. 4243-4261 (19 pages)
Imbalanced data classification is the task of classifying datasets where there is a significant disparity in the number of samples between different classes. This task is prevalent in practical scenarios such as industrial fault diagnosis, network intrusion detection, and cancer detection. In imbalanced classification tasks, the focus is typically on achieving high recognition accuracy for the minority class. However, due to the challenges presented by imbalanced multi-class datasets, such as the scarcity of samples in minority classes and complex inter-class relationships with overlapping boundaries, existing methods often do not perform well in multi-class imbalanced data classification tasks, particularly in terms of recognizing minority classes with high accuracy. Therefore, this paper proposes a multi-class imbalanced data classification method called CSDSResNet, which is based on a cost-sensitive dual-stream residual network. Firstly, to address the issue of limited samples in the minority class within imbalanced datasets, a dual-stream residual network backbone structure is designed to enhance the model's feature extraction capability. Next, considering the complexities arising from imbalanced inter-class sample quantities and imbalanced inter-class overlapping boundaries in multi-class imbalanced datasets, a unique cost-sensitive loss function is devised. This loss function places more emphasis on the minority class and the challenging classes with high inter-class similarity, thereby improving the model's classification ability. Finally, the effectiveness and generalization of the proposed method, CSDSResNet, are evaluated on two datasets: 'DryBeans' and 'Electric Motor Defects'. The experimental results demonstrate that CSDSResNet achieves the best performance on imbalanced datasets, with macro_F1-score values improving by 2.9% and 1.9% on the two datasets, respectively, compared to current state-of-the-art classification methods. Furthermore, it achieves the highest precision in single-class recognition tasks for the minority class.
Keywords: Deep learning imbalanced data classification fault diagnosis cost-sensitivity
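The abstract above does not spell out its cost-sensitive loss function. As a generic illustration of the idea only, the sketch below uses class-weighted cross-entropy with inverse-frequency weights, a common baseline that is swapped in here and is not the CSDSResNet loss; the labels and probabilities are hypothetical.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Mean of -w[y] * log(p[y]) over the samples.

    probs[i][c] is the predicted probability of class c for sample i.
    """
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

# Hypothetical 8:2 imbalance between class 0 (majority) and class 1 (minority).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels)

# The same uncertain prediction costs more on a minority sample.
minority_loss = weighted_cross_entropy([[0.5, 0.5]], [1], weights)
majority_loss = weighted_cross_entropy([[0.5, 0.5]], [0], weights)
```

With these weights an error on a minority sample is penalized four times as heavily as the same error on a majority sample, which is the basic effect any cost-sensitive loss aims for.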
Data classification method based on network dynamics analysis and cloud model
7
Authors: 王肖霞, 杨风暴, 梁若飞, 张文华. 《Journal of Measurement Science and Instrumentation》 CAS CSCD, 2016, Issue 3, pp. 266-271 (6 pages)
In order to reduce the amount of data storage and improve the processing capacity of the system, this paper proposes a new classification method for data sources that combines the phase synchronization model in network clustering with the cloud model. Firstly, the data source is treated as a complex network; after the topology of the network is obtained, the cloud model of each node's data is determined by the fuzzy analytic hierarchy process (AHP). Secondly, by calculating the expectation, entropy and hyper entropy of the cloud model, the comprehensive coupling strength is obtained and then used as the edge weight of the topology. Finally, the distribution curve is obtained by iterating the phase of each node by means of the phase synchronization model, which completes the classification of the data source. This method not only provides convenience for the storage, cleaning and compression of data, but also improves the efficiency of data analysis.
Keywords: data classification complex network phase synchronization cloud model
Signal classification method based on data mining for multi-mode radar (cited by: 9)
8
Authors: Qiang Guo, Pulong Nan, Jian Wan. 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD, 2016, Issue 5, pp. 1010-1017 (8 pages)
For a multi-mode radar working in the modern electronic battlefield, different working states of a single radar are prone to being classified as multiple emitters when traditional classification methods are adopted to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address this challenge. Inspired by the idea of spatial data mining, the classification method applies a nuclear field to depict the distribution information of pulse samples in feature space, and digs out the hidden cluster information by analyzing distribution characteristics. In addition, a membership-degree criterion to quantify the correlation among all classes is established, which ensures the classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of a multi-mode emitter from being classified as several emitters, and achieves higher classification accuracy.
Keywords: multi-mode radar signal classification data mining nuclear field cloud model membership
Application system and data description of the China Seismo-Electromagnetic Satellite (cited by: 11)
9
Authors: JianPing Huang, XuHui Shen, XueMin Zhang, HengXin Lu, Qiao Tan, Qiao Wang, Rui Yan, Wei Chu, YanYan Yang, DaPeng Liu, Song Xu. 《Earth and Planetary Physics》, 2018, Issue 6, pp. 444-454 (11 pages)
The China Seismo-Electromagnetic Satellite, launched into orbit from Jiuquan Satellite Launch Centre on February 2nd, 2018, is China's first space satellite dedicated to geophysical exploration. The satellite carries eight scientific payloads, including high-precision magnetometers, to detect electromagnetic changes in space, in particular changes associated with global earthquake disasters. In order to encourage and facilitate use by geophysical scientists of data from the satellite's payloads, this paper introduces the application systems developed for the China Seismo-Electromagnetic Satellite by the Institute of Crustal Dynamics, China Earthquake Administration; these include platform construction, data classification, data storage, data format, and data access and acquisition.
Keywords: China Seismo-Electromagnetic Satellite application system geophysical field data classification
The materials data ecosystem: Materials data science and its role in data-driven materials discovery (cited by: 2)
10
Authors: Hai-Qing Yin, Xue Jiang, Guo-Quan Liu, Sharon Elder, Bin Xu, Qing-Jun Zheng, Xuan-Hui Qu. 《Chinese Physics B》 SCIE EI CAS CSCD, 2018, Issue 11, pp. 120-125 (6 pages)
Since its launch in 2011, the Materials Genome Initiative (MGI) has drawn the attention of researchers from academia, government, and industry worldwide. As one of the three tools of the MGI, the use of materials data has, for the first time, emerged as an extremely significant approach in materials discovery. Data science has been applied in different disciplines as an interdisciplinary field to extract knowledge from data. The concept of materials data science has been utilized to demonstrate its application in materials science. To explore its potential as an active research branch in the big data era, a three-tier system has been put forward to define the infrastructure for the classification, curation and knowledge extraction of materials data.
Keywords: Materials Genome Initiative materials data science data classification life-cycle curation
Data Flow & Transaction Mode Classification and An Explorative Estimation on Data Storage & Transaction Volume (cited by: 3)
11
Authors: Cai Yuezhou, Liu Yuexin. 《China Economist》, 2022, Issue 6, pp. 78-112 (35 pages)
The public has shown great interest in the data factor and data transactions, but current attention is overly focused on personal behavioral data and transactions happening at Data Exchanges. To deliver a complete picture of data flow and transaction, this paper presents a systematic overview of the flow and transaction of personal, corporate and public data on the basis of data factor classification from various perspectives. By utilizing various sources of information, this paper estimates the volume of data generation & storage and the volume & trend of data market transactions for major economies in the world, with the following findings: (i) Data classification is diverse due to a broad variety of application scenarios, and data transaction and profit distribution are complex due to heterogeneous entities, ownerships, information density and other attributes of different data types. (ii) Global data transaction exhibits the characteristics of productization, servitization and a platform-based mode. (iii) For major economies, there is a commonly observed disequilibrium between data generation scale and storage scale, which is particularly striking for China. (iv) The global data market is in a nascent stage of rapid development with a transaction volume of about 100 billion US dollars, and China's data market is even more underdeveloped and only accounts for some 10% of the world total. All sectors of society should be fully aware of the diversity and complexity of data factor classification and data transactions, as well as the arduous and long-term nature of developing and improving relevant institutional systems. Adapting to such features, efforts should be made to improve data classification, enhance computing infrastructure development, foster professional data transaction and development institutions, and perfect the data governance system.
Keywords: data factor data classification data transaction mode data generation & storage volume data transaction volume
A Method for Data Classification Based on Discernibility Matrix and Discernibility Function (cited by: 1)
12
Authors: SUN Shi-bao, QIN Ke-yun. 《Wuhan University Journal of Natural Sciences》 EI CAS, 2006, Issue 1, pp. 230-233 (4 pages)
The method used for data classification influences the efficiency of classification. Attribute reduction based on the discernibility matrix and discernibility function in rough sets can be used in data classification, so we put forward a method for data classification. Firstly, we use the discernibility matrix and discernibility function to delete superfluous attributes in an information system and obtain a necessary attribute set. Secondly, we delete superfluous attribute values and obtain decision rules. Finally, we classify data by means of the decision rules. The experiments show that data classification using this method is simpler in structure and can improve the efficiency of classification.
Keywords: discernibility matrix discernibility function attributes reduction data classification
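The first step described in the abstract above can be sketched with the standard decision-relative discernibility matrix from rough set theory. This is a textbook-style illustration rather than the authors' code; the toy decision table and attribute names are hypothetical.

```python
from itertools import combinations

def discernibility_matrix(objects, condition_attrs, decision_attr):
    """For each pair of objects with different decisions, collect the
    condition attributes on which the two objects differ."""
    matrix = {}
    for (i, x), (j, y) in combinations(enumerate(objects), 2):
        if x[decision_attr] != y[decision_attr]:
            matrix[(i, j)] = {a for a in condition_attrs if x[a] != y[a]}
    return matrix

def core(matrix):
    """Attributes that occur as singleton entries. Removing one of them
    would make some pair of objects indistinguishable."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}

# Hypothetical decision table: two condition attributes, one decision.
table = [
    {"headache": "yes", "temp": "high",   "flu": "yes"},
    {"headache": "yes", "temp": "normal", "flu": "no"},
    {"headache": "no",  "temp": "high",   "flu": "yes"},
    {"headache": "no",  "temp": "normal", "flu": "no"},
]
matrix = discernibility_matrix(table, ["headache", "temp"], "flu")
necessary = core(matrix)
```

In this table `temp` alone separates every pair of objects with different decisions, so `headache` is superfluous and the decision rules reduce to `temp = high -> flu = yes` and `temp = normal -> flu = no`.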
THRFuzzy: Tangential holoentropy-enabled rough fuzzy classifier to classification of evolving data streams (cited by: 1)
13
Authors: Jagannath E. Nalavade, T. Senthil Murugan. 《Journal of Central South University》 SCIE EI CAS CSCD, 2017, Issue 8, pp. 1789-1800 (12 pages)
Rapid developments in telecommunication, sensor data, financial applications, analysis of data streams, and so on increase the rate of data arrival, for which data mining is considered a vital process. The data analysis process consists of different tasks, among which data stream classification faces more challenges than the other commonly used techniques. Even though classification is a continuous process, it requires a design that can adapt the classification model to concept change, that is, change in the boundary between the classes. Hence, we design a novel fuzzy classifier known as THRFuzzy to classify new incoming data streams. Rough set theory, along with the tangential holoentropy function, helps in designing the dynamic classification model. The classification approach uses kernel fuzzy c-means (FCM) clustering to generate the rules and the tangential holoentropy function to update the membership function. The performance of the proposed THRFuzzy method is verified using three datasets, namely the skin segmentation, localization, and breast cancer datasets, with accuracy and time as the evaluation metrics, and is compared with that of the HRFuzzy and adaptive k-NN classifiers. The experimental results show that the THRFuzzy classifier gives better classification results, providing maximum accuracy while consuming less time than the existing classifiers.
Keywords: data stream classification fuzzy rough set tangential holoentropy concept change
Intelligent Deep Learning Based Cybersecurity Phishing Email Detection and Classification (cited by: 1)
14
Authors: R. Brindha, S. Nandagopal, H. Azath, V. Sathana, Gyanendra Prasad Joshi, Sung Won Kim. 《Computers, Materials & Continua》 SCIE EI, 2023, Issue 3, pp. 5901-5914 (14 pages)
Phishing is a type of cybercrime in which cyber-attackers pose as authorized persons or entities and steal the victims' sensitive data. E-mails, instant messages and phone calls are some of the common modes used in cyberattacks. Though security models are continuously upgraded to prevent cyberattacks, hackers find innovative ways to target victims. Against this background, a drastic increase has been observed in the number of phishing emails sent to potential targets. This scenario makes it necessary to design an effective classification model. Numerous conventional models are available in the literature for the classification of phishing emails, and Machine Learning (ML) techniques and Deep Learning (DL) models have also been employed. The current study presents an Intelligent Cuckoo Search (CS) Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification (ICSOA-DLPEC) model. The aim of the proposed ICSOA-DLPEC model is to effectively distinguish emails as either legitimate or phishing. At the initial stage, pre-processing is performed in three stages: email cleaning, tokenization and stop-word elimination. Then, the N-gram approach is applied to extract the useful feature vectors. Moreover, the CS algorithm is employed with the Gated Recurrent Unit (GRU) model to detect and classify phishing emails. Furthermore, the CS algorithm is used to fine-tune the parameters involved in the GRU model. The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset, and the results were assessed under several dimensions. Extensive comparative studies were conducted, and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches. The proposed model achieved a maximum accuracy of 99.72%.
Keywords: Phishing email data classification natural language processing deep learning cybersecurity
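The three pre-processing stages and the N-gram step named in the abstract above can be sketched in plain Python. This is a minimal illustration under assumptions: the cleaning rules, the tiny stop-word list, and the sample email are hypothetical and are not taken from the ICSOA-DLPEC paper.

```python
import re

# Tiny illustrative stop-word list (assumption); real pipelines use larger ones.
STOP_WORDS = {"the", "a", "an", "to", "at", "your", "is", "and", "of"}

def clean(text):
    """Lower-case, drop HTML tags and URLs, keep only letters, digits, spaces."""
    text = re.sub(r"<[^>]+>", " ", text.lower())
    text = re.sub(r"https?://\S+", " ", text)
    return re.sub(r"[^a-z0-9\s]", " ", text)

def tokenize(text):
    """Split on whitespace."""
    return text.split()

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """All runs of n consecutive tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

email = "<p>Verify your account NOW at http://example.test/login to avoid suspension!</p>"
tokens = remove_stop_words(tokenize(clean(email)))
bigrams = ngrams(tokens, 2)
```

The resulting N-grams would typically be counted or hashed into a feature vector before being passed to a classifier such as the GRU mentioned in the abstract.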
Observation points classifier ensemble for high-dimensional imbalanced classification (cited by: 1)
15
Authors: Yulin He, Xu Li, Philippe Fournier-Viger, Joshua Zhexue Huang, Mianjie Li, Salman Salloum. 《CAAI Transactions on Intelligence Technology》 SCIE EI, 2023, Issue 2, pp. 500-517 (18 pages)
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimization of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Exhaustive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased; and (3) statistical analysis reveals that OPCE yields better HDIC performance on the selected data sets in comparison with eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm to deal with HDIC problems.
Keywords: classifier ensemble feature transformation high-dimensional data classification imbalanced learning observation point mechanism
IWD-Miner: A Novel Metaheuristic Algorithm for Medical Data Classification
16
Authors: Sarab AlMuhaideb, Reem BinGhannam, Nourah Alhelal, Shatha Alduheshi, Fatimah Alkhamees, Raghad Alsuhaibani. Computers, Materials & Continua (SCIE, EI), 2021, Issue 2, pp. 1329-1346 (18 pages)
Medical data classification (MDC) refers to the application of classification methods on medical datasets. This work focuses on applying a classification task to medical datasets related to specific diseases in order to predict the associated diagnosis or prognosis. To gain experts’ trust, the prediction and the reasoning behind it are equally important. Accordingly, we confine our research to learning rule-based models because they are transparent and comprehensible. One approach to MDC involves the use of metaheuristic (MH) algorithms. Here we report on the development and testing of a novel MH algorithm: IWD-Miner. This algorithm can be viewed as a fusion of Intelligent Water Drops (IWDs) and AntMiner+. It was subjected to a four-stage sensitivity analysis to optimize its performance. For this purpose, 21 publicly available medical datasets were used from the Machine Learning Repository at the University of California Irvine. Interestingly, there were only limited differences in performance between IWD-Miner variants, which is suggestive of its robustness. Finally, using the same 21 datasets, we compared the performance of the optimized IWD-Miner against two extant algorithms, AntMiner+ and J48. The experiments showed that both rival algorithms are comparable in effectiveness to IWD-Miner, as confirmed by the Wilcoxon nonparametric statistical test. Results suggest that IWD-Miner is more efficient than AntMiner+ as measured by the average number of fitness evaluations to a solution (1,386,621.30 vs. 2,827,283.88 fitness evaluations, respectively). J48 exhibited higher accuracy on average than IWD-Miner (79.58 vs. 73.65, respectively) but produced larger models (32.82 leaves vs. 8.38 terms, respectively).
Keywords: Ant colony optimization AntMiner+ IWDs IWD-Miner J48 medical data classification metaheuristic algorithms swarm intelligence
Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
17
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data is characterized by a massive search space, which poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to normalize the data into a uniform scale. Besides, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. For examining the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: Bioinformatics data science microarray gene expression data classification deep learning metaheuristics
Fuzzy Logic with Archimedes Optimization Based Biomedical Data Classification Model
18
Authors: Mahmoud Ragab, Diaa Hamed. Computers, Materials & Continua (SCIE, EI), 2022, Issue 8, pp. 4185-4200 (16 pages)
Medical data classification has become a hot research topic in the healthcare sector, aiding physicians in decision making. Besides, advances in machine learning (ML) techniques help to perform the classification task effectively. With this motivation, this paper presents a Fuzzy Clustering Approach Based on Breadth-first Search Algorithm (FCA-BFS) with optimal support vector machine (OSVM) model, named FCABFS-OSVM, for medical data classification. The proposed FCABFS-OSVM technique intends to classify healthcare data by the use of clustering and classification models. Besides, the proposed FCABFS-OSVM technique involves the design of the FCABFS technique to cluster the medical data, which helps to boost the classification performance. Moreover, the OSVM model investigates the clustered medical data to perform the classification process. Furthermore, the Archimedes optimization algorithm (AOA) is utilized to tune the SVM parameters and boost the medical data classification results. A wide range of simulations takes place to highlight the promising performance of the FCABFS-OSVM technique. Extensive comparison studies reported the enhanced outcomes of the FCABFS-OSVM technique over recent state-of-the-art approaches.
Keywords: Clustering medical data classification machine learning parameter tuning support vector machines
Manta Ray Foraging Optimization with Machine Learning Based Biomedical Data Classification
19
Authors: Amal Al-Rasheed, Jaber S. Alzahrani, Majdy M. Eltahir, Abdullah Mohamed, Anwer Mustafa Hilal, Abdelwahed Motwakel, Abu Sarwar Zamani, Mohamed I. Eldesouki. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 3275-3290 (16 pages)
The biomedical data classification process has received significant attention in recent times due to a massive increase in the generation of healthcare data from various sources. Developments in artificial intelligence (AI) and machine learning (ML) models assist in the effective design of medical data classification models. Therefore, this article concentrates on the development of an optimal Stacked Long Short Term Memory Sequence-to-Sequence Autoencoder (OSAE-LSTM) model for biomedical data classification. The presented OSAE-LSTM model intends to classify biomedical data for the existence of diseases. First, the OSAE-LSTM model applies min-max normalization based pre-processing to scale the data into a uniform format. Next, the SAE-LSTM model is utilized for the detection and classification of diseases in biomedical data. Finally, the manta ray foraging optimization (MRFO) algorithm is employed for the hyperparameter optimization process. The MRFO algorithm assists in the optimal selection of the hyperparameters involved in the SAE-LSTM model. The OSAE-LSTM model was tested in simulation on a set of benchmark medical datasets, and the results showed improvements of the OSAE-LSTM model over the other approaches under several dimensions.
Keywords: Biomedical data classification deep learning manta ray foraging optimization healthcare machine learning artificial intelligence
Feature Subset Selection with Artificial Intelligence-Based Classification Model for Biomedical Data
20
Authors: Jaber S. Alzahrani, Reem M. Alshehri, Mohammad Alamgeer, Anwer Mustafa Hilal, Abdelwahed Motwakel, Ishfaq Yaseen. Computers, Materials & Continua (SCIE, EI), 2022, Issue 9, pp. 4267-4281 (15 pages)
Recently, medical data classification has become a hot research topic among healthcare professionals and research communities, as it assists in the disease diagnosis and decision making process. The latest developments in artificial intelligence (AI) approaches pave the way for the design of effective medical data classification models. At the same time, the existence of numerous features in a medical dataset poses a curse of dimensionality problem. To resolve these issues, this article introduces a novel feature subset selection with artificial intelligence based classification model for biomedical data (FSS-AICBD) technique. The FSS-AICBD technique intends to derive a useful set of features and thereby improve the classifier results. First, the FSS-AICBD technique applies min-max normalization to prevent data complexity. In addition, the information gain (IG) approach is applied for the optimal selection of feature subsets. Also, a group search optimizer (GSO) with deep belief network (DBN) model is utilized for biomedical data classification, where the hyperparameters of the DBN model are optimally tuned by the GSO algorithm. The choice of the IG and GSO approaches leads to promising medical data classification results. The experimental analysis of the FSS-AICBD technique is carried out on different benchmark healthcare datasets. The simulation results reported the enhanced outcomes of the FSS-AICBD technique in terms of several measures.
Keywords: Medical data classification feature selection deep learning healthcare sector artificial intelligence
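Note: the abstract above names two pre-processing steps, min-max normalization and information gain (IG) feature ranking, without giving details. The sketch below is one minimal way these two steps could look, not the authors' implementation. The function names, the toy data, and the fixed split threshold of 0.5 on the scaled feature are all assumptions made for illustration; the GSO-tuned DBN classifier is not shown.

```python
import math
from collections import Counter


def min_max_normalize(column):
    """Scale a list of numbers linearly into [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature, labels, threshold=0.5):
    """Entropy reduction from splitting on feature <= threshold.

    The fixed threshold is an illustrative assumption, not from the paper.
    """
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


# Hypothetical toy data: two features, four samples, binary labels.
labels = [0, 0, 1, 1]
features = {
    "marker_a": [10.0, 12.0, 30.0, 34.0],  # separates the two classes
    "marker_b": [5.0, 9.0, 5.0, 9.0],      # carries no class information
}

scaled = {name: min_max_normalize(col) for name, col in features.items()}
gains = {name: information_gain(col, labels) for name, col in scaled.items()}
ranked = sorted(gains, key=gains.get, reverse=True)  # most informative first
```

In this toy setting `ranked` places `marker_a` ahead of `marker_b`; a feature-selection step would keep the top-ranked features and pass them to the classifier.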