30 articles found
Class Imbalanced Problem: Taxonomy, Open Challenges, Applications and State-of-the-Art Solutions
1
Authors: Khursheed Ahmad Bhat, Shabir Ahmad Sofi. China Communications (SCIE, CSCD), 2024, Issue 11, pp. 216-242, 27 pages.
Machine learning can unlock new applications across many disciplines. Several limitations still restrict its expressiveness, and researchers are working to overcome them to fully exploit the power of data-driven machine learning (ML) and deep learning (DL) techniques. Data imbalance presents a major hurdle for classification and prediction problems in machine learning, restricting data analytics and the acquisition of relevant insights in practically all real-world research domains. In visual learning, network information security, failure prediction, digital marketing, healthcare, and a variety of other domains, raw data suffers from a biased distribution of one class over the other. This article presents a taxonomy of approaches for handling imbalanced data problems and a comparative study of them in terms of classification metrics and application areas. We explore very recent trends in techniques for solving class imbalance problems in datasets and discuss their limitations. The article also identifies open challenges for further research on class imbalance.
Keywords: class imbalance; classification; deep learning; GANs; sampling
Combined Effect of Concept Drift and Class Imbalance on Model Performance During Stream Classification
2
Authors: Abdul Sattar Palli, Jafreezal Jaafar, Manzoor Ahmed Hashmani, Heitor Murilo Gomes, Aeshah Alsughayyir, Abdul Rehman Gilal. Computers, Materials & Continua (SCIE, EI), 2023, Issue 4, pp. 1827-1845, 19 pages.
Every application in a smart city environment, such as the smart grid, health monitoring, security, and surveillance, generates non-stationary data streams. Because of this, the statistical properties of the data change over time, leading to class imbalance and concept drift. Both issues degrade model performance. Most current work has focused on developing an ensemble strategy that trains a new classifier on the latest data to resolve the issue. These techniques suffer when training the new classifier if the data is imbalanced. Also, the class imbalance ratio may change greatly from one input stream to another, making the problem more complex. Existing solutions for the combined issue of class imbalance and concept drift lack an understanding of how one problem correlates with the other. This work studies the association between concept drift and the class imbalance ratio and then demonstrates how changes in the class imbalance ratio, along with concept drift, affect the classifier's performance. We analyzed the effect of both issues on minority and majority classes individually. To do this, we conducted experiments on benchmark datasets using state-of-the-art classifiers designed specifically for data stream classification. Precision, recall, F1 score, and geometric mean were used to measure performance. Our findings show that when class imbalance and concept drift occur together, performance can decrease by up to 15%. Our results also show that an increase in the imbalance ratio can cause a 10% to 15% decrease in the precision scores of both minority and majority classes. The findings may help in designing intelligent and adaptive solutions that can cope with the challenges of non-stationary data streams, such as concept drift and class imbalance.
Keywords: classification; data streams; class imbalance; concept drift; class imbalance ratio
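The four metrics named in the abstract above (precision, recall, F1 score, geometric mean) can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function name `imbalance_metrics`, the toy labels, and the choice of class 1 as the minority class are illustrative assumptions, and the geometric mean is taken as the square root of sensitivity times specificity, a common definition that the abstract does not spell out.

```python
import math

def imbalance_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1 and G-mean with `positive` treated as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    g_mean = math.sqrt(recall * specificity)           # assumed definition
    return {"precision": precision, "recall": recall, "f1": f1, "g_mean": g_mean}

# Hypothetical stream window: 8 majority (0) and 2 minority (1) instances.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
metrics = imbalance_metrics(y_true, y_pred)
print(metrics)
```

In this toy window plain accuracy is 80% even though only half of the minority instances are found, which is why imbalance studies tend to report per-class metrics rather than accuracy alone.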
Attenuate Class Imbalance Problem for Pneumonia Diagnosis Using Ensemble Parallel Stacked Pre-Trained Models
3
Authors: Aswathy Ravikumar, Harini Sriraman. Computers, Materials & Continua (SCIE, EI), 2023, Issue 4, pp. 891-909, 19 pages.
Pneumonia is an acute lung infection that has caused many fatalities globally. Radiologists often use chest X-rays to identify pneumonia, since they are presently the most effective imaging method for this purpose. Computer-aided diagnosis of pneumonia using deep learning techniques is widely used due to its effectiveness and performance. In the proposed method, the Synthetic Minority Oversampling Technique (SMOTE) is used to eliminate the class imbalance in the X-ray dataset. To compensate for the paucity of accessible data, pre-trained transfer learning is used, and an ensemble Convolutional Neural Network (CNN) model is developed. The ensemble model consists of all possible combinations of the MobileNetV2, Visual Geometry Group (VGG16), and DenseNet169 models. MobileNetV2 and DenseNet169 performed well as single classifiers, with an accuracy of 94%, while the ensemble model (MobileNetV2 + DenseNet169) achieved an accuracy of 96.9%. Using the data synchronous parallel model in Distributed TensorFlow, the training process accelerated performance by 98.6% and outperformed other conventional approaches.
Keywords: Pneumonia prediction; distributed deep learning; data parallel model; ensemble deep learning; class imbalance; skewed data
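SMOTE, which several abstracts in this list rely on, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below illustrates that core idea on small numeric tuples; it is an assumption-laden toy, not the pipeline used in the paper (which works on X-ray data), and the function name `smote_sketch`, the parameters `k` and `seed`, and the sample points are all hypothetical.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class feature vectors.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region the minority class already occupies. In practice a maintained implementation such as the one in the imbalanced-learn package would normally be used instead of a hand-written loop.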
BLS-identification: A device fingerprint classification mechanism based on broad learning for Internet of Things
4
Authors: Yu Zhang, Bei Gong, Qian Wang. Digital Communications and Networks (SCIE, CSCD), 2024, Issue 3, pp. 728-739, 12 pages.
The popularity of the Internet of Things (IoT) has enabled a large number of vulnerable devices to connect to the Internet, bringing huge security risks. As a network-level security authentication method, device fingerprinting based on machine learning has attracted considerable attention because it can detect vulnerable devices in complex and heterogeneous access phases. However, flexible and diversified IoT devices with limited resources make device fingerprint authentication harder to execute in IoT, because the model network must be retrained to deal with incremental features or types. To address this problem, this paper proposes a device fingerprinting mechanism based on a Broad Learning System (BLS). The mechanism first characterizes IoT devices by traffic analysis, based on the identifiable differences in their traffic data, and extracts feature parameters of the traffic packets. A hierarchical hybrid sampling method is designed for the preprocessing phase to improve the imbalanced data distribution and reconstruct the fingerprint dataset. The complexity of the dataset is reduced using Principal Component Analysis (PCA), and the device type is identified by training weights using BLS. The experimental results show that the proposed method achieves state-of-the-art accuracy and requires less training time than other existing methods.
Keywords: Device fingerprint; Traffic analysis; class imbalance; Broad learning system; Access authentication
MCBC-SMOTE: A Majority Clustering Model for Classification of Imbalanced Data
5
Authors: Jyoti Arora, Meena Tushir, Keshav Sharma, Lalit Mohan, Aman Singh, Abdullah Alharbi, Wael Alosaimi. Computers, Materials & Continua (SCIE, EI), 2022, Issue 12, pp. 4801-4817, 17 pages.
Datasets with an imbalanced class distribution are difficult to handle with standard classification algorithms. In supervised learning, dealing with class imbalance is still considered a challenging research problem. Various machine learning techniques are designed to operate on balanced datasets; therefore, different state-of-the-art undersampling, oversampling, and hybrid strategies have been proposed to deal with imbalanced datasets, but highly skewed datasets still pose problems of generalization and noise generation during resampling. To overcome these problems, this paper proposes a majority clustering model for classification of imbalanced datasets, known as MCBC-SMOTE (Majority Clustering for balanced Classification-SMOTE). The model provides a method for converting a binary classification problem into a multi-class problem. In the proposed algorithm, the number of clusters for the majority class is calculated using the elbow method, and the minority class is over-sampled as an average of the clustered majority classes to generate a symmetrical class distribution. The proposed technique is cost-effective, reduces the problem of noise generation, and successfully removes the imbalances present between and within classes. Evaluations on diverse real datasets show better classification results than existing state-of-the-art methodologies across several performance metrics.
Keywords: imbalance class problem; classification; SMOTE; K-means; clustering; sampling
Handling Class Imbalance in Online Transaction Fraud Detection
6
Authors: Kanika, Jimmy Singla, Ali Kashif Bashir, Yunyoung Nam, Najam UI Hasan, Usman Tariq. Computers, Materials & Continua (SCIE, EI), 2022, Issue 2, pp. 2861-2877, 17 pages.
With the rise of internet facilities, the number of people making online transactions has grown exponentially in recent years, as online transaction systems have eliminated the need to visit the bank in person for every transaction. However, fraud cases have also increased, causing consumers to lose money. Hence, an effective fraud detection system that can detect fraudulent transactions automatically in real time is the need of the hour. Genuine transactions generally far outnumber fraudulent ones, which leads to the class imbalance problem. This work proposes an online transaction fraud detection system using deep learning that handles class imbalance by applying algorithm-level methods, which modify the learning of the model to focus more on the minority class, i.e., fraudulent transactions. A novel loss function named Weighted Hard-Reduced Focal Loss (WH-RFL) is proposed; it achieves the maximum fraud detection rate, i.e., True Positive Rate (TPR), at the cost of misclassifying a few genuine transactions, since a high TPR is preferred over a high True Negative Rate (TNR) in a fraud detection system. This is demonstrated on three publicly available imbalanced transactional datasets. In addition, thresholding is applied to optimize the decision threshold using cross-validation so as to detect the maximum number of frauds, and the experimental results demonstrate that selecting the right thresholding method with deep learning yields better results.
Keywords: class imbalance; deep learning; fraud detection; loss function; thresholding
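The abstract above does not give the formula for WH-RFL, so the sketch below shows only the standard binary focal loss that such losses build on, together with the effect of lowering the decision threshold on TPR. Everything here is an illustrative assumption: the function names `focal_loss` and `tpr_at`, the default `gamma` and `alpha` values, and the toy scores are not taken from the paper.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.75):
    """Standard binary focal loss for predicted fraud probability p and label y (1 = fraud)."""
    p = min(max(p, 1e-12), 1 - 1e-12)          # clamp to avoid log(0)
    p_t = p if y == 1 else 1 - p               # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1 - alpha   # class weight, larger for the minority class
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

def tpr_at(threshold, probs, labels):
    """True positive rate when every score >= threshold is flagged as fraud."""
    hits = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    return hits / sum(labels)

# A confidently detected fraud contributes far less loss than a missed one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.35, 1)

# Hypothetical model scores and labels (1 = fraud).
probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.70]
labels = [0, 0, 0, 1, 0, 1]
print(easy, hard, tpr_at(0.5, probs, labels), tpr_at(0.3, probs, labels))
```

Lowering the threshold from 0.5 to 0.3 catches both frauds in this toy data but also flags one genuine transaction, which mirrors the TPR-over-TNR trade-off the abstract describes.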
Dealing with the Class Imbalance Problem in the Detection of Fake Job Descriptions
7
Authors: Minh Thanh Vo, Anh H. Vo, Trang Nguyen, Rohit Sharma, Tuong Le. Computers, Materials & Continua (SCIE, EI), 2021, Issue 7, pp. 521-535, 15 pages.
In recent years, the detection of fake job descriptions has become increasingly necessary because social networking has changed the way people access burgeoning information in the internet age. Identifying fraud in job descriptions can help jobseekers avoid many of the risks of job hunting. However, detecting fake job descriptions runs up against the problem of class imbalance, since the number of genuine jobs exceeds the number of fake jobs. This reduces the predictability and performance of traditional machine learning models. We therefore present an efficient framework that uses an oversampling technique, called FJD-OT (Fake Job Description Detection Using Oversampling Techniques), to improve the predictability of detecting fake job descriptions. In the first module of the proposed framework, we apply several techniques, including stop-word removal and a tokenizer, to preprocess the text data. In the second module, we use a bag of words in combination with the term frequency-inverse document frequency (TF-IDF) approach to extract features from the text data and create the feature dataset. Next, the framework applies k-fold cross-validation, a commonly used technique for testing the effectiveness of machine learning models, which splits the experimental dataset (the Employment Scam Aegean (ESA) dataset in our study) into training and test sets for evaluation. The training set is passed through the third module, an oversampling module in which the SVMSMOTE method is used to balance the data before the classifiers are trained in the last module. The experimental results indicate that the proposed approach significantly improves the predictability of fake job description detection on the ESA dataset based on several popular performance metrics.
Keywords: Fake job description detection; class imbalance problem; oversampling techniques
Scientific Elegance in NIDS: Unveiling Cardinality Reduction, Box-Cox Transformation, and ADASYN for Enhanced Intrusion Detection
8
Authors: Amerah Alabrah. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 3897-3912, 16 pages.
The emergence of digital networks and the wide adoption of information on internet platforms have given rise to threats against users' private information. Many intruders actively seek such private data either for sale or for other inappropriate purposes. Similarly, national and international organizations hold country-level and company-level private information that could be accessed through different network attacks. A Network Intruder Detection System (NIDS) therefore becomes essential for protecting these networks and organizations. In the evolution of NIDS, Artificial Intelligence (AI) assisted tools and methods have been widely adopted to provide effective solutions. However, the development of NIDS still faces challenges at the dataset and machine learning levels, such as large deviations in numeric features, the presence of numerous irrelevant categorical features resulting in reduced cardinality, and class imbalance in multiclass-level data. To address these challenges and offer a unified solution to NIDS development, this study proposes a novel framework that preprocesses datasets and applies a Box-Cox transformation to linearly transform the numeric features and bring them into closer alignment. Cardinality reduction was applied to categorical features through the binning method. Subsequently, the class-imbalanced dataset was addressed using the adaptive synthetic sampling data generation method. Finally, the preprocessed, refined, and oversampled feature set was divided into training and test sets with an 80–20 ratio, and two experiments were conducted. In Experiment 1, binary classification was executed using four machine learning classifiers, with the extra trees classifier achieving the highest accuracy of 97.23% and an AUC of 0.9961. In Experiment 2, multiclass classification was performed, and the extra trees classifier emerged as the most effective, achieving an accuracy of 81.27% and an AUC of 0.97. The results were evaluated based on training, testing, and total time, and a comparative analysis with state-of-the-art studies proved the robustness and significance of the applied methods in developing a timely and precision-efficient solution to NIDS.
Keywords: Adaptive synthetic sampling; class imbalance; features cardinality; network security; over sampling
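The Box-Cox transformation mentioned in the abstract above is a power transform for strictly positive values, parameterized by a value usually written as lambda: it maps x to (x^lambda - 1) / lambda, and to ln(x) when lambda is 0. The sketch below applies it for a fixed lambda to show how it compresses widely spread numeric features. The function name `box_cox` and the sample values are illustrative assumptions, and the paper's actual choice of lambda is not stated in the abstract (it is commonly estimated by maximum likelihood).

```python
import math

def box_cox(values, lam):
    """Box-Cox power transform of strictly positive values for a fixed lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive inputs")
    if lam == 0:
        return [math.log(v) for v in values]       # limiting case as lambda -> 0
    return [(v ** lam - 1) / lam for v in values]

# Hypothetical numeric feature spanning several orders of magnitude.
raw = [1.0, 10.0, 100.0, 1000.0]
logged = box_cox(raw, 0)
rooted = box_cox(raw, 0.5)
print(logged)
print(rooted)
```

With lambda set to 0 the range of the feature shrinks from 999 to roughly 6.9, which is the kind of alignment between numeric features that the framework is after.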
CMAGAN: classifier-aided minority augmentation generative adversarial networks for industrial imbalanced data and its application to fault prediction
9
Authors: Wen-Jie Wang, Zhao Liu, Ping Zhu. Advances in Manufacturing (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 603-618, 16 pages.
Class imbalance is a common characteristic of industrial data that adversely affects industrial data mining because it leads to biased training of machine learning models. To address this issue, augmenting samples in minority classes with generative adversarial networks (GANs) has been demonstrated to be an effective approach. This study proposes a novel GAN-based minority class augmentation approach named classifier-aided minority augmentation generative adversarial network (CMAGAN). In the CMAGAN framework, an outlier elimination strategy is first applied to each class to minimize the negative impact of outliers. Subsequently, a newly designed boundary-strengthening learning GAN (BSLGAN) is employed to generate additional samples for minority classes. By incorporating a supplementary classifier and innovative training mechanisms, the BSLGAN focuses on learning the distribution of samples near classification boundaries. Consequently, it can fully capture the characteristics of the target class and generate highly realistic samples with clear boundaries. Finally, the new samples are filtered based on the Mahalanobis distance to ensure that they lie within the desired distribution. To evaluate the effectiveness of the proposed approach, CMAGAN was used to solve the class imbalance problem in eight real-world fault-prediction applications. Its performance was compared with that of seven other algorithms, including state-of-the-art GAN-based methods, and the results indicated that CMAGAN could provide higher-quality augmented results.
Keywords: class imbalance; Minority class augmentation; Generative adversarial network (GAN); Boundary strengthening learning (BSL); Fault prediction
Integrating deep learning and logging data analytics for lithofacies classification and 3D modeling of tight sandstone reservoirs (cited by: 2)
10
Authors: Jing-Jing Liu, Jian-Chao Liu. Geoscience Frontiers (SCIE, CAS, CSCD), 2022, Issue 1, pp. 350-363, 14 pages.
Lithofacies classification is essential for oil and gas reservoir exploration and development. The traditional method of lithofacies classification is based on "core calibration logging" and the experience of geologists. This approach has strong subjectivity, low efficiency, and high uncertainty. This uncertainty may be one of the key factors affecting the results of 3D modeling of tight sandstone reservoirs. In recent years, deep learning, a cutting-edge artificial intelligence technology, has attracted attention from various fields. However, deep-learning techniques have not been sufficiently studied in the field of lithofacies classification. Therefore, this paper proposes a novel hybrid deep-learning model that combines the efficient feature-extraction ability of convolutional neural networks (CNN) with the excellent ability of long short-term memory networks (LSTM) to describe time-dependent features, and uses it to conduct lithological facies-classification experiments. The results of a series of experiments show that the hybrid CNN-LSTM model had an average accuracy of 87.3% and the best classification effect compared with the CNN, the LSTM, and three commonly used machine learning models (support vector machine, random forest, and gradient boosting decision tree). In addition, the borderline synthetic minority oversampling technique (BSMOTE) is introduced to address the class-imbalance issue of the raw data. The results show that balancing the processed data can significantly improve the accuracy of lithofacies classification. Furthermore, based on the fine lithofacies constraints, the sequential indicator simulation method is used to establish a three-dimensional lithofacies model, which completes the fine description of the spatial distribution of tight sandstone reservoirs in the study area. According to this comprehensive analysis, the proposed CNN-LSTM model, which eliminates class imbalance, can be effectively applied to lithofacies classification and is expected to improve the realism of the geological model for tight sandstone reservoirs.
Keywords: Deep learning; Convolutional neural networks; LSTM; Lithological-facies classification; 3D modeling; class imbalance
An Accurate and Extensible Machine Learning Classifier for Flow-Level Traffic Classification (cited by: 2)
11
Authors: Gang Lu, Ronghua Guo, Ying Zhou, Jing Du. China Communications (SCIE, CSCD), 2018, Issue 6, pp. 125-138, 14 pages.
Machine learning (ML) techniques have been widely applied in recent traffic classification. However, discriminator bias and class imbalance both decrease the accuracy of ML-based traffic classifiers. In this paper, we propose an accurate and extensible traffic classifier. Specifically, to address the discriminator bias issue, our classifier is built as an optimal cascade of binary sub-classifiers, where each binary sub-classifier is trained independently with the discriminators used for identifying application-specific traffic. Moreover, to balance the training dataset, we apply the SMOTE algorithm to generate artificial training samples for minority classes. We evaluate our classifier on two datasets collected from different network border routers. Compared with previous multi-class traffic classifiers built in a one-time training process, our classifier achieves much higher F-measure and AUC for each application.
Keywords: traffic classification; class imbalance; discriminator bias; encrypted traffic; machine learning
Study on Multi-Label Classification of Medical Dispute Documents (cited by: 2)
12
Authors: Baili Zhang, Shan Zhou, Le Yang, Jianhua Lv, Mingjun Zhong. Computers, Materials & Continua (SCIE, EI), 2020, Issue 12, pp. 1975-1986, 12 pages.
The Internet of Medical Things (IoMT) will come to be of great importance in the mediation of medical disputes, as it is emerging as the core of intelligent medical treatment. First, IoMT can track the entire medical treatment process in order to provide detailed trace data for medical dispute resolution. Second, IoMT can infiltrate the ongoing treatment and provide timely intelligent decision support to medical staff. This support includes the recommendation of similar historical cases, guidance for medical treatment, alerts about hired dispute profiteers, etc. The multi-label classification of medical dispute documents (MDDs) plays an important role as a front-end process for intelligent decision support, especially in the recommendation of similar historical cases. However, MDDs usually appear as long texts containing a large amount of redundant information, and there is a serious distribution imbalance in the dataset, which directly leads to weaker classification performance. Accordingly, this paper proposes a multi-label classification method for MDDs based on key sentence extraction. The method is divided into two parts. First, an attention-based hierarchical bi-directional long short-term memory (BiLSTM) model is used to extract key sentences from documents; second, random comprehensive sampling Bagging (RCS-Bagging), an ensemble multi-label classification model, is employed to classify MDDs based on the key sentence sets. This approach greatly improves classification performance. Experiments show that the two models proposed in this paper perform remarkably better than the baseline methods.
Keywords: Internet of Medical Things (IoMT); medical disputes; medical dispute document (MDD); multi-label classification (MLC); key sentence extraction; class imbalance
Drug discrimination of Near Infrared spectroscopy based on the scaled convex hull classifier
13
Authors: Zhenbing Liu, Shujie Jiang, Huihua Yang. Journal of Innovative Optical Health Sciences (SCIE, EI, CAS), 2014, Issue 4, pp. 101-110, 10 pages.
Near infrared spectroscopy (NIRS) has been widely used in the discrimination (classification) of pharmaceutical drugs. In real applications, however, class imbalance among the drug samples, i.e., the number of samples of one drug may be much larger than that of the other drugs, drastically decreases the discrimination performance of classification models. To address this class imbalance problem, a new computational method, the scaled convex hull (SCH)-based maximum margin classifier, is proposed in this paper. With a suitable selection of the reduction factor of the SCHs generated by the two classes of drug samples, a maximal margin classifier between the SCHs can be constructed that achieves good classification performance. With the parameters involved in the modeling optimized by Cuckoo Search, a satisfactory model is obtained for drug classification. Experiments on spectra samples produced by a pharmaceutical company show that the proposed method is more effective and robust than existing ones.
Keywords: Drug classification; Near Infrared spectroscopy; class imbalance; scaled convex hulls
Ensemble Learning Models for Classification and Selection of Web Services: A Review
14
Authors: Muhammad Hasnain, Imran Ghani, Seung Ryul Jeong, Aitizaz Ali. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 1, pp. 327-339, 13 pages.
This paper presents a review of the ensemble learning models proposed for web services classification, selection, and composition. Web services are an evolutionary research area, and ensemble learning has become a hot spot for assessing these aspects of web services. The proposed research aims to review the state-of-the-art approaches applied in the web services area. The literature on the research topic is examined using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) as a research method. The study reveals an increasing trend of using ensemble learning in the chosen papers within the last ten years. Naïve Bayes (NB), Support Vector Machine (SVM), and other classifiers were identified as widely explored in the selected studies. Core analysis of web services classification suggests that web services' performance aspects can be investigated in future work. This paper also identifies performance metrics, including accuracy, precision, recall, and F-measure, that are widely used in the literature.
Keywords: Web services composition; quality improvement; class imbalance; machine learning
Deep learning based classification of sheep behaviour from accelerometer data with imbalance (cited by: 2)
15
Authors: Kirk E. Turner, Andrew Thompson, Ian Harris, Mark Ferguson, Ferdous Sohel. Information Processing in Agriculture (EI, CSCD), 2023, Issue 3, pp. 377-390, 14 pages.
Classification of sheep behaviour from a sequence of tri-axial accelerometer data has the potential to enhance sheep management. Sheep behaviour is inherently imbalanced (e.g., more ruminating than walking), resulting in underperforming classification for the minority activities, which hold importance. Existing works have not addressed class imbalance and use traditional machine learning techniques, e.g., Random Forest (RF). We investigated Deep Learning (DL) models appropriate for sequential data, namely Long Short Term Memory (LSTM) and Bidirectional LSTM (BLSTM), trained on imbalanced data. Two data sets were collected in normal grazing conditions using jaw-mounted and ear-mounted sensors. Novel to this study, alongside typical single classes, e.g., walking, data samples were labelled with compound classes, e.g., walking_grazing, depending on the behaviours. The number of steps a sheep performed in the observed 10 s time window was also recorded and incorporated in the models. We designed several multi-class classification studies, with imbalance addressed using synthetic data. DL models achieved superior performance to traditional ML models, especially with augmented data (e.g., 4-Class+Steps: LSTM 88.0%, RF 82.5%). DL methods showed superior generalisability on unseen sheep (i.e., F1-score: BLSTM 0.84, LSTM 0.83, RF 0.65). LSTM, BLSTM and RF achieved sub-millisecond average inference time, making them suitable for real-time applications. The results demonstrate the effectiveness of DL models for sheep behaviour classification in grazing conditions and show that DL techniques can generalise across different sheep. The study presents a strong foundation for the development of such models for real-time animal monitoring.
Keywords: Sheep behaviour classification; Data synthesis; class imbalance; Grazing sheep
Enhanced Coyote Optimization with Deep Learning Based Cloud-Intrusion Detection System (cited by: 1)
16
作者 Abdullah M.Basahel Mohammad Yamin +1 位作者 Sulafah M.Basahel E.Laxmi Lydia 《Computers, Materials & Continua》 SCIE EI 2023年第2期4319-4336,共18页
Cloud Computing(CC)is the preference of all information technology(IT)organizations as it offers pay-per-use based and flexible services to its users.But the privacy and security become the main hindrances in its achi... Cloud Computing(CC)is the preference of all information technology(IT)organizations as it offers pay-per-use based and flexible services to its users.But the privacy and security become the main hindrances in its achievement due to distributed and open architecture that is prone to intruders.Intrusion Detection System(IDS)refers to one of the commonly utilized system for detecting attacks on cloud.IDS proves to be an effective and promising technique,that identifies malicious activities and known threats by observing traffic data in computers,and warnings are given when such threatswere identified.The current mainstream IDS are assisted with machine learning(ML)but have issues of low detection rates and demanded wide feature engineering.This article devises an Enhanced Coyote Optimization with Deep Learning based Intrusion Detection System for Cloud Security(ECODL-IDSCS)model.The ECODL-IDSCS model initially addresses the class imbalance data problem by the use of Adaptive Synthetic(ADASYN)technique.For detecting and classification of intrusions,long short term memory(LSTM)model is exploited.In addition,ECO algorithm is derived to optimally fine tune the hyperparameters related to the LSTM model to enhance its detection efficiency in the cloud environment.Once the presented ECODL-IDSCS model is tested on benchmark dataset,the experimental results show the promising performance of the ECODL-IDSCS model over the existing IDS models. 展开更多
Keywords: Intrusion detection system; cloud security; coyote optimization algorithm; class imbalance data; deep learning
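The abstract above names ADASYN as the class-balancing step without describing it. As a rough illustration only, the following pure-Python sketch follows the commonly described ADASYN recipe: minority points whose neighbourhoods contain more majority points receive more synthetic samples. The function name `adasyn_sketch` and the parameters `k`, `beta`, and `seed` are assumptions made for this example and are not taken from the ECODL-IDSCS paper.

```python
import math
import random


def adasyn_sketch(X, y, minority_label, k=5, beta=1.0, seed=0):
    """Return synthetic minority points (illustrative ADASYN-style sketch).

    X is a list of equal-length numeric tuples and y the matching labels.
    beta is the desired balance level (1.0 aims at a fully balanced set).
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    n_minority = len(minority_idx)
    n_majority = len(X) - n_minority
    total_new = int((n_majority - n_minority) * beta)
    if total_new <= 0 or n_minority < 2:
        return []

    def nearest(i, pool):
        # Indices of the (up to) k points in `pool` closest to X[i], excluding i.
        others = [j for j in pool if j != i]
        return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]

    # Difficulty of each minority point: share of majority points among its
    # nearest neighbours in the full dataset.
    ratios = []
    for i in minority_idx:
        neighbours = nearest(i, range(len(X)))
        ratios.append(sum(y[j] != minority_label for j in neighbours) / len(neighbours))
    total = sum(ratios)
    if total == 0:
        return []  # no minority point has majority neighbours

    synthetic = []
    for i, ratio in zip(minority_idx, ratios):
        count = round(ratio / total * total_new)  # harder points get more samples
        minority_neighbours = nearest(i, minority_idx)
        for _ in range(count):
            j = rng.choice(minority_neighbours)
            gap = rng.random()
            # Interpolate between the point and one of its minority neighbours.
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic
```

Because of per-point rounding, the number of generated samples can differ slightly from the target; a production pipeline would more likely rely on a maintained library implementation.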
Challenges and limitations of synthetic minority oversampling techniques in machine learning
17
Authors: Ibraheem M Alkhawaldeh, Ibrahem Albalkhi, Abdulqadir Jeprel Naswhan. World Journal of Methodology, 2023, No. 5, pp. 373-378 (6 pages)
Oversampling is the most utilized approach to deal with class-imbalanced datasets, as seen by the plethora of oversampling methods developed in the last two decades. In this editorial, we discuss the issues with oversampling that stem from the possibility of overfitting and the generation of synthetic cases that might not accurately represent the minority class. These limitations should be considered when using oversampling techniques. We also propose several alternative strategies for dealing with imbalanced data, as well as a perspective on future work.
Keywords: Machine learning; class imbalance; Overfitting; Misdiagnosis
Online Feature Selection of Class Imbalance via PA Algorithm (Cited by: 4)
18
Authors: Chao Han, Yun-Kun Tan, Jin-Hui Zhu, Yong Guo, Jian Chen, Qing-Yao Wu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, No. 4, pp. 673-682 (10 pages)
Imbalance classification techniques have been frequently applied in many machine learning application domains where the number of samples in the majority (or positive) class of a dataset is much larger than that in the minority (or negative) class. Meanwhile, feature selection (FS) is one of the key techniques for high-dimensional classification tasks, since it greatly improves both classification performance and computational efficiency. However, most studies of feature selection and imbalance classification are restricted to off-line batch learning, which is not well adapted to some practical scenarios. In this paper, we aim to solve the high-dimensional imbalanced classification problem accurately and efficiently with only a small number of active features in an online fashion, and we propose two novel online learning algorithms for this purpose. In our approach, a classifier that involves only a small and fixed number of features is constructed to classify a sequence of imbalanced data received in an online manner. We formulate the construction of such an online learner as an optimization problem and use an iterative approach to solve it based on the passive-aggressive (PA) algorithm as well as a truncated gradient (TG) method. We evaluate the proposed algorithms on several real-world datasets, and our experimental results demonstrate their effectiveness in comparison with the baselines.
Keywords: online learning; feature selection; class imbalance; passive-aggressive (PA) algorithm
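The abstract above combines a passive-aggressive (PA) update with a limit on the number of active features. The sketch below is a simplified illustration of that idea and not the authors' algorithm: it uses the standard PA step for hinge loss and then hard-truncates the weight vector to its `budget` largest-magnitude entries, which stands in for the truncated gradient method and omits any class-imbalance weighting. The names `truncate`, `pa_truncated_fit`, and `budget` are assumptions.

```python
def truncate(weights, budget):
    """Keep the `budget` largest-magnitude weights and zero out the rest."""
    if budget >= len(weights):
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:budget])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def pa_truncated_fit(stream, n_features, budget):
    """Online PA learner restricted to at most `budget` non-zero weights.

    `stream` yields (x, y) pairs with x a list of floats and y in {-1, +1}.
    """
    w = [0.0] * n_features
    for x, y in stream:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        loss = max(0.0, 1.0 - margin)  # hinge loss
        norm_sq = sum(xi * xi for xi in x)
        if loss > 0.0 and norm_sq > 0.0:
            tau = loss / norm_sq  # smallest step that satisfies the margin
            w = [wi + tau * y * xi for wi, xi in zip(w, x)]
            w = truncate(w, budget)  # enforce the feature budget
    return w
```

When the margin is already satisfied the learner stays passive and leaves the weights untouched, so the feature budget is only re-applied after an actual update.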
Optimized Stacked Autoencoder for IoT Enabled Financial Crisis Prediction Model (Cited by: 2)
19
Authors: Mesfer Al Duhayyim, Hadeel Alsolai, Fahd N. Al-Wesabi, Nadhem Nemri, Hany Mahgoub, Anwer Mustafa Hilal, Manar Ahmed Hamza, Mohammed Rizwanullah. Computers, Materials & Continua (SCIE, EI), 2022, No. 4, pp. 1079-1094 (16 pages)
Recently, Financial Technology (FinTech) has received growing attention from financial sectors and researchers seeking effective solutions for financial institutions and firms. Financial crisis prediction (FCP) is an essential topic in the business sector, where it is used to identify the financial condition of a financial institution. At the same time, the development of the internet of things (IoT) has altered the mode of human interaction with the physical world. The IoT can be combined with an FCP model to examine financial data from users and support the decision-making process. This paper presents a novel multi-objective squirrel search optimization algorithm with stacked autoencoder (MOSSA-SAE) model for FCP in an IoT environment. The MOSSA-SAE model encompasses several subprocesses, namely preprocessing, class imbalance handling, parameter tuning, and classification. First, the MOSSA-SAE model allows IoT devices such as smartphones and laptops to collect the financial details of users, which are then transmitted to the cloud for further analysis. In addition, the SMOTE technique is employed to handle class imbalance problems. The goal of MOSSA in SMOTE is to determine the oversampling rate and the area of nearest neighbors of SMOTE. Besides, the SAE model is used as the classification technique to determine the class label of the financial data. At the same time, MOSSA is applied to select the weight and bias values of the SAE appropriately. An extensive experimental validation is performed on a benchmark financial dataset, and the results are examined from several aspects. The experimental results confirmed the superior performance of the MOSSA-SAE model on the applied dataset.
Keywords: Financial data; financial crisis prediction; class imbalance problem; internet of things; stacked autoencoder
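The abstract above says that the optimizer tunes two SMOTE settings: the oversampling rate and the nearest-neighbor area. To make those two knobs concrete, here is a minimal pure-Python SMOTE-style sketch. It is a simplified variant that picks base points at random; the function name `smote_sketch` and the parameters `rate`, `k`, and `seed` are assumptions for illustration, with `k` assumed to correspond to the neighborhood setting mentioned in the abstract.

```python
import math
import random


def smote_sketch(minority, rate=1.0, k=5, seed=0):
    """Generate int(rate * len(minority)) synthetic points by interpolation.

    `minority` is a list of equal-length numeric tuples from the minority class.
    """
    rng = random.Random(seed)
    n_new = int(rate * len(minority))
    if len(minority) < 2 or n_new <= 0:
        return []
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Up to k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()
        # The new point lies on the segment between base and its neighbour.
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic
```

A tuner such as the one described in the abstract would then search over `rate` and `k`, scoring each setting by the downstream classifier's validation performance.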
Iterative Semi-Supervised Learning Using Softmax Probability (Cited by: 1)
20
Authors: Heewon Chung, Jinseok Lee. Computers, Materials & Continua (SCIE, EI), 2022, No. 9, pp. 5607-5628 (22 pages)
For classification problems in practice, one of the challenging issues is obtaining enough labeled data for training. Moreover, even when enough labeled data has been accumulated, most datasets exhibit a long-tailed distribution with heavy class imbalance, which biases the model towards the majority class. To alleviate such class imbalance, semi-supervised learning methods using additional unlabeled data have been considered. However, their accuracy is much lower than that of supervised learning. In this study, under the assumption that additional unlabeled data is available, we propose iterative semi-supervised learning algorithms, which iteratively correct the labeling of the extra unlabeled data based on softmax probabilities. The results show that the proposed algorithms provide accuracy as high as that of supervised learning. To validate the proposed algorithms, we tested them in two scenarios: with a balanced unlabeled dataset and with an imbalanced unlabeled dataset. In both scenarios, our proposed semi-supervised learning algorithms provided higher accuracy than previous state-of-the-art methods. Code is available at https://github.com/HeewonChung92/iterative-semi-learning.
Keywords: Semi-supervised learning; class imbalance; iterative learning; unlabeled data
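The abstract above describes relabeling unlabeled data based on softmax probabilities but does not give the selection rule. One common way to do this, shown here purely as an assumption and not as the authors' published procedure, is to accept a pseudo-label only when the top softmax probability exceeds a confidence threshold. The names `softmax`, `select_pseudo_labels`, and `threshold` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(logits_batch, threshold=0.9):
    """Return (sample_index, predicted_label) pairs for confident predictions.

    A sample is kept only if its highest softmax probability is at least
    `threshold`; the remaining samples stay unlabeled for a later round.
    """
    selected = []
    for index, logits in enumerate(logits_batch):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected
```

In an iterative scheme, the selected pairs would be added to the training set, the model retrained, and the selection repeated on the samples that are still unlabeled.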