11 articles found
Concept Drift Analysis and Malware Attack Detection System Using Secure Adaptive Windowing
1
Authors: Emad Alsuwat, Suhare Solaiman, Hatim Alsuwat. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 3743-3759 (17 pages)
Concept drift is a main security issue that has to be resolved since it presents a significant barrier to the deployment of machine learning (ML) models. Due to attackers' (and/or benign equivalents') dynamic behavior changes, testing data distribution frequently diverges from original training data over time, resulting in substantial model failures. Due to their dispersed and dynamic nature, distributed denial-of-service attacks pose a danger to cybersecurity, resulting in attacks with serious consequences for users and businesses. This paper proposes a novel design for concept drift analysis and detection of malware attacks like Distributed Denial of Service (DDOS) in the network. The goal of this architecture combination is to accurately represent data and create an effective cyber security prediction agent. The intrusion detection system and concept drift of the network have been analyzed using secure adaptive windowing with website data authentication protocol (SAW_WDA). The network has been analyzed by authentication protocol to avoid malware attacks. The data of network users will be collected and classified using multilayer perceptron gradient decision tree (MLPGDT) classifiers. Based on the classification output, the decision for the detection of attackers and authorized users will be identified. The experimental results show output based on intrusion detection and concept drift analysis systems in terms of throughput, end-end delay, network security, network concept drift, and results based on classification with regard to accuracy, memory, and precision and F-1 score.
Keywords: concept drift; machine learning; DDOS; cyber security; SAW_WDA; MLPGDT
Combined Effect of Concept Drift and Class Imbalance on Model Performance During Stream Classification
2
Authors: Abdul Sattar Palli, Jafreezal Jaafar, Manzoor Ahmed Hashmani, Heitor Murilo Gomes, Aeshah Alsughayyir, Abdul Rehman Gilal. Computers, Materials & Continua (SCIE, EI), 2023, No. 4, pp. 1827-1845 (19 pages)
Every application in a smart city environment like the smart grid, health monitoring, security, and surveillance generates non-stationary data streams. Due to such nature, the statistical properties of data change over time, leading to class imbalance and concept drift issues. Both these issues cause model performance degradation. Most of the current work has been focused on developing an ensemble strategy by training a new classifier on the latest data to resolve the issue. These techniques suffer while training the new classifier if the data is imbalanced. Also, the class imbalance ratio may change greatly from one input stream to another, making the problem more complex. The existing solutions proposed for addressing the combined issue of class imbalance and concept drift are lacking in understanding of correlation of one problem with the other. This work studies the association between concept drift and class imbalance ratio and then demonstrates how changes in class imbalance ratio along with concept drift affect the classifier's performance. We analyzed the effect of both the issues on minority and majority classes individually. To do this, we conducted experiments on benchmark datasets using state-of-the-art classifiers especially designed for data stream classification. Precision, recall, F1 score, and geometric mean were used to measure the performance. Our findings show that when both class imbalance and concept drift problems occur together the performance can decrease up to 15%. Our results also show that the increase in the imbalance ratio can cause a 10% to 15% decrease in the precision scores of both minority and majority classes. The study findings may help in designing intelligent and adaptive solutions that can cope with the challenges of non-stationary data streams like concept drift and class imbalance.
Keywords: CLASSIFICATION; data streams; class imbalance; concept drift; class imbalance ratio
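The four measures named in this abstract (precision, recall, F1 score, and geometric mean) can be illustrated with a minimal sketch. The function names and the toy labels below are assumptions made for illustration, not the authors' code, and the geometric mean is taken here as the geometric mean of per-class recalls, which is one common definition.

```python
import math

def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def geometric_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    classes = sorted(set(y_true))
    recalls = [per_class_metrics(y_true, y_pred, c)[1] for c in classes]
    return math.prod(recalls) ** (1 / len(recalls))

# Hypothetical imbalanced sample: 8 majority (0) and 2 minority (1) labels.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

minority = per_class_metrics(y_true, y_pred, positive=1)
majority = per_class_metrics(y_true, y_pred, positive=0)
gmean = geometric_mean(y_true, y_pred)
```

In this toy sample the overall accuracy is 80%, yet the minority-class precision and recall are both 0.5, which is why per-class measures and the geometric mean are preferred over accuracy on imbalanced streams.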
An Optimal Big Data Analytics with Concept Drift Detection on High-Dimensional Streaming Data
3
Authors: Romany F. Mansour, Shaha Al-Otaibi, Amal Al-Rasheed, Hanan Aljuaid, Irina V. Pustokhina, Denis A. Pustokhin. Computers, Materials & Continua (SCIE, EI), 2021, No. 9, pp. 2843-2858 (16 pages)
Big data streams started becoming ubiquitous in recent years, thanks to rapid generation of massive volumes of data by different applications. It is challenging to apply existing data mining tools and techniques directly in these big data streams. At the same time, streaming data from several applications results in two major problems such as class imbalance and concept drift. The current research paper presents a new Multi-Objective Metaheuristic Optimization-based Big Data Analytics with Concept Drift Detection (MOMBD-CDD) method on High-Dimensional Streaming Data. The presented MOMBD-CDD model has different operational stages such as pre-processing, CDD, and classification. MOMBD-CDD model overcomes class imbalance problem by Synthetic Minority Over-sampling Technique (SMOTE). In order to determine the oversampling rates and neighboring point values of SMOTE, Glowworm Swarm Optimization (GSO) algorithm is employed. Besides, Statistical Test of Equal Proportions (STEPD), a CDD technique, is also utilized. Finally, Bidirectional Long Short-Term Memory (Bi-LSTM) model is applied for classification. In order to improve classification performance and to compute the optimum parameters for Bi-LSTM model, GSO-based hyperparameter tuning process is carried out. The performance of the presented model was evaluated using high dimensional benchmark streaming datasets namely intrusion detection (NSL KDDCup) dataset and ECUE spam dataset. An extensive experimental validation process confirmed the effective outcome of MOMBD-CDD model. The proposed model attained high accuracy of 97.45% and 94.23% on the applied KDDCup99 Dataset and ECUE Spam datasets respectively.
Keywords: Streaming data; concept drift; classification model; deep learning; class imbalance data
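The abstract names STEPD as its drift detector without giving details. As a rough illustration of the underlying idea, the sketch below applies a two-proportion z-test with continuity correction to a classifier's accuracy on older versus recent examples. The function names, the one-sided p-value, and the warning and drift levels (0.05 and 0.003) are assumptions for illustration, not values taken from the paper.

```python
import math

def equal_proportions_p_value(correct_old, n_old, correct_recent, n_recent):
    """One-sided p-value of a two-proportion z-test with continuity correction."""
    p_hat = (correct_old + correct_recent) / (n_old + n_recent)
    spread = p_hat * (1 - p_hat) * (1 / n_old + 1 / n_recent)
    if spread == 0:
        return 1.0  # pooled accuracy is exactly 0 or 1: no evidence of change
    gap = abs(correct_old / n_old - correct_recent / n_recent)
    z = (gap - 0.5 * (1 / n_old + 1 / n_recent)) / math.sqrt(spread)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))

def drift_state(correct_old, n_old, correct_recent, n_recent,
                warn_level=0.05, drift_level=0.003):
    """Report drift only when recent accuracy is lower and the gap is significant."""
    if correct_recent / n_recent >= correct_old / n_old:
        return "stable"
    p = equal_proportions_p_value(correct_old, n_old, correct_recent, n_recent)
    if p < drift_level:
        return "drift"
    if p < warn_level:
        return "warning"
    return "stable"

# Hypothetical counts: 900 of 1000 older predictions were correct.
state_after_collapse = drift_state(900, 1000, 15, 30)  # recent accuracy 50%
state_after_dip = drift_state(900, 1000, 23, 30)       # recent accuracy about 77%
state_no_change = drift_state(900, 1000, 28, 30)       # recent accuracy about 93%
```

A detector of this kind needs only running counts of correct predictions, so it adds little overhead to a streaming pipeline.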
Learning Association Rules and Tracking the Changing Concepts on Webpages:An Effective Pornographic Websites Filtering Approach
4
Author: Jyh-Jian Sheu. Journal of Electronic Science and Technology (CAS, CSCD), 2018, No. 1, pp. 24-36 (13 pages)
We applied the decision tree algorithm to learn association rules between webpage's category (pornographic or normal) and the critical features. Based on these rules, we proposed an efficient method of filtering pornographic webpages with the following major advantages: 1) a weighted window-based technique was proposed to estimate for the condition of concept drift for the keywords found recently in pornographic webpages; 2) checking only contexts of webpages without scanning pictures; 3) an incremental learning mechanism was designed to incrementally update the pornographic keyword database.
Keywords: concept drift; data mining; decision tree; pornographic websites filtering
Subspace Clustering in High-Dimensional Data Streams:A Systematic Literature Review
5
Authors: Nur Laila Ab Ghani, Izzatdin Abdul Aziz, Said Jadid AbdulKadir. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 4649-4668 (20 pages)
Clustering high dimensional data is challenging as data dimensionality increases the distance between data points, resulting in sparse regions that degrade clustering performance. Subspace clustering is a common approach for processing high-dimensional data by finding relevant features for each cluster in the data space. Subspace clustering methods extend traditional clustering to account for the constraints imposed by data streams. Data streams are not only high-dimensional, but also unbounded and evolving. This necessitates the development of subspace clustering algorithms that can handle high dimensionality and adapt to the unique characteristics of data streams. Although many articles have contributed to the literature review on data stream clustering, there is currently no specific review on subspace clustering algorithms in high-dimensional data streams. Therefore, this article aims to systematically review the existing literature on subspace clustering of data streams in high-dimensional streaming environments. The review follows a systematic methodological approach and includes 18 articles for the final analysis. The analysis focused on two research questions related to the general clustering process and dealing with the unbounded and evolving characteristics of data streams. The main findings relate to six elements: clustering process, cluster search, subspace search, synopsis structure, cluster maintenance, and evaluation measures. Most algorithms use a two-phase clustering approach consisting of an initialization stage, a refinement stage, a cluster maintenance stage, and a final clustering stage. The density-based top-down subspace clustering approach is more widely used than the others because it is able to distinguish true clusters and outliers using projected microclusters. Most algorithms implicitly adapt to the evolving nature of the data stream by using a time fading function that is sensitive to outliers. Future work can focus on the clustering framework, parameter optimization, subspace search techniques, memory-efficient synopsis structures, explicit cluster change detection, and intrinsic performance metrics. This article can serve as a guide for researchers interested in high-dimensional subspace clustering methods for data streams.
Keywords: CLUSTERING; subspace clustering; projected clustering; data stream; stream clustering; high dimensionality; evolving data stream; concept drift
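The time fading function mentioned in this review can be sketched as follows. The exponential form 2^(-λ·Δt) is a common choice in density-based stream clustering and is assumed here; the review does not specify one. The `MicroCluster` class, the `prune` helper, and all parameter values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MicroCluster:
    weight: float = 0.0
    last_update: float = 0.0

    def fade(self, now, decay_rate):
        """Decay the weight by 2 ** (-decay_rate * elapsed time)."""
        elapsed = now - self.last_update
        self.weight *= 2.0 ** (-decay_rate * elapsed)
        self.last_update = now

    def absorb(self, now, decay_rate):
        """Fade to the current time, then count one newly arrived point."""
        self.fade(now, decay_rate)
        self.weight += 1.0

def prune(clusters, now, decay_rate, min_weight):
    """Fade every micro-cluster and keep only those still above min_weight."""
    for cluster in clusters:
        cluster.fade(now, decay_rate)
    return [c for c in clusters if c.weight >= min_weight]

# Hypothetical stream: two points arrive one time unit apart, then nothing.
mc = MicroCluster()
mc.absorb(now=0.0, decay_rate=1.0)
mc.absorb(now=1.0, decay_rate=1.0)
weight_after_two_points = mc.weight
survivors = prune([mc], now=2.0, decay_rate=1.0, min_weight=0.8)
```

Because old points lose weight automatically, clusters that stop receiving data shrink below the threshold and are dropped, which is how such algorithms follow an evolving stream without explicit change detection.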
Drift Detection Method Using Distance Measures and Windowing Schemes for Sentiment Classification
6
Authors: Idris Rabiu, Naomie Salim, Maged Nasser, Aminu Da’u, Taiseer Abdalla Elfadil Eisa, Mhassen Elnour Elneel Dalam. Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 6001-6017 (17 pages)
Textual data streams have been extensively used in practical applications where consumers of online products have expressed their views regarding online products. Due to changes in data distribution, commonly referred to as concept drift, mining this data stream is a challenging problem for researchers. The majority of the existing drift detection techniques are based on classification errors, which have higher probabilities of false-positive or missed detections. To improve classification accuracy, there is a need to develop more intuitive detection techniques that can identify a great number of drifts in the data streams. This paper presents an adaptive unsupervised learning technique, an ensemble classifier based on drift detection for opinion mining and sentiment classification. To improve classification performance, this approach uses four different dissimilarity measures to determine the degree of concept drifts in the data stream. Whenever a drift is detected, the proposed method builds and adds a new classifier to the ensemble. To add a new classifier, the total number of classifiers in the ensemble is first checked if the limit is exceeded before the classifier with the least weight is removed from the ensemble. To this end, a weighting mechanism is used to calculate the weight of each classifier, which decides the contribution of each classifier in the final classification results. Several experiments were conducted on real-world datasets and the results were evaluated on the false positive rate, miss detection rate, and accuracy measures. The proposed method is also compared with the state-of-the-art methods, which include DDM, EDDM, and PageHinkley with support vector machine (SVM) and Naive Bayes classifiers that are frequently used in concept drift detection studies. In all cases, the results show the efficiency of our proposed method.
Keywords: Data streams; sentiment analysis; concept drift; ensemble classification; adaptive window
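The ensemble maintenance described in this abstract (add a classifier on drift, evict the lowest-weight member when the size limit is reached, combine members by weight) can be sketched as below. The class name, the multiplicative weight update, and the stand-in classifiers are hypothetical; the paper's actual weighting mechanism is not given in the abstract.

```python
from collections import defaultdict

class WeightedEnsemble:
    def __init__(self, max_members):
        self.max_members = max_members
        self.members = []  # each entry is [predict_fn, weight]

    def add(self, predict_fn, weight=1.0):
        """Add a member, evicting the lowest-weight one if the limit is reached."""
        if len(self.members) >= self.max_members:
            weakest = min(range(len(self.members)),
                          key=lambda i: self.members[i][1])
            self.members.pop(weakest)
        self.members.append([predict_fn, weight])

    def update_weights(self, x, y_true, reward=1.1, penalty=0.9):
        """Assumed rule: scale a weight up after a correct vote, down otherwise."""
        for member in self.members:
            member[1] *= reward if member[0](x) == y_true else penalty

    def predict(self, x):
        """Weighted vote over all members (requires a non-empty ensemble)."""
        votes = defaultdict(float)
        for predict_fn, weight in self.members:
            votes[predict_fn(x)] += weight
        return max(votes, key=votes.get)

# Hypothetical stand-in classifiers that always return one label.
ensemble = WeightedEnsemble(max_members=2)
ensemble.add(lambda text: "positive", weight=0.4)
ensemble.add(lambda text: "negative", weight=0.7)
# A drift is detected: a new classifier joins and the 0.4 member is evicted.
ensemble.add(lambda text: "positive", weight=0.5)
label = ensemble.predict("some review text")
```

Capping the ensemble size bounds memory and prediction cost, while evicting by weight discards the members that fit the current concept worst.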
LDM-Satellite:A New Scheme for Packet Loss Classification over LEO Satellite Network
7
Authors: Ning Li, Qiaodi Zhu, Zhongliang Deng. China Communications (SCIE, CSCD), 2022, No. 12, pp. 207-215 (9 pages)
Packet loss classification has always been a hot and difficult issue in TCP congestion control research. Compared with the terrestrial network, the probability of packet loss in LEO satellite networks increases dramatically. What's more, the problem of concept drift is also more serious, which greatly affects the accuracy of the loss classification model. In this paper, we propose a new loss classification scheme based on concept drift detection and hybrid integration learning for LEO satellite networks, named LDM-Satellite, which consists of three modules: concept drift detection, lost packet cache and hybrid integration classification. So far, this is the first paper to consider the influence of concept drift on the loss classification model in satellite networks. We also innovatively use multiple base classifiers and a naive Bayes classifier as the final hybrid classifier, and a new weight algorithm for these classifiers is given. In ns-2 simulation, LDM-Satellite has a better AUC (0.9885) than the single-model machine learning classification algorithms. The accuracy of loss classification even exceeds 98%, higher than traditional TCP protocols. Moreover, compared with the existing protocols used for satellite networks, LDM-Satellite not only improves the throughput rate but also has good fairness.
Keywords: LEO Satellite Network; TCP congestion control; concept drift detection; ensemble learning; loss classification
An ensemble method for data stream classification in the presence of concept drift (cited by: 3)
8
Authors: Omid ABBASZADEH, Ali AMIRI, Ali Reza KHANTEYMOORI. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2015, No. 12, pp. 1059-1068 (10 pages)
One recent area of interest in computer science is data stream management and processing. By 'data stream', we refer to continuous and rapidly generated packages of data. Specific features of data streams are immense volume, high production rate, limited data processing time, and data concept drift; these features differentiate the data stream from standard types of data. An issue for the data stream is classification of input data. A novel ensemble classifier is proposed in this paper. The classifier uses base classifiers of two weighting functions under different data input conditions. In addition, a new method is used to determine drift, which emphasizes the precision of the algorithm. Another characteristic of the proposed method is removal of different numbers of the base classifiers based on their quality. Implementation of a weighting mechanism to the base classifiers at the decision-making stage is another advantage of the algorithm. This facilitates adaptability when drifts take place, which leads to classifiers with higher efficiency. Furthermore, the proposed method is tested on a set of standard data and the results confirm higher accuracy compared to available ensemble classifiers and single classifiers. In addition, in some cases the proposed classifier is faster and needs less storage space.
Keywords: Data stream; Classification; Ensemble classifiers; concept drift
Classifying Uncertain and Evolving Data Streams with Distributed Extreme Learning Machine (cited by: 1)
9
Authors: 韩东红, 张昕, 王国仁. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2015, No. 4, pp. 874-887 (14 pages)
Conventional classification algorithms are not well suited for the inherent uncertainty, potential concept drift, volume, and velocity of streaming data. Specialized algorithms are needed to obtain efficient and accurate classifiers for uncertain data streams. In this paper, we first introduce Distributed Extreme Learning Machine (DELM), an optimization of ELM for large matrix operations over large datasets. We then present Weighted Ensemble Classifier Based on Distributed ELM (WE-DELM), an online and one-pass algorithm for efficiently classifying uncertain streaming data with concept drift. A probability world model is built to transform uncertain streaming data into certain streaming data. Base classifiers are learned using DELM. The weights of the base classifiers are updated dynamically according to classification results. WE-DELM improves both the efficiency in learning the model and the accuracy in performing classification. Experimental results show that WE-DELM achieves better performance on different evaluation criteria, including efficiency, accuracy, and speedup.
Keywords: uncertain data stream; CLASSIFICATION; extreme learning machine; distributed computing; concept drift
Online clustering of streaming trajectories
10
Authors: Jiali MAO, Qiuge SONG, Cheqing JIN, Zhigang ZHANG, Aoying ZHOU. Frontiers of Computer Science (SCIE, EI, CSCD), 2018, No. 2, pp. 245-263 (19 pages)
With the increasing availability of modern mobile devices and location acquisition technologies, massive trajectory data of moving objects are collected continuously in a streaming manner. Clustering streaming trajectories facilitates finding the representative paths or common moving trends shared by different objects in real time. Although data stream clustering has been studied extensively in the past decade, little effort has been devoted to dealing with streaming trajectories. The main challenge lies in the strict space and time complexities of processing the continuously arriving trajectory data, combined with the difficulty of concept drift. To address this issue, we present two novel synopsis structures to extract the clustering characteristics of trajectories, and develop an incremental algorithm for the online clustering of streaming trajectories (called OCluST). It contains a micro-clustering component to cluster and summarize the most recent sets of trajectory line segments at each time instant, and a macro-clustering component to build large macro-clusters based on micro-clusters over a specified time horizon. Finally, we conduct extensive experiments on four real data sets to evaluate the effectiveness and efficiency of OCluST, and compare it with other congeneric algorithms. Experimental results show that OCluST can achieve superior performance in clustering streaming trajectories.
Keywords: streaming trajectory; synopsis data structure; concept drift; sliding window
Data streams classification with ensemble model based on decision-feedback
11
Authors: LIU Jing, XU Guo-sheng, ZHENG Shi-hui, XIAO Da, GU Li-ze. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2014, No. 1, pp. 79-85 (7 pages)
The main challenges of data streams classification include infinite length, concept-drifting, arrival of novel classes and lack of labeled instances. Most existing techniques address only some of them and ignore others. So an ensemble classification model based on decision-feedback (ECM-BDF) is presented in this paper to address all these challenges. Firstly, a data stream is divided into sequential chunks and a classification model is trained from each labeled data chunk. To address the infinite length and concept-drifting problem, a fixed number of such models constitute an ensemble model E and subsequent labeled chunks are used to update E. To deal with the appearance of novel classes and limited labeled instances problem, the model incorporates a novel class detection mechanism to detect the arrival of a novel class without training E with labeled instances of that class. Meanwhile, unsupervised models are trained from unlabeled instances to provide useful constraints for E. An extended ensemble model Ex can be acquired with the constraints as feedback information, and then unlabeled instances can be classified more accurately by satisfying the maximum consensus of Ex. Experimental results demonstrate that the proposed ECM-BDF outperforms traditional techniques in classifying data streams with limited labeled data.
Keywords: ensemble classification; novel class; concept drifting; decision-feedback