Abstract
To address the large amount of redundant information in network intrusion detection data and the weakness of traditional clustering algorithms at detecting outliers, an intrusion detection algorithm based on principal component analysis (PCA) and semi-supervised clustering is proposed. First, PCA is used to extract features from the data and eliminate redundant attributes. Then, a small number of labeled samples and pairwise constraints are exploited, and competitive agglomeration is introduced so that the system learns actively, enabling the detection of a large number of unlabeled samples. Experimental results on an intrusion detection data set and UCI benchmark data sets show that the algorithm effectively improves system performance.
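The two-stage pipeline described in the abstract (PCA feature extraction followed by clustering guided by a few labels and pairwise constraints) can be illustrated with a minimal sketch. This is not the paper's algorithm: competitive agglomeration is omitted, and a plain seeded k-means with greedy constraint checks stands in for the clustering step. All function names, parameters, and the synthetic data below are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def violates(i, c, assign, must_link, cannot_link):
    """True if putting sample i in cluster c breaks a constraint already decided."""
    for a, b in must_link:
        j = b if a == i else a if b == i else None
        if j is not None and assign[j] not in (-1, c):
            return True
    for a, b in cannot_link:
        j = b if a == i else a if b == i else None
        if j is not None and assign[j] == c:
            return True
    return False

def constrained_kmeans(X, seeds, must_link=(), cannot_link=(), n_iter=20):
    """Seeded k-means with pairwise constraints (illustrative stand-in).

    seeds maps a sample index to its known class label (the few labeled samples).
    Returns, for each sample, an index into the sorted list of seed classes.
    """
    classes = sorted(set(seeds.values()))
    # Initial centers come from the labeled samples of each class.
    centers = np.array([
        X[[i for i, lab in seeds.items() if lab == cls]].mean(axis=0)
        for cls in classes
    ])
    assign = np.full(len(X), -1)
    for _ in range(n_iter):
        new = np.full(len(X), -1)
        for i, lab in seeds.items():          # labeled samples keep their class
            new[i] = classes.index(lab)
        for i in range(len(X)):
            if new[i] != -1:
                continue
            order = np.argsort(((centers - X[i]) ** 2).sum(axis=1))
            allowed = [c for c in order
                       if not violates(i, c, new, must_link, cannot_link)]
            # Fall back to the nearest center if every cluster violates a constraint.
            new[i] = allowed[0] if allowed else order[0]
        if np.array_equal(new, assign):
            break
        assign = new
        centers = np.array([X[assign == c].mean(axis=0)
                            for c in range(len(classes))])
    return assign

# Synthetic data (assumed): two groups in 2-D, padded with 3 redundant columns.
rng = np.random.default_rng(0)
base = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
                  rng.normal(10.0, 0.3, (30, 2))])
X = np.hstack([base, base, base[:, :1]])
Z = pca_reduce(X, n_components=2)
assign = constrained_kmeans(Z, seeds={0: 0, 30: 1},
                            must_link=[(2, 3)], cannot_link=[(1, 31)])
```

In this sketch the redundant columns are exact copies, so two principal components retain all of the variance; on real intrusion detection data the number of components would have to be chosen, for example from the explained-variance ratio.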
Source
Journal of Shandong University (Engineering Science)
CAS
PKU Core Journal
2012, No. 5, pp. 41-46 (6 pages)
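Funding
Natural Science Foundation of Jiangsu Higher Education Institutions (05KJD52006)
Research Fund of Jiangsu University of Science and Technology (2005DX006J)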
Keywords
intrusion detection
principal component analysis (PCA)
semi-supervised clustering
pairwise constraints
competitive agglomeration