Abstract
To address the problem that clustering-based intrusion detection methods typically require manually specified parameters, a new unsupervised clustering algorithm is proposed. The method needs no manually set parameters and is unaffected by the order in which the data are input. Clusters may take arbitrary shapes, so they reflect the actual distribution of the data more faithfully. The algorithm compares the distances between samples in an unlabeled training set and, exploiting the property that the nearest samples are merged into a cluster first, aggregates them step by step. At the end of each clustering step, it compares the inter-cluster distances again and computes the proportion of all data that falls in each cluster in order to identify the anomalous (intrusion) clusters. The identified clusters can then be used to detect intrusions in real data. Experiments show that, for detecting unknown intrusions, the algorithm achieves a detection rate of 89.5% and a false alarm rate of 0.4%.
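The idea of merging nearest samples first and then judging clusters by their share of the data can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state how merging stops or how a cluster's share is turned into an anomaly decision, so the sketch assumes two parameter-free heuristics of its own (cut at the largest jump between consecutive merge distances; flag clusters whose share is below the average share `1/k`). The function name `cluster_and_flag` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def cluster_and_flag(points):
    """Nearest-first (single-linkage) clustering; returns [(member_indices, is_anomalous)]."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, closest pairs first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    # Record every merge that actually joins two different clusters.
    merges = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            merges.append((d, i, j))

    # ASSUMPTION: stop merging at the largest jump between consecutive merge distances.
    dists = [d for d, _, _ in merges]
    gaps = [dists[k + 1] - dists[k] for k in range(len(dists) - 1)]
    cut = gaps.index(max(gaps)) + 1 if gaps else len(merges)

    # Rebuild the clusters using only the merges made before the cut.
    parent[:] = range(n)
    for _, i, j in merges[:cut]:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    clusters = sorted(groups.values(), key=len, reverse=True)

    # ASSUMPTION: a cluster holding less than the average share (1/k) is anomalous.
    threshold = 1 / len(clusters)
    return [(c, len(c) / n < threshold) for c in clusters]


if __name__ == "__main__":
    # Six tightly grouped "normal" records and two distant ones (made-up data).
    pts = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (2, 0), (10, 10), (10, 11)]
    for members, anomalous in cluster_and_flag(pts):
        print(members, anomalous)
    # [0, 1, 2, 3, 4, 5] False
    # [6, 7] True
```

Because the merges follow sorted pairwise distances, the resulting clusters do not depend on the order of the input records, and chaining nearest neighbours lets clusters take arbitrary shapes, which matches the properties the abstract claims. The cut and labelling rules, however, are stand-ins chosen only to keep the sketch free of tunable parameters.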
Source
Journal of Nanjing University of Science and Technology (《南京理工大学学报》)
EI
CAS
CSCD
PKU Core
2009, No. 3, pp. 288-292 (5 pages)
Funding
Natural Science Foundation of Jiangsu Province (BK2008403)
Keywords
intrusion detection
computer crime
detectors
Internet
network security
unsupervised clustering
unlabeled data