Abstract
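The main goal of data stream outlier detection is to find outliers in a data stream accurately and within a reasonable time. Traditional outlier detection algorithms can find outliers effectively in static data sets, but they are not suited to a dynamically changing data stream environment and cannot discover abnormal data in a timely and effective way. To address the requirements that the data stream environment places on outlier detection, such as real-time discovery and dynamic adjustment, and the inapplicability of traditional algorithms, a new grid-based data stream outlier detection algorithm, ODGrid, is proposed. ODGrid can find abnormal data in a data stream in real time and dynamically adjust its detection results as the data stream changes. Experiments on real and synthetic data sets demonstrate that ODGrid outperforms existing outlier detection algorithms in accuracy and speed and has good scalability.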
Traditional algorithms can only find outliers in static data sets; they are inapplicable to data streams, which makes them inefficient at finding abnormal data in a dynamic data stream environment. Because existing algorithms are inapplicable to data stream outlier detection, a new data stream outlier detection algorithm is proposed, which can not only find data stream outliers in real time but also adjust detection results dynamically according to changes in the data stream. Experimental results on real and synthetic data sets show that ODGrid is superior to existing data stream outlier detection algorithms and scales well with the dimensionality of the data space.
Source
《黑龙江大学自然科学学报》
CAS
PKU Core Journal (北大核心)
2015, No. 2, pp. 276-280 (5 pages)
Journal of Natural Science of Heilongjiang University
Funding
Science and Technology Research Project of the Heilongjiang Provincial Department of Education (12531542)