A Density-Neighbors-Based Incremental Outlier Detection Algorithm (cited by: 3)

Abstract: To address incremental outlier detection when a dataset is updated, a density-neighbors-based incremental outlier detection algorithm is proposed. When the dataset is updated, the algorithm first identifies the affected objects and then builds a density neighbor sequence for each of them from the changes in k-density between the object and its neighbors. From the density neighbor sequence cost (DNSC) of an object and the average DNSC over its k-distance neighborhood, the algorithm computes the incremental outlier factor (IOF) of each affected object; the IOF value indicates the degree to which the object is an outlier, which improves the effectiveness of incremental outlier detection. Moreover, because only the IOF values of the affected objects need to be recalculated, the algorithm also speeds up outlier detection. Experimental results show that the proposed algorithm achieves better incremental outlier detection quality than previous incremental algorithms while reducing running time.
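The pipeline described in the abstract (k-density, density neighbor sequence, DNSC, IOF, and recomputation restricted to affected objects) can be illustrated with a rough sketch. The abstract does not give the paper's formulas, so every definition below is an assumption made only for illustration: k-density is taken as the inverse of the k-distance, the density neighbor sequence orders the k nearest neighbors by how close their k-density is to the object's own, DNSC sums the k-density gaps along that sequence, and the affected objects after an insertion are taken to be the new object plus the objects that have it among their k nearest neighbors. All function names are hypothetical, and the sketch assumes the dataset holds more than k points.

```python
import math

EPS = 1e-12  # guards against division by zero (duplicate points, zero costs)


def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def k_density(points, i, k):
    """ASSUMPTION: k-density = 1 / (distance to the k-th nearest neighbor)."""
    kth = knn(points, i, k)[-1]
    return 1.0 / (math.dist(points[i], points[kth]) + EPS)


def dnsc(points, i, k, dens):
    """ASSUMPTION: density neighbor sequence cost.

    The k neighbors are ordered by how close their k-density is to that of
    object i; the cost is the sum of k-density gaps along the chain.
    """
    seq = sorted(knn(points, i, k), key=lambda j: abs(dens[j] - dens[i]))
    chain = [i] + seq
    return sum(abs(dens[a] - dens[b]) for a, b in zip(chain, chain[1:]))


def iof(points, i, k, cost):
    """IOF as the object's DNSC over the mean DNSC of its k neighbors."""
    nbrs = knn(points, i, k)
    avg = sum(cost[j] for j in nbrs) / len(nbrs)
    return cost[i] / (avg + EPS)


def iof_scores(points, k):
    """Batch computation of the (assumed) IOF for every object."""
    n = len(points)
    dens = [k_density(points, i, k) for i in range(n)]
    cost = [dnsc(points, i, k, dens) for i in range(n)]
    return [iof(points, i, k, cost) for i in range(n)]


def affected_by_insert(points, new, k):
    """ASSUMPTION: the new object plus every object that now has it
    among its k nearest neighbors."""
    return {new} | {i for i in range(len(points))
                    if i != new and new in knn(points, i, k)}


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
    print(iof_scores(pts, 2))
    print(affected_by_insert(pts, 5, 2))
```

On this toy dataset the distant point at index 5 receives the largest score, and it is the only affected object when treated as the newly inserted one, so under these assumptions a single IOF value would need to be computed rather than all six. A faithful implementation would also have to follow the paper's own rule for propagating changes to objects whose neighbors' densities changed, which the abstract does not spell out.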

Authors: Cao Hui (曹晖), Si Gangquan (司刚全), Zhang Yanbin (张彦斌), Jia Lixin (贾立新)

Affiliation: School of Electrical Engineering, Xi'an Jiaotong University

Source: Pattern Recognition and Artificial Intelligence (模式识别与人工智能), indexed in EI, CSCD, and the Peking University Core Journals list; 2009, No. 6, pp. 931-935 (5 pages)

Funding: National 863 Program of China (No. 2006AA04Z180)

Keywords: Outlier Detection, Incremental Algorithm, Density Neighbor, Incremental Outlier Factor (IOF)

Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]
