
Outlier Detection Model for Data Streams Based on Attribute Associations and Match Difference Degree
Abstract To detect outliers in categorical data streams, this paper proposes AAMDD (attribute associations and match difference degree), an outlier detection model for transaction data streams. AAMDD builds an association rule library offline and updates it incrementally. It maintains the stream data in time-sensitive sliding windows (TimeSW). After each time span, the model matches the itemset contained in every record of the current window against the association rule library, computes the match difference degree (MDD), and identifies outliers online according to their MDD values. A corresponding algorithm, AAMDD-algorithm, is also given. Experimental results show that, compared with the FODFP-Stream algorithm, AAMDD-algorithm improves efficiency by 37.43% and detection precision by 5.51% on average, and its recall stays above 77%, so it can be used for outlier detection in transaction data streams.
Source Journal of Southwest Jiaotong University (《西南交通大学学报》), indexed in EI, CSCD, and the Peking University Core Journals list, 2013, No. 1, pp. 107-115 (9 pages)
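Funding National Natural Science Foundation of China (71071141); Key Project of the Zhejiang Provincial Natural Science Foundation (Z1091224); Doctoral Program Foundation of the Ministry of Education (20103326110001)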
Keywords data stream; association rule; difference degree; incremental outlier detection; concept drift
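The window-and-match step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the match difference degree, so the definition below (the confidence-weighted share of applicable rules that a record violates) is an assumption made for illustration only, as are the rule format, the `TimeSW` class layout, the sample rules, and the threshold. The offline rule mining and the incremental update of the rule library are not shown.

```python
from collections import deque

# Hypothetical rule format: (antecedent, consequent, confidence).
# In the paper the rule library is mined offline; these rules are made up.
RULES = [
    (frozenset({"bread"}), frozenset({"butter"}), 0.9),
    (frozenset({"tea"}), frozenset({"sugar"}), 0.6),
]


def match_difference_degree(items, rules):
    """Assumed MDD: confidence-weighted share of applicable rules violated.

    A rule is applicable when its antecedent is contained in the itemset,
    and violated when its consequent is not. Returns 0.0 if no rule applies.
    """
    applicable = 0.0
    violated = 0.0
    for antecedent, consequent, confidence in rules:
        if antecedent <= items:
            applicable += confidence
            if not consequent <= items:
                violated += confidence
    return violated / applicable if applicable else 0.0


class TimeSW:
    """Time-sensitive sliding window keeping records newer than `span`."""

    def __init__(self, span):
        self.span = span
        self.records = deque()  # (timestamp, frozenset of items), oldest first

    def add(self, timestamp, items):
        self.records.append((timestamp, frozenset(items)))
        # Evict records that have fallen out of the time span.
        while self.records and self.records[0][0] <= timestamp - self.span:
            self.records.popleft()

    def outliers(self, rules, threshold):
        """Return the records in the window whose assumed MDD exceeds threshold."""
        return [
            (t, items)
            for t, items in self.records
            if match_difference_degree(items, rules) > threshold
        ]


window = TimeSW(span=10)
window.add(0, {"bread", "butter"})
window.add(5, {"bread", "jam"})    # violates bread -> butter
window.add(12, {"tea", "sugar"})   # evicts the record at time 0
print([t for t, _ in window.outliers(RULES, threshold=0.5)])  # [5]
```

Weighting by rule confidence means that breaking a strong rule counts for more than breaking a weak one. This is one plausible design choice and may differ from the MDD defined in the paper.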

