Detection of Outlier Samples in Multivariate Time Series (多变量时间序列异常样本的识别)

Cited by: 3

Abstract: Multivariate time series (MTS) datasets are common in fields such as finance, multimedia, and medicine. An MTS sample that differs significantly from the other MTS samples in a dataset is called an outlier sample. This paper proposes a method for detecting outlier samples in an MTS dataset based on the local sparsity coefficient. An extended Frobenius norm is used to measure the similarity between two MTS samples, and k-nearest neighbor (k-NN) searches are performed with a two-phase sequential scan. MTS samples that cannot be outlier candidates are pruned, which reduces the number of computations and comparisons. Experiments are carried out on two real-world datasets, a stock market dataset and a BCI (Brain Computer Interface) dataset. The experimental results show the effectiveness of the proposed method.
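The abstract names the building blocks but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the extended Frobenius norm (Eros) in the form usually attributed to Yang and Shahabi (a weighted sum of absolute cosines between matching covariance eigenvectors, turned into a distance as sqrt(2 - 2 * Eros)) and the local sparsity ratio, pruning factor, and local sparsity coefficient in the form usually attributed to LSC-Mine (Agyemang and Ezeife). The eigenvalue weighting, the function names, and the brute-force neighbor search (used here in place of the two-phase sequential scan) are all illustrative choices.

```python
import numpy as np


def principal_axes(mts):
    """Eigen-decompose the covariance matrix of one MTS sample.

    mts has shape (time_steps, variables). Returns eigenvalues in
    descending order and the matching eigenvectors as columns.
    """
    cov = np.cov(mts, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]


def eros_distances(samples):
    """Pairwise distance sqrt(2 - 2 * Eros) between MTS samples."""
    axes = [principal_axes(s) for s in samples]
    # Assumed weighting: mean of the per-sample normalized eigenvalues.
    w = np.mean([vals / vals.sum() for vals, _ in axes], axis=0)
    w = w / w.sum()
    n = len(samples)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # |cosine| between matching eigenvectors (their sign is arbitrary).
            cos = np.abs(np.sum(axes[i][1] * axes[j][1], axis=0))
            eros = float(np.dot(w, cos))
            dist[i, j] = dist[j, i] = np.sqrt(max(0.0, 2.0 - 2.0 * eros))
    return dist


def lsc_scores(dist, k):
    """Local sparsity coefficient per sample; pruned samples get 0.0."""
    n = dist.shape[0]
    neighbors, dist_sums = [], np.empty(n)
    for p in range(n):
        others = np.array([o for o in range(n) if o != p])
        d = dist[p, others]
        k_dist = np.sort(d)[k - 1]
        nbrs = others[d <= k_dist]           # k-distance neighborhood of p
        neighbors.append(nbrs)
        dist_sums[p] = dist[p, nbrs].sum()
    sizes = np.array([len(nb) for nb in neighbors])
    eps = 1e-12                              # avoids division by zero for duplicates
    lsr = sizes / (dist_sums + eps)          # local sparsity ratio
    pf = sizes.sum() / (dist_sums.sum() + eps)  # pruning factor
    scores = np.zeros(n)
    for p in range(n):
        if lsr[p] < pf:                      # only unpruned candidates are scored
            scores[p] = np.mean(lsr[neighbors[p]] / lsr[p])
    return scores, lsr, pf
```

Calling `eros_distances` on a list of arrays of shape (time_steps, variables) and passing the result to `lsc_scores` with a chosen `k` yields one score per sample. Samples in dense regions have a high local sparsity ratio and are pruned before scoring, and among the remaining candidates a larger coefficient suggests a stronger outlier.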

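Authors: Weng Xiaoqing (翁小清), Shen Junyi (沈钧毅)

Affiliation: Institute of Computer Software, Xi'an Jiaotong University (西安交通大学计算机软件研究所)

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》; indexed in EI, CSCD, PKU Core), 2007, No. 4, pp. 463-468 (6 pages)

Funding: National Natural Science Foundation of China (No. 60173058)

Keywords: Multivariate Time Series (MTS), Local Sparsity Coefficient, Extended Frobenius Norm, Outlier Sample

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]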
