Multivariate Time Series Forecasting Method Based on Feature Re-Abstraction (FRA)


Abstract: Industries derived from science and technology are typically subject to strong time constraints and have therefore accumulated massive amounts of high-dimensional time series data. This data pressure leaves traditional modeling and forecasting methods constrained by data scale and attribute dimensionality. Supporting high-quality services places higher demands on intelligent big-data forecasting technology, and improving forecasting performance at the data level is a pressing problem. To address it, a feature re-abstraction (FRA) algorithm for multivariate time series data is proposed. First, the RobustSTL decomposition algorithm extracts trend and seasonality features (TSFs), yielding a second-order abstraction of the features of the multivariate data and replacing the traditional "labels as features" extraction strategy with "abstractions as features". Next, the Pearson correlation coefficient is used to assess the strength of correlation between the TSFs captured by re-abstraction and the target parameters, confirming the data value of the TSFs. Building on the FRA algorithm, a data-driven multivariate time series forecasting algorithm is constructed by combining it with a deep learning model, and the forecasting results are used to verify the effectiveness of FRA. Experimental results show that using TSFs as training vectors for the data-driven model combines dimensionality reduction and noise reduction with the preservation of strongly correlated characteristics, which avoids overfitting, alleviates underfitting, and improves the accuracy and robustness of the time series forecasting algorithm.
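The two data-level steps named in the abstract (decompose each variable into trend and seasonality features, then score each feature against the target with the Pearson correlation coefficient) can be sketched as follows. This is a minimal illustration and not the authors' implementation: a centered moving average with per-phase averaging stands in for RobustSTL, the function names (`moving_average_trend`, `seasonal_component`, `extract_tsfs`, `rank_tsfs`) are hypothetical, the seasonal period is assumed to be known in advance, and the demo data is synthetic.

```python
import math


def moving_average_trend(series, period):
    """Centered moving-average trend (a simple stand-in for RobustSTL's trend)."""
    n = len(series)
    half = period // 2
    trend = []
    for i in range(n):
        # The window is truncated at the series boundaries.
        window = series[max(0, i - half):min(n, i + half + 1)]
        trend.append(sum(window) / len(window))
    return trend


def seasonal_component(series, trend, period):
    """Seasonality estimated as the mean detrended value at each phase."""
    if len(series) < period:
        raise ValueError("series must cover at least one full period")
    detrended = [x - t for x, t in zip(series, trend)]
    phase_means = []
    for phase in range(period):
        values = detrended[phase::period]
        phase_means.append(sum(values) / len(values))
    return [phase_means[i % period] for i in range(len(series))]


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def extract_tsfs(variables, period):
    """Map each named variable to its trend and seasonality features."""
    tsfs = {}
    for name, series in variables.items():
        trend = moving_average_trend(series, period)
        tsfs[name + "_trend"] = trend
        tsfs[name + "_seasonal"] = seasonal_component(series, trend, period)
    return tsfs


def rank_tsfs(tsfs, target):
    """Rank features by absolute Pearson correlation with the target."""
    scores = {name: pearson(feature, target) for name, feature in tsfs.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Synthetic example: one variable with a linear trend plus a 12-step cycle.
    steps = range(48)
    load = [0.5 * i + 5 * math.sin(2 * math.pi * i / 12) for i in steps]
    target = [float(i) for i in steps]
    for name, score in rank_tsfs(extract_tsfs({"load": load}, period=12), target):
        print(name, round(score, 3))
```

Features that rank highly under this kind of scoring would then be the candidates to feed to the forecasting model as training vectors in place of the raw variables; the choice of deep learning model is outside the scope of this sketch.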

Authors: WANG Hao; ZHOU Jiantao; HAO Xinyu; WANG Feiyu

Affiliations: College of Computer Science, Inner Mongolia University, Hohhot 010021, China; National & Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian, Hohhot 010021, China; Engineering Research Center of Ecological Big Data, Ministry of Education, Hohhot 010021, China; Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software, Hohhot 010021, China; Inner Mongolia Key Laboratory of Social Computing and Data Processing, Hohhot 010021, China; Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, Hohhot 010021, China; Inner Mongolia Key Laboratory of Discipline Inspection and Supervision Big Data, Hohhot 010021, China

Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core Journals list), 2023, Issue S02, pp. 650-657 (8 pages)

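Funding: National Natural Science Foundation of China (62162046); Inner Mongolia Science and Technology Key Project (2021GG0155); Major Project of the Natural Science Foundation of Inner Mongolia (2019ZD15); Natural Science Foundation of Inner Mongolia (2019GG372); Inner Mongolia University / Inner Mongolia Autonomous Region Graduate Research Innovation Project (11200-121024).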

Keywords: multivariate time series data; multivariate time series forecasting algorithms; feature re-abstraction (FRA); trend and seasonality features (TSFs); correlation assessment

Classification: TP311.1 [Automation and Computer Technology: Computer Software and Theory]



