
Multivariate Time Series Forecasting Method Based on Feature Re-Abstraction (FRA)
Abstract Industries derived from science and technology are generally subject to strong time constraints and have therefore accumulated massive volumes of high-dimensional time series data. Under this data pressure, traditional data modeling and forecasting methods are limited by data scale and attribute dimensionality. Supporting high-quality services places higher demands on intelligent big-data forecasting technology, and improving forecasting performance at the data level is the main problem to be solved at this stage. To address this problem, a feature re-abstraction (FRA) algorithm for multivariate time series data is proposed. First, the RobustSTL decomposition algorithm is used to extract trend and seasonality features (TSFs), which achieves a second-order abstraction of the features of multivariate data and replaces the traditional "label as feature" extraction strategy with "abstraction as feature". Then, the Pearson correlation coefficient is used to evaluate the strength of the correlation between the TSFs captured by re-abstraction and the target parameter, confirming the data value of the TSFs. On the basis of FRA, a data-driven multivariate time series forecasting algorithm is built by combining it with a deep learning model, and the effectiveness of FRA is verified through the forecasting results. Experimental results show that using TSFs as the training vectors of the data-driven model achieves dimensionality reduction and noise reduction while preserving strong correlation with the target, thereby avoiding overfitting, alleviating underfitting, and improving the accuracy and robustness of the time series forecasting algorithm.
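The two data-level steps described in the abstract (decompose each variable into trend and seasonal components, then score those components against the target with the Pearson coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation: RobustSTL is replaced here by a plain moving-average additive decomposition, the data are synthetic, and every function name (`moving_average_trend`, `periodic_seasonal`, `extract_tsfs`, `pearson`, `demo`) is hypothetical.

```python
import math
import random


def moving_average_trend(x, period):
    """Trend estimate: centered moving average, window truncated at the edges.
    ASSUMPTION: a simple stand-in for the RobustSTL trend component."""
    half = period // 2
    n = len(x)
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend.append(sum(x[lo:hi]) / (hi - lo))
    return trend


def periodic_seasonal(detrended, period):
    """Seasonal estimate: mean of the detrended series at each phase of the
    period, shifted so the seasonal pattern averages to zero."""
    sums, counts = [0.0] * period, [0] * period
    for i, v in enumerate(detrended):
        sums[i % period] += v
        counts[i % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    offset = sum(means) / period
    return [means[i % period] - offset for i in range(len(detrended))]


def extract_tsfs(x, period):
    """Additive decomposition x = trend + seasonal + residual."""
    if len(x) < period:
        raise ValueError("series must cover at least one full period")
    trend = moving_average_trend(x, period)
    detrended = [v - t for v, t in zip(x, trend)]
    seasonal = periodic_seasonal(detrended, period)
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def demo(n=240, period=24, seed=0):
    """Synthetic example: one input variable and a target that follows
    the input's trend but not its seasonal cycle."""
    rng = random.Random(seed)
    driver = [0.05 * t + 2.0 * math.sin(2 * math.pi * t / period)
              + rng.gauss(0, 0.3) for t in range(n)]
    target = [0.15 * t + rng.gauss(0, 0.5) for t in range(n)]
    trend, seasonal, _ = extract_tsfs(driver, period)
    return {"trend": pearson(trend, target),
            "seasonal": pearson(seasonal, target)}


if __name__ == "__main__":
    for name, r in demo().items():
        print(f"{name} feature vs target: r = {r:.3f}")
```

In a multivariate setting the same scoring would be repeated for each input variable, and the components with the strongest correlation to the target would be the candidates for training vectors. How the paper selects or thresholds them is not stated in the abstract.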
Authors WANG Hao (王昊); ZHOU Jiantao (周建涛); HAO Xinyu (郝昕毓); WANG Feiyu (王飞宇) (College of Computer Science, Inner Mongolia University, Hohhot 010021, China; National & Local Joint Engineering Research Center of Intelligent Information Processing Technology for Mongolian, Hohhot 010021, China; Engineering Research Center of Ecological Big Data, Ministry of Education, Hohhot 010021, China; Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software, Hohhot 010021, China; Inner Mongolia Key Laboratory of Social Computing and Data Processing, Hohhot 010021, China; Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, Hohhot 010021, China; Inner Mongolia Key Laboratory of Discipline Inspection and Supervision Big Data, Hohhot 010021, China; Inner Mongolia Big Data Analysis Technology Engineering Laboratory, Hohhot 010021, China)
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2023, Issue S02, pp. 650-657 (8 pages)
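Funding National Natural Science Foundation of China (62162046); Inner Mongolia Science and Technology Research Project (2021GG0155); Major Project of the Inner Mongolia Natural Science Foundation (2019ZD15); Inner Mongolia Natural Science Foundation (2019GG372); Inner Mongolia University / Inner Mongolia Autonomous Region Graduate Research Innovation Project (11200-121024).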
Keywords Multivariate time series data; Multivariate time series forecasting algorithms; Feature re-abstraction (FRA); Trend and seasonality features (TSFs); Correlation assessment
