Abstract
Unlike a certain (deterministic) time series, an uncertain time series takes not a single value at each timestamp but a set of possible values, and this uncertainty makes dimensionality reduction of time series data considerably harder. Because time series are also inherently large in scale and high-dimensional, preprocessing uncertain time series is indispensable, yet existing dimensionality reduction methods designed for certain time series no longer apply. To address this problem, we build a descriptive statistical model that reduces the original uncertain time series to three certain time series. Based on this model, we propose a key-point-based linear dimensionality reduction algorithm for uncertain time series data. The algorithm takes into account both the extreme points and the turning points that characterize the series, so it reduces dimensionality efficiently while avoiding excessive noise removal. Experimental results show that combining the descriptive statistical model with the key-point-based linear dimensionality reduction algorithm reduces dimensionality effectively and generalizes well to data from different domains.
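The abstract does not say which three certain series the descriptive statistical model produces, nor how extreme points and turning points are detected. The following is only an illustrative sketch under stated assumptions: the three series are taken to be the per-timestamp minimum, mean, and maximum, an extreme point is a strict local maximum or minimum, and a turning point is a point where the slope changes by at least a threshold. The names `reduce_to_three_series`, `key_points`, and `turn_threshold` are hypothetical and do not come from the paper.

```python
from statistics import mean


def reduce_to_three_series(uncertain):
    """Collapse each timestamp's set of possible values into three
    certain series. ASSUMPTION: min / mean / max; the paper's actual
    descriptive statistics are not given in the abstract."""
    lower = [min(values) for values in uncertain]
    center = [mean(values) for values in uncertain]
    upper = [max(values) for values in uncertain]
    return lower, center, upper


def key_points(series, turn_threshold=1.0):
    """Return the indices kept after reduction: both endpoints, strict
    local extrema, and turning points.

    ASSUMPTION: a turning point is where the slope changes by at least
    `turn_threshold`. Flat plateaus are not treated as extrema here.
    """
    n = len(series)
    if n <= 2:
        return list(range(n))
    keep = [0]
    for i in range(1, n - 1):
        left = series[i] - series[i - 1]    # slope into point i
        right = series[i + 1] - series[i]   # slope out of point i
        is_extreme = left * right < 0       # direction reverses
        is_turning = abs(right - left) >= turn_threshold
        if is_extreme or is_turning:
            keep.append(i)
    keep.append(n - 1)
    return keep


if __name__ == "__main__":
    uncertain = [[1, 2, 3], [2, 4, 6], [1, 1, 1], [0, 2, 4]]
    lower, center, upper = reduce_to_three_series(uncertain)
    print(lower, center, upper)
    print(key_points([0, 1, 2, 3, 10, 3, 2, 1, 0], turn_threshold=5.0))
```

Under these assumptions, each of the three series would be reduced separately by keeping only its key points and connecting them with straight line segments. A larger `turn_threshold` keeps fewer points and therefore smooths more aggressively.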
Authors
汤其婕
朱小萍
TANG Qi-jie; ZHU Xiao-ping (School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)
Source
《计算机技术与发展》
2018, Issue 8, pp. 22-26, 31 (6 pages in total)
Computer Technology and Development
Funding
National Natural Science Foundation of China (61772269)
Keywords
uncertain time series
descriptive statistical model
key points
linear dimensionality reduction