Abstract
In recent years, time series classification methods based on symbolic representation have received wide attention, but most existing methods do not use class label information when converting the raw data into symbols. This paper proposes a symbolic representation method for time series based on linear discriminant analysis (LDA). To maximize between-class separability, LDA is first used to reduce the dimensionality of the original data set. Information gain is then used to find the symbol projection intervals for the reduced data, and multiple coefficient binning (MCB) is used to represent the reduced data as symbol sequences. On 20 time series data sets, the method achieves better classification results than existing methods, indicating that a supervised symbolic representation can effectively improve classification performance.
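The information-gain step can be illustrated with a minimal sketch. It assumes the LDA projection has already been applied, so each series is reduced to a single coefficient, and it finds only one interval boundary per coefficient. The function names (`entropy`, `best_split`, `to_symbol`) and the toy data are hypothetical and are not taken from the paper, whose exact binning procedure is not described in this record.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, gain) for the binary split with the highest information gain."""
    pairs = sorted(zip(values, labels))
    all_labels = [lab for _, lab in pairs]
    base = entropy(all_labels)
    best_threshold, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can be placed between equal values
        left, right = all_labels[:i], all_labels[i:]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best_gain:
            # Place the boundary midway between the two neighbouring values.
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_gain = gain
    return best_threshold, best_gain


def to_symbol(value, thresholds, alphabet="abcd"):
    """Map a value to the symbol of the interval it falls into."""
    index = sum(value > t for t in sorted(thresholds))
    return alphabet[index]


# Hypothetical toy data: one projected coefficient per series, with class labels.
projected = [-2.1, -1.7, -1.2, 0.9, 1.4, 2.0]
classes = ["A", "A", "A", "B", "B", "B"]

threshold, gain = best_split(projected, classes)
symbols = [to_symbol(v, [threshold]) for v in projected]
```

In this toy case the two classes are perfectly separated, so the chosen boundary falls between -1.2 and 0.9 and each class maps to its own symbol. A larger alphabet would need several boundaries per coefficient, for example by applying the split recursively within each interval.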
Authors
武天鸿
翁小清
单中南
Wu Tianhong; Weng Xiaoqing; Shan Zhongnan (College of Information Technology, Hebei University of Economics and Business, Shijiazhuang 050061, Hebei, China)
Source
《计算机应用与软件》
Peking University Core Journal
2020, Issue 2, pp. 259-265, 307 (8 pages)
Computer Applications and Software
Keywords
Time series classification
Linear discriminant analysis
Symbolic representation