Abstract
Position data collected from Beidou user terminals is prone to redundancy. This paper analyzes the causes of that redundancy and proposes a redundant-data compression algorithm based on time-series clustering. The algorithm uses a density-based clustering method to partition the data set into clusters, so that position data sharing the same movement characteristics fall into one cluster. It then uses the cluster diameter to decide whether a cluster is redundant data, and compresses the redundant data. Experimental results show that the algorithm correctly identifies redundant data and achieves data compression.
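The pipeline the abstract describes (cluster time-ordered fixes, test each cluster's diameter, compress the redundant clusters) can be illustrated with a minimal Python sketch. It is not the paper's algorithm. A simple time-ordered neighbour chaining stands in for the density-based clustering, and the `(timestamp, x, y)` record layout, the parameters `eps`, `max_diameter` and `min_pts`, and the choice to keep only the first and last fix of a redundant cluster are all assumptions made for illustration.

```python
from math import hypot
from typing import List, Tuple

# Hypothetical record layout: (timestamp, x, y), with x and y in metres.
Fix = Tuple[float, float, float]


def split_into_runs(fixes: List[Fix], eps: float) -> List[List[Fix]]:
    """Chain time-ordered fixes into runs: a fix joins the current run
    when it lies within eps of the previous fix (assumed stand-in for
    density-based clustering)."""
    runs: List[List[Fix]] = []
    for fix in fixes:
        if runs:
            prev = runs[-1][-1]
            if hypot(fix[1] - prev[1], fix[2] - prev[2]) <= eps:
                runs[-1].append(fix)
                continue
        runs.append([fix])
    return runs


def diameter(run: List[Fix]) -> float:
    """Largest pairwise distance within a run (O(n^2), fine for a sketch)."""
    return max(
        (hypot(a[1] - b[1], a[2] - b[2])
         for i, a in enumerate(run) for b in run[i + 1:]),
        default=0.0,
    )


def compress(fixes: List[Fix], eps: float, max_diameter: float,
             min_pts: int = 3) -> List[Fix]:
    """Replace each small-diameter run with its first and last fix
    (assumed compression rule); keep every other fix unchanged."""
    out: List[Fix] = []
    for run in split_into_runs(fixes, eps):
        if len(run) >= min_pts and diameter(run) <= max_diameter:
            out.extend([run[0], run[-1]])  # redundant: keep the dwell endpoints
        else:
            out.extend(run)  # not redundant: keep as-is
    return out
```

Under these assumptions, a terminal that stays in one place while reporting slightly jittering positions produces one long run with a small diameter, which collapses to two fixes marking the start and end of the dwell. Fixes recorded while moving form runs that are too short or too wide and pass through untouched.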
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
2012, No. 4, pp. 40-42 (3 pages)
Keywords
redundant data
time series data
clustering
data compression