Abstract
In complex and changing network environments, traditional algorithms cannot mine abnormal data effectively; their accuracy is low, which seriously threatens the security of normal network data. This paper proposes an abnormal data mining algorithm based on semi-structured time series. First, a proximity algorithm extracts features from the semi-structured time series data; the features are integrated into a feature data series, and its information entropy is calculated to obtain a spatial signal of data entropy. Second, a linear slope algorithm reduces the dimension of the spatial signal, and a piecewise aggregate symbol transform converts the reduced signal into symbols, distinguishing normal data from abnormal data. Finally, a frequent itemset algorithm mines the abnormal data. Experimental results show that the semi-structured time series method is computationally simple, mines abnormal data effectively with high accuracy and good robustness, and thereby helps protect the security of normal network data.
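The entropy and symbol-conversion steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the sliding-window entropy, the equal-length segment averaging, and the SAX-style breakpoints (-0.67, 0, 0.67 for a four-letter alphabet) are assumptions chosen for illustration, and all function names are hypothetical. The proximity-based feature extraction, the linear slope reduction, and the frequent itemset mining are omitted.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sliding_entropy(values, window):
    """Entropy of each length-`window` slice, giving an entropy signal."""
    return [shannon_entropy(values[i:i + window])
            for i in range(len(values) - window + 1)]


def paa(signal, segments):
    """Piecewise aggregation: mean of each of `segments` equal-length chunks."""
    n = len(signal)
    if n % segments != 0:
        raise ValueError("signal length must be divisible by segments")
    size = n // segments
    return [sum(signal[i:i + size]) / size for i in range(0, n, size)]


def symbolize(values, breakpoints=(-0.67, 0.0, 0.67), alphabet="abcd"):
    """Z-normalise the values, then map each one to a letter by breakpoint bin.

    The breakpoints are the usual SAX choice for four symbols (assumed here).
    """
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        std = 1.0  # constant input: every value maps to the same letter
    letters = []
    for v in values:
        z = (v - mean) / std
        index = sum(z > b for b in breakpoints)  # number of breakpoints below z
        letters.append(alphabet[index])
    return "".join(letters)


if __name__ == "__main__":
    # Hypothetical discretised feature series; the tail is more irregular.
    features = [0, 0, 0, 0, 0, 1, 0, 1]
    entropy_signal = sliding_entropy(features, window=4)
    print(entropy_signal)
    print(symbolize(paa([1, 1, 3, 3, 5, 5], segments=3)))
```

In this sketch, windows with irregular values yield higher entropy than constant windows, and the symbol string is the kind of discrete sequence that a frequent itemset miner could then process.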
Author
YANG Liu (杨柳) (Chongqing Nanfang Translators College, SISU, Chongqing 401120, China)
Source
Computer Simulation (《计算机仿真》)
Peking University Core Journal (北大核心)
2020, Issue 10, pp. 230-234 (5 pages)
Keywords
Semi-structured
Time series
Abnormal data
Data mining algorithm
Information entropy value