Abstract
When packet data belong to an unknown protocol, the relevant features cannot be obtained from a protocol specification, so traditional pattern-matching methods run into difficulty during packet extraction and protocol identification. This paper proposes a scheme, based on data mining theory, for extracting the fingerprint characteristics of datagrams. Self-adaptive weights are introduced into the mining of characteristic sequences: sequence patterns in the source data are counted with these weights to obtain a decision. Lift is then used to verify the association rules among the characteristic sequences, and the fingerprint characteristics of the datagram are output. Finally, ARP broadcast frames and ICMP packets are used as raw data to test the extraction. The experimental results show that introducing self-adaptive weights effectively reduces interference from redundant data segments in the packets, improves the accuracy of fingerprint characteristic extraction, and gives some robustness to changes in packet length.
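The lift measure of association-rule mining, which is the quantity the abstract applies when verifying rules between characteristic sequences, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `lift`, the choice to treat each packet as a byte string and to estimate probabilities by substring containment, and the sample packets are all assumptions made for illustration. The self-adaptive weighting step is not shown because the abstract does not specify it.

```python
def lift(packets, a, b):
    """Lift of the rule a -> b for two byte sequences a and b.

    lift = P(a and b) / (P(a) * P(b)), where each probability is estimated
    as the fraction of packets containing the sequence. A value well above
    1 suggests the sequences co-occur more often than chance would predict.
    """
    n = len(packets)
    if n == 0:
        raise ValueError("no packets")
    has_a = sum(1 for p in packets if a in p)
    has_b = sum(1 for p in packets if b in p)
    has_both = sum(1 for p in packets if a in p and b in p)
    if has_a == 0 or has_b == 0:
        return 0.0  # rule is undefined when either side never occurs
    return (has_both / n) / ((has_a / n) * (has_b / n))


# Hypothetical sample data: made-up byte strings, not real captures.
packets = [
    bytes.fromhex("ffffffffffff0806000108"),
    bytes.fromhex("ffffffffffff0806000207"),
    bytes.fromhex("0011223344550800450000"),
    bytes.fromhex("0011223344550800450011"),
]
score = lift(packets, bytes.fromhex("ffffffffffff"), bytes.fromhex("0806"))
```

In this toy sample both sequences appear in the same half of the packets, so the lift is greater than 1; a candidate pair whose lift stays near or below 1 would be a weak basis for a fingerprint.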
Source
Computer Measurement & Control (《计算机测量与控制》)
Peking University Core Journal list (北大核心)
2014, Issue 7, pp. 2288-2290, 2294 (4 pages)
Funding
National Natural Science Foundation of China (61202490)
Keywords
weights
self-adaptive
unknown protocol
fingerprint characteristics
bit stream