Abstract
Field segmentation is the foundation of protocol format inference: the subsequent steps, such as message structure identification, field semantic inference, and field value constraint inference, depend heavily on its quality. Binary protocols lack character encoding and delimiters, their field lengths are flexible, and their value ranges vary widely, so segmenting their fields is difficult. To address the single-dimensional feature construction and simplistic decision rules of related work, this paper proposes a binary protocol field segmentation method based on a probabilistic model. Taking binary protocol messages as the object of study, the method first constructs field boundary constraints from dimensions such as the internal structure of a message and the variation of values across messages. It then combines these constraints probabilistically, using a factor graph model to compute the probability that each position is a boundary, and derives the most likely field boundaries from those probabilities. Experimental results show that, compared with traditional methods, the proposed method identifies binary protocol field boundaries with higher accuracy and greater robustness.
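The abstract does not give the paper's actual constraints or factor definitions, so the following is only a minimal sketch of the general idea of scoring boundary positions with a factor graph. Everything in it is an assumption made for illustration: messages are compared byte by byte over their common prefix, the only evidence is a jump in per-offset entropy between neighbouring offsets, a pairwise factor mildly discourages two adjacent boundaries, and the names `unary_factors`, `boundary_marginals`, `segment`, `eps`, and `adj_penalty` are hypothetical. Because the graph is a chain, sum-product message passing yields exact marginals.

```python
import math
from collections import Counter


def column_entropy(messages, i):
    """Shannon entropy (in bits) of the byte values observed at offset i."""
    counts = Counter(m[i] for m in messages)
    total = len(messages)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def unary_factors(messages, eps=0.05):
    """One factor [phi(no boundary), phi(boundary)] per gap between offsets.

    Assumed evidence (not taken from the paper): a large entropy jump
    between neighbouring offsets hints at a field boundary.
    """
    n = min(len(m) for m in messages)
    ent = [column_entropy(messages, i) for i in range(n)]
    jumps = [abs(ent[i] - ent[i - 1]) for i in range(1, n)]
    top = max(jumps, default=0.0)
    factors = []
    for j in jumps:
        s = j / top if top > 0 else 0.5   # scale the jump into [0, 1]
        s = min(max(s, eps), 1.0 - eps)   # keep every factor strictly positive
        factors.append([1.0 - s, s])
    return factors


def boundary_marginals(unary, adj_penalty=0.5):
    """Exact P(boundary) for each gap via sum-product on a chain factor graph."""
    k = len(unary)
    # Pairwise factor between neighbouring gaps: only (1, 1) is penalised.
    psi = [[1.0, 1.0], [1.0, adj_penalty]]
    fwd = [[1.0, 1.0] for _ in range(k)]  # messages arriving from the left
    bwd = [[1.0, 1.0] for _ in range(k)]  # messages arriving from the right
    for t in range(1, k):
        msg = [sum(fwd[t - 1][a] * unary[t - 1][a] * psi[a][b] for a in (0, 1))
               for b in (0, 1)]
        z = msg[0] + msg[1]
        fwd[t] = [v / z for v in msg]     # rescale to avoid underflow
    for t in range(k - 2, -1, -1):
        msg = [sum(bwd[t + 1][b] * unary[t + 1][b] * psi[a][b] for b in (0, 1))
               for a in (0, 1)]
        z = msg[0] + msg[1]
        bwd[t] = [v / z for v in msg]
    marginals = []
    for t in range(k):
        p0 = fwd[t][0] * unary[t][0] * bwd[t][0]
        p1 = fwd[t][1] * unary[t][1] * bwd[t][1]
        marginals.append(p1 / (p0 + p1))
    return marginals


def segment(messages, threshold=0.5, adj_penalty=0.5):
    """Offsets at which a field is inferred to start (offset 0 is implicit)."""
    marg = boundary_marginals(unary_factors(messages), adj_penalty)
    return [t + 1 for t, p in enumerate(marg) if p > threshold]
```

Calling `segment` on a list of `bytes` objects returns the byte offsets whose boundary marginal exceeds `threshold`. A fuller model would add further factors, for example for intra-message structure, in the same multiplicative way.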
Authors
YANG Zi-ji (杨资集)
PAN Yan (潘雁)
ZHU Yue-fei (祝跃飞)
LI Xiao-wei (李小伟)
Strategic Support Force Information Engineering University, Zhengzhou 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal (北大核心)
2022, No. 10, pp. 319-326 (8 pages)
Funding
National Key Research and Development Program of China (2019QY1300).
Keywords
Field segmentation
Factor graph
Probabilistic model
Protocol reverse engineering