基于离散序列报文的协议格式特征可变域挖掘算法 (cited by: 1)

Separate Protocol Message-Based Format Signature Construction Method for Variable Field
Abstract: Existing format signature extraction algorithms cannot mine protocol format signatures whose values vary. To address this, a variable field signature construction algorithm (VFSC) based on separate protocol messages is proposed. VFSC extracts variable-field format signatures automatically from a protocol's separate messages rather than from flows. First, VFSC clusters the separate messages. Then, within each message cluster, it extracts variable-field keywords using VariableSpan, a modified frequent pattern mining algorithm. Finally, it obtains the variable-field format signatures by filtering and selecting among those keywords with the heuristic rules proposed in the paper. Simulation results show that, when classifying at the granularity of a single message, VFSC reaches an accuracy above 95% for each of seven protocols, and a comparison with Apriori shows that it can classify new kinds of messages. Experimental results indicate that VFSC does not depend on complete sessions and can discover and classify new message types, which makes it better suited to practical settings where reception constraints leave session information and training datasets incomplete.
Authors: LI Yang (李阳), LI Qing (李青), ZHANG Xia (张霞) (Information Engineering University, Zhengzhou 450001, China)
Affiliation: Information Engineering University
Source: Journal of Information Engineering University (《信息工程大学学报》), 2018, No. 1, pp. 30-38 (9 pages)
Funding: Research fund project (2014500901)
Keywords: separate protocol message; format signature; variable field
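
To make the idea of a format signature with a variable field concrete, the sketch below derives a positional signature from one cluster of text messages and uses it to classify single messages. It is a deliberately simplified illustration and not the paper's VFSC or VariableSpan: whitespace tokenization, the `min_support` threshold, and the names `build_signature`, `matches`, and `WILDCARD` are all assumptions introduced here.

```python
from collections import Counter

WILDCARD = None  # hypothetical marker for a variable field in a signature


def build_signature(cluster, min_support=0.9):
    """Derive a positional signature from one cluster of messages.

    A token position is kept as a fixed keyword when a single token value
    occurs there in at least `min_support` of the messages; otherwise the
    position is recorded as a variable field (WILDCARD).
    """
    if not cluster:
        raise ValueError("cluster must contain at least one message")
    tokenized = [msg.split() for msg in cluster]
    # Only positions present in every message are considered.
    length = min(len(tokens) for tokens in tokenized)
    signature = []
    for pos in range(length):
        counts = Counter(tokens[pos] for tokens in tokenized)
        token, freq = counts.most_common(1)[0]
        is_fixed = freq / len(tokenized) >= min_support
        signature.append(token if is_fixed else WILDCARD)
    return signature


def matches(signature, message):
    """Check a single message against a signature, ignoring variable fields."""
    tokens = message.split()
    if len(tokens) < len(signature):
        return False
    return all(s is WILDCARD or s == t for s, t in zip(signature, tokens))


# Toy cluster of HTTP-like request lines (made-up data for illustration).
cluster = [
    "GET /index.html HTTP/1.1",
    "GET /a.png HTTP/1.1",
    "GET /b/c HTTP/1.1",
]
signature = build_signature(cluster)
```

Because matching works on one message at a time, a signature of this kind can be applied without reassembling a complete session, which is the property the abstract emphasizes. A real implementation would operate on byte sequences and mine keywords at arbitrary offsets rather than fixed token positions.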
