Abstract
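Internet traffic analysis is a core means of network management and security. Traditional plaintext-based analysis methods have largely lost their effectiveness as traffic encryption becomes the prevailing trend. Although some analysis methods target encrypted traffic, they overlook the inherent logic and hierarchy among the different goals of encrypted traffic analysis and lack study of the essential characteristics of encrypted traffic, so they cannot solve the problem systematically. Starting from the practical needs of network management and security supervision, this paper first divides Internet encrypted traffic analysis into three stages according to its goals (detection, classification, and identification) and describes how the stages differ in goals and methods. It then categorizes existing detection, classification, and identification methods at multiple granularities and from multiple angles, systematically summarizing and comparing their strengths and weaknesses. Finally, building on current research and considering future trends in the Internet environment and the practical problem of concept drift in encrypted traffic, it analyzes and looks ahead to possible research directions in Internet encrypted traffic detection, classification, and identification from four aspects: improving encrypted traffic sample datasets, classification and identification under new complex network protocols, classification and identification based on application-layer features, and multi-point cooperative distributed classification and identification.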
Network traffic measurement and analysis is essential to network security management and traffic engineering. As encryption technology continues to develop, encryption of Internet traffic has become an inevitable trend. While traffic encryption brings privacy and security to users and enterprises, it also challenges network security protection and traffic management. Traditional traffic measurement and analysis methods such as Deep Packet Inspection (DPI) are not suitable for encrypted traffic, so studying encrypted traffic analysis on the Internet is of great significance. Existing research in encrypted traffic analysis is organized by the classification method used and by its input or output. There is neither a unified standard for the granularity of encrypted traffic analysis nor a systematic theoretical definition of it, and this inconsistency of concepts has, to some extent, hindered the subdivision of research directions and the refinement of work in the field. Therefore, based on the characteristics of Internet traffic and the requirements of its analysis, this paper first divides Internet encrypted traffic analysis into three stages by definition: encrypted traffic detection, encrypted traffic classification, and encrypted traffic identification, and characterizes these three stages from the perspective of users. Encrypted traffic detection is the process of screening encrypted traffic out of network traffic; it is independent of the generalized application carried by the traffic, the generalized content it transmits, and its rate, and depends only on the nature of the traffic itself. Encrypted traffic classification is the generalized application classification of encrypted traffic: given that the traffic is known to be encrypted, it classifies the generalized application the traffic carries, independently of the data transmitted. In order of increasingly fine granularity, the generalized application can be divided into service, application, and function. Encrypted traffic identification is the identification of encrypted traffic data and metadata: given that the traffic is encrypted and its application type is known, it identifies the actual payload data, the user behavior, the QoE, and other metadata corresponding to the traffic. We then analyze and compare existing Internet encrypted traffic detection, classification, and identification methods from multiple perspectives and summarize their respective advantages and disadvantages. Finally, considering the future development of the Internet environment and the problem of concept drift, we analyze and look ahead to possible research directions in the three stages of Internet encrypted traffic analysis. We summarize these directions as: improving encrypted traffic datasets; encrypted traffic classification and identification under new complex network protocols (including TLS 1.3, encrypted DNS, HTTP/2, and QUIC); encrypted traffic classification and identification based on application-layer features; and multi-point cooperative distributed encrypted traffic classification and identification.
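The staged dependency described in the abstract (classification presupposes that the traffic is known to be encrypted; identification presupposes that the application type is known) can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the paper: the names `byte_entropy`, `detect`, `analyze`, and `FlowResult`, the threshold of 7.0 bits per byte, and the choice of payload entropy as the detection test are assumptions made for illustration only. Entropy alone also cannot tell encrypted data from compressed data, so this stands in for a detector rather than being one.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of the payload in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


@dataclass
class FlowResult:
    encrypted: bool                    # stage 1: detection
    application: Optional[str] = None  # stage 2: classification
    behavior: Optional[str] = None     # stage 3: identification


def detect(payload: bytes, threshold: float = 7.0) -> bool:
    # Hypothetical heuristic: near-uniform byte distribution suggests
    # encryption. The threshold is an assumed value, not from the paper.
    return byte_entropy(payload) >= threshold


def analyze(
    payload: bytes,
    classify: Callable[[bytes], Optional[str]],
    identify: Callable[[bytes, str], Optional[str]],
) -> FlowResult:
    # Stage 2 runs only on traffic already detected as encrypted.
    if not detect(payload):
        return FlowResult(encrypted=False)
    app = classify(payload)
    # Stage 3 runs only once the application type is known.
    behavior = identify(payload, app) if app is not None else None
    return FlowResult(encrypted=True, application=app, behavior=behavior)
```

The `classify` and `identify` callables are placeholders for whatever models a real system would use; the sketch only shows that each stage consumes the result of the previous one.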
Authors
陈子涵
程光
徐子恒
徐珂雅
仇星
钮丹丹
CHEN Zi-Han; CHENG Guang; XU Zi-Heng; XU Ke-Ya; QIU Xing; NIU Dan-Dan (School of Cyber Science and Engineering, Southeast University, Nanjing 211189; Key Laboratory of Computer Network and Information Integration of Ministry of Education (Southeast University), Nanjing 211189; Jiangsu Province Engineering Research Center of Security for Ubiquitous Network (Southeast University), Nanjing 211189; International Governance Research Base of Cyberspace (Southeast University), Nanjing 211189)
Source
《计算机学报》
EI
CAS
CSCD
Peking University Core Journals
2023, Issue 5, pp. 1060-1085 (26 pages)
Chinese Journal of Computers
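Funding
National Natural Science Foundation of China, General Program (62172093)
National Key R&D Program of China, project topic (2020YFB1804604)
2019 Industrial Internet Innovation and Development Project of the Ministry of Industry and Information Technology (6709010003)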
Keywords
互联网加密流量分析
加密流量检测
加密流量分类与识别
概念漂移
复杂新型网络协议
Internet encrypted traffic analysis
encrypted traffic detection
encrypted traffic classification and identification
concept drift
new complex network protocols