Abstract
This paper proposes a machine-learning-based scheme for identifying App traffic under the ShadowSocksR (SSR) proxy. The goal is to determine which App generated the ShadowSocksR proxy traffic produced by a smartphone. The scheme consists of three steps: traffic preprocessing, feature extraction, and model construction. First, the set of packets corresponding to the smartphone's ShadowSocksR traffic is divided into fine-grained flow groups according to two kinds of information: packet inter-arrival time, and source and destination IP addresses and ports. Flow groups that contain few packets are then filtered out, in order to remove noise traffic generated by background Apps or the smartphone operating system that would interfere with identification. Next, feature vectors are extracted from the filtered flow groups, covering packet length statistical and distribution features, time statistical features, packet frequency features, packet filtering ratio features, and features that combine preceding and following flows. These vectors form a feature matrix, which is fed into a machine learning algorithm to obtain the App traffic identification model. ShadowSocksR traffic to be identified goes through the same processing steps to produce a feature matrix, which is then input to the App traffic identification model to obtain the identification result. Experimental results show that the method achieves an accuracy of more than 97% for App traffic identification under the ShadowSocksR proxy.
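The preprocessing and feature-extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Packet` record, the function names, the 1-second gap threshold, and the 3-packet minimum are all hypothetical choices made for illustration, and the flow key here is directional for simplicity. The "filtering ratio" is interpreted as the fraction of packets discarded, which is an assumption. Only a handful of length and time statistics are computed, not the full feature set the paper lists.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; field names are illustrative."""
    ts: float      # arrival time in seconds
    src: str       # source IP address
    sport: int     # source port
    dst: str       # destination IP address
    dport: int     # destination port
    length: int    # packet length in bytes


def split_flows(packets, gap=1.0):
    """Group packets by (src, sport, dst, dport), then cut each group
    wherever the inter-arrival time exceeds `gap` seconds (assumed value)."""
    by_key = defaultdict(list)
    for p in sorted(packets, key=lambda p: p.ts):
        by_key[(p.src, p.sport, p.dst, p.dport)].append(p)

    flows = []
    for pkts in by_key.values():
        current = [pkts[0]]
        for prev, cur in zip(pkts, pkts[1:]):
            if cur.ts - prev.ts > gap:
                flows.append(current)   # close the flow at a long pause
                current = []
            current.append(cur)
        flows.append(current)
    return flows


def filter_flows(flows, min_packets=3):
    """Drop flows with fewer than `min_packets` packets (assumed threshold)
    and return the kept flows plus the fraction of packets discarded."""
    kept = [f for f in flows if len(f) >= min_packets]
    total = sum(len(f) for f in flows)
    kept_count = sum(len(f) for f in kept)
    ratio = 1 - kept_count / total if total else 0.0
    return kept, ratio


def flow_features(flow):
    """A few length and time statistics for one flow: min, max, mean and
    standard deviation of packet length, mean inter-arrival time, and
    packets per second."""
    lengths = [p.length for p in flow]
    gaps = [b.ts - a.ts for a, b in zip(flow, flow[1:])]
    duration = flow[-1].ts - flow[0].ts
    return [
        min(lengths),
        max(lengths),
        mean(lengths),
        pstdev(lengths),
        mean(gaps) if gaps else 0.0,
        len(flow) / duration if duration > 0 else 0.0,
    ]


def build_feature_matrix(packets, gap=1.0, min_packets=3):
    """Run the three steps and return one feature row per kept flow,
    with the filtering ratio appended to every row."""
    kept, ratio = filter_flows(split_flows(packets, gap), min_packets)
    return [flow_features(f) + [ratio] for f in kept]
```

The resulting rows could be passed, together with App labels, to any standard classifier. Traffic to be identified would go through the same `build_feature_matrix` call before prediction, mirroring the "same processing steps" requirement in the abstract.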
Authors
郭刚
杨超
陈明哲
马建峰
GUO Gang; YANG Chao; CHEN Mingzhe; MA Jianfeng (School of Cyber Engineering, Xidian University, Xi’an 710071, China)
Source
《西安电子科技大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2023, No. 2, pp. 138-146 (9 pages)
Journal of Xidian University
Funding
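National Natural Science Foundation of China, Young Scientists Fund (61906143, 61702398)
Key Research and Development Program of Shaanxi Province (Key Industry Innovation Chain) (2018ZDCXL-G-9-5)
Shaanxi Province 2022 Natural Science Basic Research Program, Youth Project (2022JQ-658).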