Abstract
Traffic classification of Internet of Things (IoT) devices is important for network asset management, and classification based on traffic statistics is an active research topic. Existing algorithms build feature vectors mainly from flow information and make little use of packet information. This paper improves a random-forest-based traffic classification algorithm for IoT devices by building feature vectors from both flow information and the information of the packets within each flow. Experimental results show that, compared with previous algorithms, the proposed algorithm raises the average classification accuracy from 56% to 82%, the average recall from 47% to 67%, and the average F1 score from 0.43 to 0.74; a comparison of the confusion matrices also shows a clear improvement. The proposed algorithm therefore classifies IoT device traffic more effectively than previous ones.
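The idea of combining flow-level and packet-level information in one feature vector can be illustrated with a minimal sketch. The specific features chosen here (packet count, byte count, duration, packet-size statistics, mean inter-arrival gap, sizes of the first few packets) and all function names are assumptions for illustration only; the abstract does not list the paper's actual features. The resulting vectors would then be passed to a random forest classifier, which is omitted here.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Flow-level features: packet count, total bytes, duration in seconds.

    `packets` is a time-ordered list of (timestamp_seconds, size_bytes) tuples.
    """
    sizes = [size for _, size in packets]
    times = [ts for ts, _ in packets]
    return [float(len(packets)), float(sum(sizes)), times[-1] - times[0]]


def packet_features(packets, first_n=4):
    """Packet-level features (hypothetical choice, not taken from the paper)."""
    sizes = [size for _, size in packets]
    times = [ts for ts, _ in packets]
    # Inter-arrival gaps; a single-packet flow has no gaps, so fall back to 0.0.
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    # Sizes of the first `first_n` packets, zero-padded for short flows.
    head = (sizes + [0] * first_n)[:first_n]
    stats = [mean(sizes), pstdev(sizes), min(sizes), max(sizes), mean(gaps)]
    return [float(x) for x in stats + head]


def feature_vector(packets, first_n=4):
    """Concatenate flow-level and packet-level features for one flow."""
    if not packets:
        raise ValueError("a flow must contain at least one packet")
    return flow_features(packets) + packet_features(packets, first_n)


# Example flow with three packets: (timestamp in seconds, size in bytes).
example_flow = [(0.00, 60), (0.05, 120), (0.20, 60)]
vector = feature_vector(example_flow)
```

With the default `first_n=4`, each flow yields a 12-element vector: 3 flow-level values followed by 9 packet-level values.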
Authors
李锐光
段鹏宇
沈蒙
祝烈煌
LI Ruiguang; DUAN Pengyu; SHEN Meng; ZHU Liehuang (School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China; National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China)
Source
《北京航空航天大学学报》
EI
CAS
CSCD
PKU Core (北大核心)
2022, Issue 2, pp. 233-239 (7 pages)
Journal of Beijing University of Aeronautics and Astronautics
Keywords
Internet of Things (IoT)
traffic classification algorithm
random forest
feature vector
flow information
packet information