Abstract
Current traffic classification schemes are too coarse, and their identification results too imprecise, to describe network communication in detail. This paper proposes a classification method based on characteristics of activity status to accurately identify user activities in traffic data. It defines user activities in network communication and analyzes their characteristics, combines vector quantization with a topic model to extract characteristics of activity status from traffic sequences, models those characteristics with machine learning algorithms, and identifies traffic according to the category of user activity. The experimental results show that classifying by user activity describes traffic in more detail. With the same machine learning algorithm, identification based on characteristics of activity status is more accurate than traditional methods.
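The vector quantization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two per-packet features (size and inter-arrival time), the three-entry `CODEBOOK`, and the helper names `quantize` and `bag_of_codewords` are all assumptions made for illustration. The sketch maps each packet to its nearest codeword and counts codeword occurrences per flow, which is the kind of "document" a topic model could then consume.

```python
from collections import Counter
from math import dist

# Hypothetical codebook: each codeword is (packet size in bytes, inter-arrival time in ms).
# In practice a codebook would be learned from data (e.g. by clustering); these values are made up.
CODEBOOK = [(60.0, 5.0), (600.0, 20.0), (1400.0, 1.0)]


def quantize(flow, codebook):
    """Map each packet feature vector to the index of its nearest codeword (Euclidean distance)."""
    return [
        min(range(len(codebook)), key=lambda k: dist(packet, codebook[k]))
        for packet in flow
    ]


def bag_of_codewords(symbols, codebook_size):
    """Count how often each codeword index occurs in a quantized flow."""
    counts = Counter(symbols)
    return [counts.get(k, 0) for k in range(codebook_size)]


# A made-up flow of four packets, using the same two assumed features.
flow = [(64.0, 4.0), (1380.0, 2.0), (1420.0, 1.0), (580.0, 25.0)]

symbols = quantize(flow, CODEBOOK)                 # codeword index per packet
counts = bag_of_codewords(symbols, len(CODEBOOK))  # count vector for the flow

print(symbols)
print(counts)
```

The features here are left unscaled, so packet size dominates the distance; a real pipeline would likely normalize features before quantizing. The resulting count vectors could be passed to a topic model and then to a classifier, as the abstract outlines.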
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2015, No. 2, pp. 560-564, 578 (6 pages)
Application Research of Computers
Keywords
traffic classification
user activity
characteristics of activity status
topic model