Abstract
In recent years, machine learning methods have been widely applied to network traffic classification. Combining characteristics of supervised and unsupervised learning, this paper proposes a traffic classification method based on semi-supervised learning. The method improves how the initial cluster centers of the K-means clustering algorithm are selected: a similarity function based on a density factor is used to satisfy the global consistency requirement of the clustered data and so obtain more suitable initial cluster centers. The clustering results are then labeled by maximum likelihood estimation, which maps each cluster to its corresponding application type or protocol. Experimental results show that the algorithm improves both the accuracy and the efficiency of network traffic classification and can effectively meet the application requirements of traffic classification.
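The pipeline outlined in the abstract (density-aware seeding, K-means, then labeling clusters from a few labeled flows) might be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the density-factor similarity function, so the sketch substitutes a simple neighbor count within a hypothetical `radius`, and it reads the maximum likelihood step as picking, for each cluster, the label with the highest observed frequency among its labeled members. All function and parameter names are made up for this example.

```python
import numpy as np

def density_seeded_centers(X, k, radius):
    """Pick k initial centers, favouring dense points far from earlier picks."""
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # ASSUMPTION: density = number of points within `radius` (not the paper's formula).
    density = (dist < radius).sum(axis=1).astype(float)
    chosen = [int(np.argmax(density))]
    while len(chosen) < k:
        # Distance from each point to its nearest already-chosen center.
        nearest = dist[:, chosen].min(axis=1)
        chosen.append(int(np.argmax(density * nearest)))
    return X[chosen].copy()

def assign_to_centers(X, centers):
    """Index of the nearest center for every row of X."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        assign = assign_to_centers(X, centers)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, assign_to_centers(X, centers)

def label_clusters(assign, y, k, unknown=-1):
    """Map each cluster to the most frequent label among its labelled flows.

    `y` holds `unknown` for unlabelled flows. The relative label frequency in a
    cluster is the maximum likelihood estimate of P(label | cluster).
    """
    mapping = {}
    for j in range(k):
        labels = y[(assign == j) & (y != unknown)]
        if len(labels) == 0:
            mapping[j] = unknown
        else:
            values, counts = np.unique(labels, return_counts=True)
            mapping[j] = int(values[np.argmax(counts)])
    return mapping
```

A typical call would seed with `density_seeded_centers(X, k, radius)`, pass the result to `kmeans`, and then apply `label_clusters` with the small labeled subset. The cluster count `k` and the `radius` would need tuning on real flow features.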
Source
《电子测量与仪器学报》
CSCD
2014, No. 4, pp. 381-386 (6 pages)
Journal of Electronic Measurement and Instrumentation
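Funding
Key project of the National Science and Technology Support Program of the 11th Five-Year Plan, "Research on Key Supporting Technologies for Regional Economic and Trade Cooperation and Circulation Promotion in International Trade" (2009BAH46B03)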
Keywords
network traffic
semi-supervised learning
traffic classification
clustering center