P2P traffic identification based on semi-supervised affinity propagation clustering
Cited by: 6
Abstract: To identify P2P traffic accurately from only a small number of labeled samples, this paper proposes a P2P traffic identification method based on semi-supervised affinity propagation (AP) clustering. First, a small number of samples are labeled. Second, labeled and unlabeled samples are assigned different "preference" values during clustering, so that labeled samples are more likely than unlabeled ones to become exemplars. Third, all samples are clustered through weighted message-passing updates between samples. Finally, P2P traffic is identified by applying the preset "label-category" mapping rules to the clustering result. The influence of the preference parameter and of weighted message passing on identification performance was examined. Experimental results show that with 5% of the samples labeled, the true positive rate (TPrate) was above 90% and the false positive rate (FPrate) was below 3%; once the proportion of labeled samples reached 15%, the TPrate was above 95%, with a maximum of 98%, and the FPrate was below 1%. Identification performance improved as the proportion of labeled samples increased.
Authors: Yu Ming (于明), Zhu Chao (朱超)
Source: Journal of Harbin Engineering University (《哈尔滨工程大学学报》), indexed in EI, CAS, CSCD, and the PKU Core list; 2013, No. 5, pp. 653-657, 661 (6 pages)
Funding: National Natural Science Foundation of China (61172059); Liaoning Province Doctoral Start-up Fund (20111022)
Keywords: P2P traffic identification; semi-supervised clustering; affinity propagation; machine learning; network security
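The four steps in the abstract (label a few samples, give labeled and unlabeled samples different preferences, cluster by message passing, then map clusters to traffic classes) can be illustrated with a small sketch. It is not the authors' implementation: the record gives neither the preference values nor the weighting scheme or the mapping rule, so the code below assumes a preference of 0 for labeled samples, the minimum pairwise similarity for unlabeled samples, plain (unweighted) damped affinity propagation in place of the paper's weighted update, and a hypothetical mapping rule in which a cluster takes its exemplar's label, falling back to a majority vote of its labeled members. The one-dimensional feature values and all function names are made up for illustration.

```python
from collections import Counter


def build_similarity(X, labeled):
    """Negative squared Euclidean similarity; the diagonal holds the preferences."""
    n = len(X)
    S = [[-sum((a - b) ** 2 for a, b in zip(X[i], X[k])) for k in range(n)]
         for i in range(n)]
    pref_unlabeled = min(min(row) for row in S)  # assumed: lowest similarity
    pref_labeled = 0.0                           # assumed: highest possible similarity
    for i in range(n):
        S[i][i] = pref_labeled if i in labeled else pref_unlabeled
    return S


def affinity_propagation(S, max_iter=200, damping=0.5):
    """Plain damped AP (no message weighting). Returns each sample's exemplar index."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(max_iter):
        for i in range(n):
            for k in range(n):
                best = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive support from all other samples
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [-1] * n
    assign = [max(exemplars, key=lambda k: S[i][k]) for i in range(n)]
    for k in exemplars:
        assign[k] = k  # an exemplar always represents itself
    return assign


def map_clusters(assign, labeled):
    """Hypothetical label-category rule: exemplar label, else majority vote."""
    clusters = {}
    for i, k in enumerate(assign):
        clusters.setdefault(k, []).append(i)
    mapping = {}
    for k, members in clusters.items():
        if k in labeled:
            mapping[k] = labeled[k]
        else:
            votes = Counter(labeled[i] for i in members if i in labeled)
            mapping[k] = votes.most_common(1)[0][0] if votes else "unknown"
    return [mapping[k] for k in assign]


# Toy flows described by one made-up feature; two of the six are labeled.
X = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
labeled = {1: "P2P", 4: "non-P2P"}

assign = affinity_propagation(build_similarity(X, labeled))
predicted = map_clusters(assign, labeled)
print(assign)
print(predicted)
```

In this toy setting the two labeled samples end up as the only exemplars, and every unlabeled flow inherits the class of the nearer one. With a preference of 0 every labeled sample necessarily becomes an exemplar; a milder gap between the two preference values would only bias the choice toward labeled samples, which is closer to the "more likely" wording of the abstract.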