Traffic Clustering Research Based on the GainRatio Dimensionality Reduction Algorithm (基于GainRatio降维算法的流量聚类研究)

Cited by: 2

Abstract: With the rapid growth of network data traffic, efficient traffic classification techniques are needed for network management, traffic control, and security detection. Traditional port-based and payload-based classification methods have low accuracy, while unsupervised learning approaches often apply only a single clustering algorithm and rarely study how the data itself should be processed. To address these problems, a method is proposed that first reduces the dimensionality of the original data using the GainRatio (information gain ratio) method and then clusters the dimensionality-reduced data. Experimental results show that the proposed method not only effectively improves running efficiency but also, as the number of clusters increases, markedly speeds up convergence to high accuracy.
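The two-stage idea in the abstract (rank features by gain ratio, keep the best ones, then cluster the reduced data) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a labeled sample is available for scoring, since gain ratio is computed against class labels, that continuous features are discretized with equal-width bins, and that plain k-means stands in for the clustering step, which the abstract does not name. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def discretize(column, bins=4):
    """Equal-width binning of a numeric column (an assumed preprocessing step)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in column]

def gain_ratio(column, labels, bins=4):
    """Information gain of the binned feature divided by its split information."""
    binned = discretize(column, bins)
    split_info = entropy(binned)
    if split_info == 0:
        return 0.0
    n = len(labels)
    conditional = 0.0
    for b in set(binned):
        subset = [y for x, y in zip(binned, labels) if x == b]
        conditional += len(subset) / n * entropy(subset)
    return (entropy(labels) - conditional) / split_info

def select_features(rows, labels, keep):
    """Return the indices of the `keep` features with the highest gain ratio."""
    n_features = len(rows[0])
    scores = [gain_ratio([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

# Toy flows: feature 0 separates the two classes, feature 1 is noise.
rows = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0], [8.0, 4.9], [8.3, 1.1], [7.9, 3.1]]
labels = ["web", "web", "web", "p2p", "p2p", "p2p"]
kept = select_features(rows, labels, keep=1)
reduced = [[r[j] for j in kept] for r in rows]
clusters = kmeans(reduced, k=2)
```

On this toy data only the informative feature survives selection, so the clustering step works on one column instead of two. With real traffic features the number of retained features and the bin count would need tuning.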

Authors: GAO Rui (高锐); LIU Beishui (刘北水); LI Dan (李丹); LIU Jie (刘杰); YOU Bo (尤博) (CEPREI, Guangzhou 510610, China)

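Affiliations: Information Security Center, Fifth Electronics Research Institute of the Ministry of Industry and Information Technology; Fifth Electronics Research Institute of the Ministry of Industry and Information Technology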

Source: Electronic Product Reliability and Environmental Testing (《电子产品可靠性与环境试验》), 2020, Issue S02, pp. 51-55 (5 pages)

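Funding: Supported by the 2018 Industrial Transformation and Upgrading Fund Project (Capability Building for Testing and Evaluation of Core Information Coding Algorithms); the Guangzhou Science and Technology Program General Project (201804010316); the National Key R&D Program of China (2019YFC0118800); and the National Key R&D Program of China (2018YFC1201104).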

Keywords: machine learning; traffic clustering; network security; dimensionality reduction; information gain ratio

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

