Support Vector Machine and Random Forest Modeling for Intrusion Detection System (IDS) Cited by: 17

Abstract: Building a successful Intrusion Detection System (IDS) is a complicated problem because of its nonlinearity and because network traffic data streams carry many quantitative and qualitative features. To address this problem, several types of intrusion detection methods have been proposed, and they show different levels of accuracy. This is why the choice of an effective and robust method for IDS is a very important topic in information security. In this work, we have built two classification models: one based on Support Vector Machines (SVM) and the other on Random Forests (RF). Experimental results show that both classifiers are effective. SVM is slightly more accurate but more expensive in terms of time. RF produces similar accuracy much faster when given modeling parameters. These classifiers can contribute to an IDS as one source of analysis and increase its accuracy. In this paper, the KDD’99 dataset is used to find out which of the two is the better intrusion detector for this dataset. Statistical analysis of the KDD’99 dataset found important issues that highly affect the performance of evaluated systems and result in a very poor evaluation of anomaly detection approaches. The most important deficiency in the KDD’99 dataset is the huge number of redundant records. To solve these issues, we have developed a new dataset, KDD99Train+ and KDD99Test+, which does not include any redundant records in either the train set or the test set, so the classifiers will not be biased towards more frequent records. The numbers of records in the train and test sets are now reasonable, which makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. The findings of this paper will be useful for applying SVM and RF in a more meaningful way in order to maximize the performance rate and minimize the false negative rate.
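The abstract's two data-handling ideas, dropping redundant records and scoring by false negative rate, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the helper names `drop_redundant` and `false_negative_rate`, the toy records, and the choice to treat two records as redundant only when every field (including the label) matches exactly are all assumptions made for illustration.

```python
from collections import Counter


def drop_redundant(records):
    """Keep the first occurrence of each record, preserving order.

    Assumption: a record is redundant only if every field, label
    included, matches an earlier record exactly.
    """
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def false_negative_rate(y_true, y_pred, attack_label="attack"):
    """Fraction of true attacks that were predicted as something else."""
    predictions_for_attacks = [
        p for t, p in zip(y_true, y_pred) if t == attack_label
    ]
    if not predictions_for_attacks:
        return 0.0
    missed = sum(1 for p in predictions_for_attacks if p != attack_label)
    return missed / len(predictions_for_attacks)


# Hypothetical toy records: (duration, protocol, service, label).
train = [
    (0, "tcp", "http", "normal"),
    (0, "tcp", "http", "normal"),
    (0, "icmp", "ecr_i", "attack"),
    (0, "icmp", "ecr_i", "attack"),
    (0, "icmp", "ecr_i", "attack"),
    (2, "udp", "domain_u", "normal"),
]
train_unique = drop_redundant(train)

# Label counts before and after show how repeats skew class frequencies.
labels_before = Counter(rec[-1] for rec in train)
labels_after = Counter(rec[-1] for rec in train_unique)
print(len(train), "->", len(train_unique))
print(dict(labels_before), "->", dict(labels_after))

# Hypothetical predictions from some classifier on four test records.
y_true = ["attack", "attack", "normal", "attack"]
y_pred = ["attack", "normal", "normal", "attack"]
fnr = false_negative_rate(y_true, y_pred)
print("false negative rate:", fnr)
```

In this toy set the repeated attack record stops dominating the class counts once duplicates are removed, which is the bias the abstract says the KDD99Train+ and KDD99Test+ sets are meant to avoid. The same deduplication would be applied to the train and test sets separately.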

Authors: Md. Al Mehedi Hasan, Mohammed Nasser, Biprodip Pal, Shamim Ahmad

Affiliations: Department of Computer Science and Engineering; Department of Computer Science and Engineering; Department of Computer Science and Engineering; Department of Statistics

Source: Journal of Intelligent Learning Systems and Applications, 2014, Issue 1, pp. 45-52 (8 pages)

Keywords: Intrusion Detection; KDD’99; SVM; Kernel; Random Forest

Classification code: R73 [Medicine and Health: Oncology]
