Abstract
The kddcup99 dataset contains a very large number of network connection records, its feature attributes span wide value ranges, and it covers many decision classes. Preprocessing or mining the original dataset directly is therefore very difficult. Based on an analysis of the kddcup99 data, this paper proposes a new idea: partitioning the dataset into blocks according to the value of the service attribute. This converts one large problem into several smaller ones without distorting the data, and fundamentally addresses the difficulty caused by the size of the dataset.
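The blocking idea can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the standard comma-separated kddcup99 record layout, in which service is the third field. The function name `partition_by_service` and the shortened sample records are hypothetical and used only for illustration; real records carry 41 features plus a label.

```python
import csv
from collections import defaultdict

# Assumption: in the comma-separated kddcup99 layout, the fields start with
# duration, protocol_type, service, so service sits at index 2.
SERVICE_INDEX = 2


def partition_by_service(lines):
    """Group records into blocks keyed by their service value.

    `lines` is any iterable of comma-separated record strings
    (for example, an open file object).
    """
    blocks = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) <= SERVICE_INDEX:
            continue  # skip blank or malformed lines
        blocks[row[SERVICE_INDEX]].append(row)
    return dict(blocks)


# Hypothetical, shortened records for illustration only.
sample = [
    "0,tcp,http,SF,181,5450,normal.",
    "0,icmp,ecr_i,SF,1032,0,smurf.",
    "0,tcp,http,SF,239,486,normal.",
    "0,tcp,private,S0,0,0,neptune.",
]

blocks = partition_by_service(sample)
for service, rows in sorted(blocks.items()):
    print(service, len(rows))
```

Each record lands in exactly one block and is stored unchanged, so the blocks together contain the same records as the original data. That is one way to read the "without distortion" condition in the abstract.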
Source
《计算机应用与软件》
CSCD
Peking University Core Journal (北大核心)
2014, Issue 11, pp. 321-325 (5 pages)
Computer Applications and Software