Abstract
To address the large average percentage classification error of traditional methods on multi-component big data, a classification method based on the rotation forest algorithm is proposed. Related techniques, including the Markov chain Monte Carlo (MCMC) method and the rotation forest method, are first analyzed. The classification of multi-component big data is then treated as a probability density estimation problem: the MCMC method is used to compute probability distribution statistics for the data, kernel density estimation is used to make a prior assumption about the probability model of the data, and the rotation forest method is introduced to classify the data. Finally, comparative experiments are conducted under normal and abnormal conditions of heterogeneous networks. The experimental results show that the improved classification method achieves high classification accuracy under the normal condition of heterogeneous networks.
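The density-estimation view of classification mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it omits both the MCMC step and the rotation forest, and only shows how class-conditional Gaussian kernel density estimates combined with class priors yield a classifier. The function names, the bandwidth value, the class labels "A" and "B", and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function using an isotropic Gaussian kernel."""
    dim = len(samples[0])
    # Normalizing constant: n * (h * sqrt(2*pi))^d
    norm = len(samples) * (bandwidth * math.sqrt(2.0 * math.pi)) ** dim

    def density(x):
        total = 0.0
        for s in samples:
            sq_dist = sum((xi - si) ** 2 for xi, si in zip(x, s))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        return total / norm

    return density


def fit_kde_classifier(points, labels, bandwidth=0.5):
    """Fit one KDE per class; predict by the largest prior * density."""
    by_class = {}
    for p, y in zip(points, labels):
        by_class.setdefault(y, []).append(p)
    n = len(points)
    priors = {y: len(ps) / n for y, ps in by_class.items()}
    densities = {y: gaussian_kde(ps, bandwidth) for y, ps in by_class.items()}

    def predict(x):
        return max(by_class, key=lambda y: priors[y] * densities[y](x))

    return predict


# Hypothetical toy data: two well-separated two-dimensional clusters.
points = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0),
          (3.0, 3.1), (2.9, 2.8), (3.2, 3.0)]
labels = ["A", "A", "A", "B", "B", "B"]
predict = fit_kde_classifier(points, labels)
```

Calling `predict` on a new point returns the label whose prior-weighted density estimate is largest at that point. The bandwidth controls how smooth each estimate is and would normally be tuned on held-out data.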
Author
DONG Jie (董洁), College of Computer and Information Engineering, Chifeng University, Chifeng 024000, China
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal
2019, No. 18, pp. 164-167 (4 pages)
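Funding
Natural Science Foundation of Inner Mongolia Autonomous Region (2016MS0618)
Higher Education Scientific Research Project of Inner Mongolia Autonomous Region (NJZY18208)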
Keywords
heterogeneous network
multi-component
big data classification
probability density
contrastive analysis
prior assumption