Abstract
Unlike general data types, big data with mixed attributes is easily disturbed by edge data during classification and extraction, which leads to large classification and extraction errors and poor noise resistance. To address this, a classification and extraction method for mixed-attribute big data based on the naive Bayes algorithm is proposed. The discrete wavelet transform is used to remove noise from the data so that it does not interfere with classification and extraction. Supervised discriminant projection is then applied to reduce the dimensionality of the data, and the preprocessed mixed-attribute data is fed into a naive Bayes classifier, which combines prior knowledge with posterior probability to complete the classification and extraction. Experimental results show that the proposed method has a short running time, a small classification and extraction error, and strong noise resistance, which verifies its effectiveness in application.
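The pipeline summarized in the abstract can be sketched roughly as follows. This is only an illustration under assumptions that the record itself does not state: a one-level Haar wavelet with soft thresholding stands in for the discrete wavelet transform step, the supervised discriminant projection step is omitted, numeric attributes are modeled as Gaussian, and categorical attributes use Laplace-smoothed frequencies. The names `haar_denoise` and `MixedNaiveBayes` and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def haar_denoise(signal, threshold):
    """One-level Haar DWT, soft-threshold the detail coefficients, then invert.

    Assumes len(signal) is even. Wavelet and threshold rule are illustrative.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Soft thresholding: shrink small (noise-dominated) details toward zero.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


class MixedNaiveBayes:
    """Naive Bayes over samples with numeric and categorical attributes."""

    def fit(self, numeric, categorical, labels):
        n = len(labels)
        self.classes = sorted(set(labels))
        self.class_size = {c: labels.count(c) for c in self.classes}
        # Prior knowledge: class frequencies in the training set.
        self.log_prior = {c: math.log(self.class_size[c] / n) for c in self.classes}
        n_cat = len(categorical[0])
        # Number of distinct values per categorical attribute (for smoothing).
        self.n_values = [len({row[j] for row in categorical}) for j in range(n_cat)]
        self.gauss = {}   # class -> [(mean, variance), ...] per numeric attribute
        self.counts = {}  # class -> [Counter, ...] per categorical attribute
        for c in self.classes:
            rows = [i for i, y in enumerate(labels) if y == c]
            stats = []
            for j in range(len(numeric[0])):
                vals = [numeric[i][j] for i in rows]
                mean = sum(vals) / len(vals)
                var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1e-9
                stats.append((mean, var))
            self.gauss[c] = stats
            self.counts[c] = [Counter(categorical[i][j] for i in rows)
                              for j in range(n_cat)]
        return self

    def log_posterior(self, x_num, x_cat):
        """Unnormalized log posterior: log prior plus per-attribute log likelihoods."""
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for v, (mean, var) in zip(x_num, self.gauss[c]):
                score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for j, v in enumerate(x_cat):
                smoothed = (self.counts[c][j][v] + 1) / (self.class_size[c] + self.n_values[j])
                score += math.log(smoothed)
            scores[c] = score
        return scores

    def predict(self, x_num, x_cat):
        scores = self.log_posterior(x_num, x_cat)
        return max(scores, key=scores.get)


# Toy mixed-attribute data: one numeric and one categorical attribute per sample.
numeric = [[1.0], [1.2], [0.8], [3.0], [3.2], [2.8]]
categorical = [["a"], ["a"], ["b"], ["b"], ["b"], ["a"]]
labels = ["low", "low", "low", "high", "high", "high"]

# Denoise the numeric column before fitting (threshold chosen arbitrarily).
column = haar_denoise([row[0] for row in numeric], threshold=0.05)
model = MixedNaiveBayes().fit([[v] for v in column], categorical, labels)
prediction = model.predict([1.1], ["a"])
```

Working in log space keeps the product of many small likelihoods numerically stable, and the class with the largest unnormalized log posterior is returned as the prediction.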
Authors
吴京朋
刘伟
WU Jing-peng; LIU Wei (College of Biomedical Information and Engineering, Hainan Medical University, Haikou, Hainan 571199, China; School of Information and Communication Engineering, Hainan University, Haikou, Hainan 570228, China)
Source
《计算机仿真》 (Computer Simulation)
2024, Issue 2, pp. 517-521 (5 pages)
Keywords
Discrete wavelet transform
Supervised discriminant projection
Local nearest neighbor graph
Prior knowledge
Posterior probability