Abstract
Taking the characteristics of factor data into account, this paper adopts a Naive Bayes classifier that assumes each continuous attribute follows a normal distribution, and studies how classification performance changes between the data sets before and after dimension reduction by factor analysis. Experimental results show that the Kaiser-Meyer-Olkin (KMO) statistic and the variable communalities of the factor analysis are closely related to classification performance. When the KMO statistic is greater than 0.8 and only a few attributes have communalities below 80%, factor analysis is an appropriate dimension reduction method to apply before classification.
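The suitability check described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how factors are extracted, so principal-component extraction from the correlation matrix is assumed here, and the function names and the `max_low_fraction` parameter (a stand-in for "only a few attributes") are hypothetical.

```python
import numpy as np


def kmo_statistic(X):
    """Overall Kaiser-Meyer-Olkin measure for data X (rows are samples)."""
    R = np.corrcoef(X, rowvar=False)
    R_inv = np.linalg.inv(R)
    # Partial correlations are obtained from the inverse correlation matrix.
    d = np.sqrt(np.diag(R_inv))
    P = -R_inv / np.outer(d, d)
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)
    p2 = np.sum(P[off_diag] ** 2)
    return r2 / (r2 + p2)


def communalities(X, n_factors):
    """Communality of each variable, ASSUMING principal-component extraction."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    top = np.argsort(eigvals)[::-1][:n_factors]
    # Loadings are eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
    return np.sum(loadings ** 2, axis=1)


def factor_analysis_suitable(X, n_factors, kmo_min=0.8, comm_min=0.8,
                             max_low_fraction=0.2):
    """Rule of thumb from the abstract: KMO > 0.8 and few communalities < 80%.

    max_low_fraction is a hypothetical cut-off; the abstract does not
    quantify how many low communalities count as "only a few".
    """
    kmo = kmo_statistic(X)
    low_fraction = np.mean(communalities(X, n_factors) < comm_min)
    return bool(kmo > kmo_min and low_fraction <= max_low_fraction)
```

If the check passes, the factor scores could then be passed to a Gaussian Naive Bayes classifier in place of the original attributes, which matches the comparison the abstract describes.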
Source
《中北大学学报(自然科学版)》
EI
CAS
2007, No. 6, pp. 556-561 (6 pages)
Journal of North University of China (Natural Science Edition)
Funding
National Natural Science Foundation of China (60503017)
Natural Science Foundation of Shanxi Province (20051046)
Keywords
factor analysis
classification
Naive Bayes
dimension reduction