朴素贝叶斯分类器是机器学习领域中一种重要的分类算法,根据该算法的前提,利用Foley-Sammon变换算法进行特征提取,提出了一种基于Foley-Sammon变换的朴素贝叶斯分类器NBFST(Na ve Bayesian Classifier with Foley-Sammon Transform).结...朴素贝叶斯分类器是机器学习领域中一种重要的分类算法,根据该算法的前提,利用Foley-Sammon变换算法进行特征提取,提出了一种基于Foley-Sammon变换的朴素贝叶斯分类器NBFST(Na ve Bayesian Classifier with Foley-Sammon Transform).结果表明,NBFST能够在大多数数据集上具有较高的分类准确率.展开更多
Foley-Sammon linear discriminant analysis (FSLDA) and uncorrelated linear discriminant analysis (ULDA) are two well-known kinds of linear discriminant analysis. Both ULDA and FSLDA search the kth discriminant vector i...Foley-Sammon linear discriminant analysis (FSLDA) and uncorrelated linear discriminant analysis (ULDA) are two well-known kinds of linear discriminant analysis. Both ULDA and FSLDA search the kth discriminant vector in an n-k+1 dimensional subspace, while they are subject to their respective constraints. Evidenced by strict demonstration, it is clear that in essence ULDA vectors are the covariance-orthogonal vectors of the corresponding eigen-equation. So, the algorithms for the covariance-orthogonal vectors are equivalent to the original algorithm of ULDA, which is time-consuming. Also, it is first revealed that the Fisher criterion value of each FSLDA vector must be not less than that of the corresponding ULDA vector by theory analysis. For a discriminant vector, the larger its Fisher criterion value is, the more powerful in discriminability it is. So, for FSLDA vectors, corresponding to larger Fisher criterion values is an advantage. On the other hand, in general any two feature components extracted by FSLDA vectors are statistically correlated with each other, which may make the discriminant vectors set at a disadvantageous position. In contrast to FSLDA vectors, any two feature components extracted by ULDA vectors are statistically uncorrelated with each other. Two experiments on CENPARMI handwritten numeral database and ORL database are performed. The experimental results are consistent with the theory analysis on Fisher criterion values of ULDA vectors and FSLDA vectors. The experiments also show that the equivalent algorithm of ULDA, presented in this paper, is much more efficient than the original algorithm of ULDA, as the theory analysis expects. Moreover, it appears that if there is high statistical correlation between feature components extracted by FSLDA vectors, FSLDA will not perform well, in spite of larger Fisher criterion value owned by every FSLDA vector. However, when the average correlation coefficient of feature components extracted by FSLDA vectors is at a low level, the performance of FSLDA are comparable with ULDA.展开更多
文摘朴素贝叶斯分类器是机器学习领域中一种重要的分类算法,根据该算法的前提,利用Foley-Sammon变换算法进行特征提取,提出了一种基于Foley-Sammon变换的朴素贝叶斯分类器NBFST(Na ve Bayesian Classifier with Foley-Sammon Transform).结果表明,NBFST能够在大多数数据集上具有较高的分类准确率.
基金The National Natural Science Foundation of China (Grant No.60472060 ,60473039 and 60472061)
文摘Foley-Sammon linear discriminant analysis (FSLDA) and uncorrelated linear discriminant analysis (ULDA) are two well-known kinds of linear discriminant analysis. Both ULDA and FSLDA search the kth discriminant vector in an n-k+1 dimensional subspace, while they are subject to their respective constraints. Evidenced by strict demonstration, it is clear that in essence ULDA vectors are the covariance-orthogonal vectors of the corresponding eigen-equation. So, the algorithms for the covariance-orthogonal vectors are equivalent to the original algorithm of ULDA, which is time-consuming. Also, it is first revealed that the Fisher criterion value of each FSLDA vector must be not less than that of the corresponding ULDA vector by theory analysis. For a discriminant vector, the larger its Fisher criterion value is, the more powerful in discriminability it is. So, for FSLDA vectors, corresponding to larger Fisher criterion values is an advantage. On the other hand, in general any two feature components extracted by FSLDA vectors are statistically correlated with each other, which may make the discriminant vectors set at a disadvantageous position. In contrast to FSLDA vectors, any two feature components extracted by ULDA vectors are statistically uncorrelated with each other. Two experiments on CENPARMI handwritten numeral database and ORL database are performed. The experimental results are consistent with the theory analysis on Fisher criterion values of ULDA vectors and FSLDA vectors. The experiments also show that the equivalent algorithm of ULDA, presented in this paper, is much more efficient than the original algorithm of ULDA, as the theory analysis expects. Moreover, it appears that if there is high statistical correlation between feature components extracted by FSLDA vectors, FSLDA will not perform well, in spite of larger Fisher criterion value owned by every FSLDA vector. However, when the average correlation coefficient of feature components extracted by FSLDA vectors is at a low level, the performance of FSLDA are comparable with ULDA.