Abstract
Bayes classifiers that handle continuous attributes by discretization have several drawbacks: the number of discrete intervals is hard to determine, some prior information cannot be used, and classification accuracy is reduced to some degree. To address these problems, this paper applies probability density estimation techniques to Bayes classification with continuous attributes and studies how to construct Bayes classifiers for continuous attributes directly using parametric, nonparametric, and semiparametric methods. Finally, the advantages and disadvantages of the three construction methods are analyzed, laying a theoretical basis for constructing Bayes classifiers and Bayesian network classifiers with continuous attributes. A computational example shows that the approaches are feasible and effective.
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals
2003, No. 1, pp. 75-78 (4 pages)
Journal of Tsinghua University (Science and Technology)
Funding
National Natural Science Foundation of China (79990580)
National "973" Key Basic Research Project (G1998030414)