Abstract
Text classification is the foundation and core of text mining, and building the classifier is its key step. Bayesian networks can be used to construct classifiers with good classification performance. Two classifiers were built in Matlab: a naive Bayesian classifier (NBC) and a tree-augmented naive Bayesian classifier (TANC), the latter constructed using the mutual information (MI) and conditional mutual information (CMI) measures. The classifiers were validated on standard data sets downloaded from UCI. The experimental results show that their performance is, overall, somewhat higher than the figures reported in the literature, which supports the effectiveness and correctness of the classifiers. The classifiers are to be optimized further and applied to text classification.
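The abstract does not spell out how the TANC structure is derived from the CMI measure. As a rough illustration only, the sketch below follows the commonly described tree-augmented naive Bayes construction: weight each attribute pair by the empirical conditional mutual information I(Xi; Xj | C), then grow a maximum-weight spanning tree over the attributes. It is written in Python rather than the Matlab used in the paper, it assumes discrete attributes, and the function names `conditional_mutual_information` and `tan_parents` are hypothetical, not taken from the paper.

```python
from collections import Counter
from math import log


def conditional_mutual_information(xi, xj, c):
    """Empirical I(Xi; Xj | C) in nats for three equal-length discrete sequences."""
    n = len(c)
    n_ijc = Counter(zip(xi, xj, c))   # joint counts of (xi, xj, class)
    n_ic = Counter(zip(xi, c))        # counts of (xi, class)
    n_jc = Counter(zip(xj, c))        # counts of (xj, class)
    n_c = Counter(c)                  # class counts
    cmi = 0.0
    for (a, b, k), count in n_ijc.items():
        # P(a,b|k) / (P(a|k) * P(b|k)) expressed with raw counts
        ratio = count * n_c[k] / (n_ic[(a, k)] * n_jc[(b, k)])
        cmi += (count / n) * log(ratio)
    return cmi


def tan_parents(columns, labels, root=0):
    """Return {attribute index: parent attribute index, or None for the root}.

    Assumption: the tree is a maximum-weight spanning tree over CMI edge
    weights, built with Prim's algorithm and directed away from `root`.
    In the full classifier the class variable is an extra parent of every attribute.
    """
    m = len(columns)
    weight = {
        (i, j): conditional_mutual_information(columns[i], columns[j], labels)
        for i in range(m) for j in range(i + 1, m)
    }
    parent = {root: None}
    while len(parent) < m:
        # Pick the heaviest edge joining an attribute in the tree to one outside it.
        _, i, j = max(
            ((weight[(min(i, j), max(i, j))], i, j)
             for i in parent for j in range(m) if j not in parent),
            key=lambda edge: edge[0],
        )
        parent[j] = i
    return parent


# Toy data: attributes 0 and 1 are identical; attribute 2 is unrelated to both.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
x0 = [0, 0, 1, 1, 0, 0, 1, 1]
x1 = list(x0)
x2 = [0, 1, 0, 1, 0, 1, 0, 1]
tree = tan_parents([x0, x1, x2], labels)
```

On this toy data the strongly dependent pair (attributes 0 and 1) is linked first, which is the behaviour the CMI weighting is meant to produce. Estimating the conditional probability tables and the smoothing scheme are left out of the sketch.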
Source
《微机发展》
2004, No. 9, pp. 33-35, 39 (4 pages in total)
Microcomputer Development