Abstract
This paper presents a new text classification method that combines fuzzy clustering with the Naive Bayes based EM classification algorithm. The method greatly improves the accuracy of the EM classification algorithm and addresses the incompleteness and inaccuracy caused by relying on term (character) matching. First, some keywords are given for each class and used as clustering centers. Then, a bootstrapping process is started from the texts that lie closest to these centers, and those texts are used to train an EM classifier.
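The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the membership function, the selection threshold, or the number of EM iterations, so everything below is an assumption made for illustration: membership is approximated by normalized keyword counts (a simplification, not a full fuzzy clustering algorithm), and the names `seed_memberships`, `m_step`, `e_step`, `classify`, `threshold`, and `iterations` are hypothetical.

```python
import math
from collections import Counter


def seed_memberships(docs, keywords):
    """Assumed stand-in for fuzzy clustering: membership of each text in each
    class is its share of keyword hits; None means no keyword matched."""
    memberships = []
    for doc in docs:
        tokens = doc.split()
        hits = [sum(tokens.count(k) for k in kws) for kws in keywords]
        total = sum(hits)
        memberships.append([h / total for h in hits] if total else None)
    return memberships


def m_step(docs, resp, vocab, n_classes):
    """Estimate Naive Bayes priors and word probabilities (Laplace smoothing)
    from soft class responsibilities."""
    priors = [(1 + sum(r[c] for r in resp)) / (n_classes + len(docs))
              for c in range(n_classes)]
    word_probs = []
    for c in range(n_classes):
        counts = Counter()
        for doc, r in zip(docs, resp):
            for w in doc.split():
                counts[w] += r[c]
        total = sum(counts.values())
        word_probs.append({w: (1 + counts[w]) / (len(vocab) + total)
                           for w in vocab})
    return priors, word_probs


def e_step(docs, priors, word_probs):
    """Compute posterior class responsibilities for every text."""
    resp = []
    for doc in docs:
        logs = [math.log(p) + sum(math.log(wp[w]) for w in doc.split())
                for p, wp in zip(priors, word_probs)]
        top = max(logs)
        exps = [math.exp(v - top) for v in logs]  # subtract max for stability
        norm = sum(exps)
        resp.append([e / norm for e in exps])
    return resp


def classify(docs, keywords, threshold=0.8, iterations=10):
    """Bootstrap from texts near a keyword center, then run EM on all texts."""
    vocab = sorted({w for d in docs for w in d.split()})
    n_classes = len(keywords)
    fuzzy = seed_memberships(docs, keywords)
    # Bootstrapping: only texts with a high membership seed the classifier.
    seeds = [(d, m) for d, m in zip(docs, fuzzy) if m and max(m) >= threshold]
    priors, word_probs = m_step([d for d, _ in seeds], [m for _, m in seeds],
                                vocab, n_classes)
    for _ in range(iterations):
        resp = e_step(docs, priors, word_probs)
        priors, word_probs = m_step(docs, resp, vocab, n_classes)
    resp = e_step(docs, priors, word_probs)
    return [max(range(n_classes), key=lambda c: r[c]) for r in resp]
```

Calling `classify(docs, keywords)` with a list of whitespace-tokenized texts and one keyword list per class returns one class index per text. Texts that contain no keyword at all are still labeled, because the EM iterations propagate word statistics learned from the seed texts; this is the sense in which such a scheme can go beyond plain term matching.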
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
2002, No. 5, pp. 18-21 (4 pages in total)