Abstract
This paper proposes a multi-label text classification algorithm based on hyper-ellipsoids. For each class, the smallest hyper-ellipsoid enclosing the samples of that class is constructed in feature space, so that the classes are separated from one another by their hyper-ellipsoids. For a sample to be classified, its class is determined by checking whether it lies inside a hyper-ellipsoid. If no hyper-ellipsoid encloses the sample, its class is determined by its membership degree. Experiments on the standard Reuters 21578 data set show that the method improves both classification precision and classification speed compared with the hyper-sphere method.
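The decision rule described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the smallest hyper-ellipsoid is computed or how membership is defined, so the sketch substitutes a covariance-shaped ellipsoid stretched just enough to contain every training sample (which is generally not the minimum-volume one), and it assumes a membership of the form 1 / (1 + scaled distance). The names `fit_ellipsoid`, `scaled_distance`, `membership` and `classify` are hypothetical.

```python
import numpy as np


def fit_ellipsoid(X, reg=1e-6):
    """Fit an ellipsoid that contains every row of X.

    Assumption: centre = sample mean, shape = regularised covariance,
    squared radius = largest squared Mahalanobis distance in X.
    This is a simple stand-in, not the minimum-volume ellipsoid.
    """
    X = np.asarray(X, dtype=float)
    center = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + reg * np.eye(X.shape[1])
    inv_cov = np.linalg.inv(cov)
    diff = X - center
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return center, inv_cov, float(d2.max())


def scaled_distance(x, ellipsoid):
    """Squared Mahalanobis distance divided by the squared radius.

    Values <= 1 mean x lies inside (or on) the ellipsoid.
    """
    center, inv_cov, radius2 = ellipsoid
    diff = np.asarray(x, dtype=float) - center
    return float(diff @ inv_cov @ diff) / radius2


def membership(x, ellipsoid):
    """Assumed membership degree: decreases as x moves away from the class."""
    return 1.0 / (1.0 + scaled_distance(x, ellipsoid))


def classify(x, ellipsoids):
    """Return every class whose ellipsoid contains x (multi-label).

    If no ellipsoid contains x, fall back to the single class with the
    highest assumed membership degree.
    """
    inside = [c for c, e in ellipsoids.items() if scaled_distance(x, e) <= 1.0]
    if inside:
        return sorted(inside)
    return [max(ellipsoids, key=lambda c: membership(x, ellipsoids[c]))]


# Toy 2-D data standing in for document feature vectors.
train = {
    "A": [[0, 0], [1, 0], [0, 1], [1, 1]],
    "B": [[5, 5], [6, 5], [5, 6], [6, 6]],
    "C": [[0, 0], [2, 0], [0, 2], [2, 2]],  # overlaps class A
}
ellipsoids = {label: fit_ellipsoid(X) for label, X in train.items()}
```

With this toy data, a point inside the overlap of two ellipsoids receives both labels, while a point outside all of them is assigned by the membership fallback.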
Source
《计算机科学》
CSCD
Peking University Core Journal
2011, No. 11, pp. 204-205, 224 (3 pages)
Computer Science
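Funding
National Natural Science Foundation of China (60603023)
Special Research Project of the National Major Basic Research Program (973 Program) (2001CCA00700)
Key Laboratory Project of the Education Department of Liaoning Province (LS2010180)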
Keywords
Hyper-ellipsoid
Multi-label classification
Scaling (extension) factor
Membership degree