Abstract
This paper studies the text classification problem. Traditional methods sort text information with low efficiency and accuracy. To overcome interference and improve the efficiency and accuracy of text sorting, a text-classification simulation algorithm based on quadratic fuzzy clustering is proposed. The transitive closure method is first used to obtain an initial classification of the source texts, which yields the initial partition matrix. The attribute-related data of the texts are then computed iteratively using unequal weighting factors for the feature indices, so that the sorting results come closer to the actual situation. Simulation results show that the proposed algorithm effectively improves the efficiency and accuracy of text classification and has high practical value for the rapid, intelligent sorting of massive text information.
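The abstract does not give the details of the procedure, so the following is only a minimal sketch of the textbook transitive-closure step in fuzzy clustering, not the authors' implementation. It assumes a reflexive, symmetric fuzzy similarity matrix between texts, closes it under max-min composition, and cuts the result at a threshold to obtain an initial grouping. The function names, the example matrix, and the threshold value are all hypothetical; the second, weighted iterative stage described in the abstract is not shown.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n))
             for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Square r under max-min composition until it stops changing.

    Assumes r is reflexive and symmetric, so the sequence converges.
    """
    current = [row[:] for row in r]
    while True:
        squared = max_min_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_classes(closure, lam):
    """Group items i and j together whenever closure[i][j] >= lam."""
    classes, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        assigned.update(group)
        classes.append(group)
    return classes


# Hypothetical similarity matrix for three texts (illustrative values only).
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
closure = transitive_closure(similarity)
initial_classes = lambda_cut_classes(closure, 0.5)
```

With this illustrative matrix and a threshold of 0.5, texts 0 and 1 fall into one class and text 2 into another. Lowering the threshold merges classes, which is how the cut level controls the granularity of the initial partition.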
Source
《计算机仿真》
CSCD
Peking University Core Journals (北大核心)
2010, Issue 11, pp. 165-167, 344 (4 pages in total)
Computer Simulation
Keywords
Quadratic fuzzy clustering
Text information
Simulation algorithm
Intelligent sorting