Abstract
This paper studies the automatic clustering of text objects under the indiscernibility relation. The text collection is first converted into a Boolean text information system that a machine can process. An indiscernibility relation between objects is then defined on this information system, which provides the theoretical basis for clustering. Next, the algorithm is described and verified experimentally. Finally, the time complexity and shortcomings of the algorithm are analyzed, and concrete improvements are proposed. The algorithm has a theoretical foundation and gave good experimental results, which suggests that the method is well suited to practical use.
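The two steps named in the abstract (a Boolean text information system, then grouping by indiscernibility) can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes the standard rough-set definition, in which two objects are indiscernible when they take the same value on every selected attribute. The function names, the toy documents, and the vocabulary are all hypothetical.

```python
from collections import defaultdict


def boolean_information_system(docs, vocabulary):
    """Map each document to a tuple of 0/1 flags, one per vocabulary term.

    Assumption: whitespace tokenization and lowercase matching; a real
    system would use proper word segmentation and feature selection.
    """
    table = {}
    for doc_id, text in docs.items():
        tokens = set(text.lower().split())
        table[doc_id] = tuple(int(term in tokens) for term in vocabulary)
    return table


def indiscernibility_classes(table, attribute_indices):
    """Group objects whose values agree on every selected attribute.

    Each group is one equivalence class of the indiscernibility relation
    and is treated here as one cluster.
    """
    attribute_indices = list(attribute_indices)
    classes = defaultdict(list)
    for doc_id, row in table.items():
        key = tuple(row[i] for i in attribute_indices)
        classes[key].append(doc_id)
    return sorted(sorted(members) for members in classes.values())


# Hypothetical toy data, for illustration only.
docs = {
    "d1": "rough set clustering",
    "d2": "clustering with rough set theory",
    "d3": "network routing protocol",
    "d4": "routing in a wireless network",
}
vocabulary = ["rough", "clustering", "network", "routing"]

table = boolean_information_system(docs, vocabulary)
clusters = indiscernibility_classes(table, range(len(vocabulary)))
print(clusters)
```

In this sketch a single pass over the table builds the classes, and choosing a smaller set of attribute indices yields coarser clusters, because fewer attributes are available to tell documents apart.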
Source
Computer Systems & Applications (《计算机系统应用》)
2012, No. 12, pp. 190-192 (3 pages)
Keywords
text information system
indiscernibility relation
automatic text clustering