Abstract
High-quality automatic summarization depends on an optimal sub-topic partition of the document. After reviewing existing sub-topic partition methods, this paper proposes a sub-topic partition method based on S2AFCM. The document is first vectorized paragraph by paragraph; S2AFCM then performs adaptive fuzzy clustering on the paragraph vectors to obtain a membership matrix, from which the optimal sub-topic partition of the document is derived. Experimental results show that the method effectively improves the sub-topic recognition rate.
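The pipeline summarized in the abstract (paragraph vectors, fuzzy clustering, membership matrix, partition) can be illustrated with a minimal sketch. The abstract does not give the details of S2AFCM, so the code below uses plain fuzzy c-means (FCM) with a fixed number of clusters as a stand-in; it is not the authors' algorithm. The term-frequency vectorization, the deterministic initialization, the function names, and the toy paragraphs are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def paragraph_vectors(paragraphs):
    """Unit-length term-frequency vector per paragraph (assumed vectorization)."""
    tokenized = [re.findall(r"\w+", p.lower()) for p in paragraphs]
    vocab = sorted({t for toks in tokenized for t in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [float(counts[t]) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def memberships(vectors, centres, m):
    """FCM membership update: U[i][k] is the degree of paragraph i in cluster k."""
    power = 2.0 / (m - 1.0)
    U = []
    for x in vectors:
        dists = [math.dist(x, v) for v in centres]
        zeros = [k for k, dk in enumerate(dists) if dk == 0.0]
        if zeros:
            # The paragraph coincides with one or more centres: share membership.
            row = [1.0 / len(zeros) if k in zeros else 0.0
                   for k in range(len(centres))]
        else:
            row = [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        U.append(row)
    return U


def update_centres(vectors, U, c, m):
    """FCM centre update: mean of the vectors weighted by membership**m."""
    n, d = len(vectors), len(vectors[0])
    centres = []
    for k in range(c):
        w = [U[i][k] ** m for i in range(n)]
        total = sum(w)
        centres.append(
            [sum(w[i] * vectors[i][j] for i in range(n)) / total for j in range(d)]
        )
    return centres


def fcm(vectors, c, m=2.0, max_iter=100, tol=1e-6):
    """Plain FCM with fixed c (stand-in for S2AFCM); returns the membership matrix."""
    n = len(vectors)
    # Assumed deterministic initialization: evenly spaced paragraphs as centres.
    centres = [list(vectors[(k * n) // c]) for k in range(c)]
    U = memberships(vectors, centres, m)
    for _ in range(max_iter):
        centres = update_centres(vectors, U, c, m)
        new_U = memberships(vectors, centres, m)
        change = max(abs(a - b) for ra, rb in zip(U, new_U) for a, b in zip(ra, rb))
        U = new_U
        if change < tol:
            break
    return U


def hard_labels(U):
    """Assign each paragraph to the cluster with the highest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in U]


# Toy document (hypothetical): two paragraphs on pets, two on markets.
PARAGRAPHS = [
    "cat dog pet cat animal",
    "dog cat pet animal pet",
    "stock market price trade stock",
    "market price stock trade bank",
]
U = fcm(paragraph_vectors(PARAGRAPHS), c=2)
print(hard_labels(U))
```

Consecutive paragraphs that receive the same label can then be read as one sub-topic. An adaptive method would additionally choose the number of clusters rather than take `c` as an input; how S2AFCM does this is not stated in the abstract.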
Source
Computer & Network (《计算机与网络》)
2013, No. 11, pp. 61-63 (3 pages)
Keywords
sub-topic partition
fuzzy c-means (FCM) clustering algorithm
section set adaptive fuzzy c-means (S2AFCM) clustering algorithm
membership matrix