This paper proposed an incremental textclustering algorithm based on semantic sequence. Using similarity relation of semantic sequences and calculating the cover of similarity semantic sequences set, the candidate clu...This paper proposed an incremental textclustering algorithm based on semantic sequence. Using similarity relation of semantic sequences and calculating the cover of similarity semantic sequences set, the candidate cluster with minimum entropy overlap value was selected as a result cluster every time in this algorithm. The comparison of experimental results shows that the precision of the algorithm is higher than other algorithms under same conditions and this is obvious especially on long documents set.展开更多
基金Supported by the National Natural Science Funda-tion of China (60173058)
文摘This paper proposed an incremental textclustering algorithm based on semantic sequence. Using similarity relation of semantic sequences and calculating the cover of similarity semantic sequences set, the candidate cluster with minimum entropy overlap value was selected as a result cluster every time in this algorithm. The comparison of experimental results shows that the precision of the algorithm is higher than other algorithms under same conditions and this is obvious especially on long documents set.