Abstract
Suffix Tree Clustering (STC) has been applied to web page clustering problems. To cluster Uighur web pages with the STC algorithm, this paper analyzes the characteristics of the generalized suffix tree and of the Uighur language, and designs a construction algorithm for a Uighur generalized suffix tree. Experimental results show that the method constructs the Uighur suffix tree in linear time and that the resulting tree can be used to cluster Uighur web pages.
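To illustrate the idea behind the abstract, the following is a minimal sketch of how a word-level generalized suffix structure can expose phrases shared across documents, which STC uses as base clusters. It is not the paper's algorithm: it builds a naive suffix trie in quadratic time rather than a linear-time suffix tree, it assumes words can be separated by whitespace, and it applies no Uighur-specific processing. The function names `build_word_suffix_trie` and `base_clusters` and the `min_docs` parameter are hypothetical, chosen only for this example.

```python
def build_word_suffix_trie(docs):
    """Naive word-level generalized suffix trie (quadratic time, illustration only).

    Each node stores its children keyed by word and the ids of the
    documents that contain the phrase spelled out on the path to it.
    """
    root = {"children": {}, "docs": set()}
    for doc_id, text in enumerate(docs):
        words = text.split()  # assumption: whitespace-delimited words
        for start in range(len(words)):
            node = root
            # Insert the suffix that begins at position `start`.
            for word in words[start:]:
                node = node["children"].setdefault(
                    word, {"children": {}, "docs": set()}
                )
                node["docs"].add(doc_id)
    return root


def base_clusters(root, min_docs=2):
    """Return {phrase: set of doc ids} for phrases shared by >= min_docs documents."""
    clusters = {}
    stack = [((), root)]
    while stack:
        phrase, node = stack.pop()
        for word, child in node["children"].items():
            # A longer phrase can only be shared if its prefix is shared,
            # so branches below the threshold are not explored further.
            if len(child["docs"]) >= min_docs:
                new_phrase = phrase + (word,)
                clusters[" ".join(new_phrase)] = set(child["docs"])
                stack.append((new_phrase, child))
    return clusters


docs = ["cat ate cheese", "mouse ate cheese too", "cat ate mouse too"]
clusters = base_clusters(build_word_suffix_trie(docs))
```

Here `clusters` maps each shared phrase to the documents containing it, for example `"ate cheese"` to documents 0 and 1. A full STC pipeline would then score and merge these base clusters, which this sketch leaves out.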
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
2013, Issue 8, pp. 9-11, 16 (4 pages in total)
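Funding
National Natural Science Foundation of China (No. 61262063, No. 61142004)
Open Project of the Xinjiang Key Laboratory of Multiple Languages (新疆多种语种重点实验室开放课题, No. 049807)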