Abstract
Web search text contains a large amount of uncertain and unstructured content, which makes conventional clustering algorithms difficult to apply. Because the search results are not clustered, the result set becomes very large. This paper proposes a fuzzy-set-based method for clustering analysis of search text. Processing the heterogeneous data with fuzzy techniques improves the time and space complexity of the algorithm, reduces the dimensionality of the text processing, and increases the robustness of the algorithm. The implementation is illustrated with a worked example. A comparison against measured data from other clustering algorithms verifies the accuracy and efficiency of the algorithm.
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2009, No. 33, pp. 135-137 (3 pages)
Funding
National Natural Science Foundation of China, No. 60263005
Keywords
clustering analysis
text mining
fuzzy set
membership function