Abstract
Most research on search engines based on text clustering offers no solution for in-depth querying of the small clusters that clustering produces. To address this problem, a clustering-based query expansion (query recommendation) algorithm is proposed. The algorithm uses the cluster relationship tree, that is, the hierarchical structure produced by text clustering, to improve the similarity formula. It then extracts topic words from the target cluster, issues a second query with them, and clusters the returned results with the K-median algorithm to expand the original query. All steps run offline so that online computation does not degrade query response time. Experiments verify the effectiveness of the algorithm.
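The K-median step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the document representation, the distance measure, or the initialization, so the sketch assumes numeric feature vectors (for example term weights), Manhattan distance, and the first k points as initial centers. The names `manhattan` and `k_median` are hypothetical.

```python
from statistics import median


def manhattan(a, b):
    """L1 distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_median(points, k, max_iter=100):
    """Cluster points into k groups with coordinate-wise median centers.

    Assumption: the first k points serve as initial centers; the paper's
    own initialization is not stated in the abstract.
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center (L1 distance).
        new_labels = [
            min(range(k), key=lambda c: manhattan(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not change
        labels = new_labels
        # Update step: each center becomes the per-dimension median of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [median(col) for col in zip(*members)]
    return labels, centers


# Toy data standing in for vectors of second-query results (made-up values).
points = [[0, 0], [10, 10], [0, 1], [1, 0], [10, 11], [11, 10]]
labels, centers = k_median(points, 2)
```

Using the median instead of the mean makes each center less sensitive to outlying documents, which is the usual reason for preferring K-median over K-means.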
Source
《计算机工程与应用》
CSCD
2012, No. 3, pp. 129-132 (4 pages)
Computer Engineering and Applications
Keywords
K-median clustering
topic word (keyword) extraction
similarity computation
query expansion (query recommendation)