Abstract
To address the limited performance of result clustering when a search engine retrieves large-scale data, an improved K-Means algorithm assisted by query logs is proposed. Traditional K-Means clustering is extended to a multi-level form that clusters over both the search objects and the search results. Query logs are introduced to improve clustering quality, so that the results pushed to the user are highly relevant. Experimental results show that result clustering based on this algorithm achieves relatively high accuracy and that the time overhead of the retrieval process is low. Taking efficiency and accuracy together, the algorithm is presented as an ideal method for search result clustering.
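The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that each search result has a content feature vector plus a vector of click counts taken from a query log, that the log signal is simply appended to the content features with a weight, and that "multi-level" means re-clustering inside each top-level cluster. The names `augment`, `two_layer`, and `log_weight`, and all of the data, are hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Lloyd's K-Means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from every center chosen so far.
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def augment(content_vecs, click_vecs, log_weight=0.5):
    """Assumption: append weighted query-log click counts to content features."""
    return [list(c) + [log_weight * x for x in clicks]
            for c, clicks in zip(content_vecs, click_vecs)]


def two_layer(points, k_top, k_sub):
    """Assumption: 'multi-level' = K-Means again inside each top-level cluster."""
    top = kmeans(points, k_top)
    result = [None] * len(points)
    for c in range(k_top):
        idx = [i for i, lab in enumerate(top) if lab == c]
        if len(idx) >= k_sub:
            sub = kmeans([points[i] for i in idx], k_sub)
        else:
            sub = [0] * len(idx)  # too few members to split further
        for i, s in zip(idx, sub):
            result[i] = (c, s)
    return result


# Hypothetical data: four search results, two content features each,
# and click counts for two past queries taken from a query log.
content = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clicks = [[3, 0], [2, 0], [0, 4], [0, 5]]

features = augment(content, clicks)
labels = kmeans(features, 2)
hierarchy = two_layer(features, 2, 2)
print(labels)
print(hierarchy)
```

In this toy data the click counts reinforce the content features, so the first two results fall into one top-level cluster and the last two into the other. How the paper actually weights the log signal and defines its layers is not stated in the abstract.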
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2017, No. 4, pp. 1067-1070, 1080 (5 pages in total)
Computer Engineering and Design