Abstract
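Existing search engines return so many results that finding useful information among them is difficult. Based on the idea that clustering Web search results lets users browse them quickly, this paper proposes a Web search result clustering method based on formal concept analysis. Key phrases and non-key phrases are first extracted from the search result set. A formal context is then built from the search result set to the set of key and non-key phrases, and a relatively fast concept lattice construction algorithm generates the concept lattice on this formal context. Each concept in the lattice expresses a theme with a definite meaning and thus yields one cluster of the Web search results: the key or non-key phrases in the concept's intent serve as the cluster label, the search result documents in the concept's extent form the cluster's contents, and the hierarchical relations among clusters are reflected by the hierarchical relations in the concept lattice.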
Organizing Web search results into clusters helps users browse them quickly. This paper presents a method for clustering Web search results based on formal concept analysis. For a given query and the documents (typically a list of titles and snippets) returned by a Web search engine, the method first extracts key phrases and general phrases as the attributes of a formal context and takes the titles and snippets as its objects. A concept lattice is then constructed on this many-continuous-valued formal context using an algorithm of low time complexity. Since each formal concept in the lattice represents a definite theme, each concept can serve as one cluster of the Web search results.
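The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the record does not specify the fast lattice construction method, so a naive closure of object intents under intersection is used instead, and the formal context is assumed to be binary (a phrase either occurs in a snippet or it does not). The sample snippets, the phrase list, and the function names `build_context`, `extent_of` and `concepts` are all hypothetical.

```python
def build_context(docs, phrases):
    """Binary formal context: document id -> set of phrases found in its text.

    Assumption: plain lowercase substring matching stands in for the
    paper's phrase extraction step, which the abstract does not detail.
    """
    return {
        doc_id: {p for p in phrases if p in text.lower()}
        for doc_id, text in docs.items()
    }


def extent_of(context, intent):
    """All documents that carry every phrase in `intent`."""
    return frozenset(d for d, attrs in context.items() if intent <= attrs)


def concepts(context, phrases):
    """Enumerate all formal concepts as (extent, intent) pairs.

    The intents of a context are exactly the intersections of object rows,
    together with the full attribute set, so they are collected by
    intersecting each row with every intent found so far (naive method).
    """
    intents = {frozenset(phrases)}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & i for i in intents}
    pairs = [(extent_of(context, i), i) for i in intents]
    # Largest clusters first; ties are broken by the sorted label.
    return sorted(pairs, key=lambda c: (-len(c[0]), sorted(c[1])))


# Hypothetical search snippets and candidate phrases.
docs = {
    "d1": "Jaguar car dealer and jaguar car prices",
    "d2": "Jaguar car reviews",
    "d3": "Jaguar big cat habitat",
    "d4": "Big cat conservation",
}
phrases = ["jaguar", "car", "big cat"]

context = build_context(docs, phrases)
for extent, intent in concepts(context, phrases):
    # The intent is the cluster label, the extent is the cluster's contents.
    print(sorted(intent), "->", sorted(extent))
```

In this sketch one concept is a sub-cluster of another exactly when its extent is a subset of the other's extent, which is how the lattice order gives the cluster hierarchy mentioned in the abstract.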
Source
《西华大学学报(自然科学版)》
CAS
2013, Issue 3, pp. 54-58 (5 pages)
Journal of Xihua University:Natural Science Edition
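Funding
National Natural Science Foundation of China (61271413)
Ministry of Education "Chunhui Program" project (12226531)
Sichuan Provincial Department of Personnel training program for academic and technical leaders (12226463)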
Keywords
Web search result organization
document clustering
formal concept analysis