Abstract
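Users who query common search engines receive a massive, disordered set of web pages and struggle to find, quickly and accurately, the information they actually care about. Starting from the interests of Internet users, an intelligent search system based on an approximate web page clustering algorithm is designed. When a user retrieves information through a common search engine, the system removes the duplicate pages that the engine returns, clusters the remaining pages, and returns the resulting page clusters, so that the user can choose to browse only the pages of interest, which greatly improves retrieval precision. Experiments show that the system greatly improves search efficiency while maintaining recall and precision.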
The Internet has become the main source of information, and people retrieve relevant information from it with existing search engines. However, the returned information is usually massive and disordered, and users have difficulty finding what really concerns them. An intelligent search system built on common search engines is designed around users' interests, using an approximate page clustering algorithm. The detailed design process and the system structure are presented. The data source is the result set returned by common search engines. Experiments show that the system greatly improves search efficiency while ensuring recall and precision.
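The pipeline named in the abstract (drop duplicate result pages, cluster the remainder, return the clusters) can be illustrated with a minimal sketch. The abstract does not specify the algorithm, so everything here is an assumption made for illustration: word-shingle Jaccard similarity for duplicate detection, greedy single-pass clustering on term overlap, the function names, and both thresholds. It is not the authors' method.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a page's text (assumed representation)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def remove_near_duplicates(pages, threshold=0.8):
    """Keep a page only if it is not too similar to any page already kept."""
    kept, kept_shingles = [], []
    for page in pages:
        s = shingles(page)
        if all(jaccard(s, t) < threshold for t in kept_shingles):
            kept.append(page)
            kept_shingles.append(s)
    return kept


def cluster_pages(pages, threshold=0.2):
    """Greedy clustering: join the first cluster whose seed page shares enough terms."""
    clusters = []  # each entry: (term set of the seed page, member pages)
    for page in pages:
        terms = set(page.lower().split())
        for seed_terms, members in clusters:
            if jaccard(terms, seed_terms) >= threshold:
                members.append(page)
                break
        else:
            clusters.append((terms, [page]))
    return [members for _, members in clusters]


def search_and_cluster(pages):
    """Hypothetical end-to-end step applied to a search engine's result texts."""
    return cluster_pages(remove_near_duplicates(pages))
```

A real system would work on fetched page content and would likely use weighted term vectors rather than raw word sets; the sketch only shows the order of the two stages.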
Source
《微计算机应用》
2007, No. 2, pp. 166-169 (4 pages)
Microcomputer Applications
Funding
Key Project of Science and Technology Research, Ministry of Education (教技司2001224)
Keywords
Web information search
Intelligent search system
Approximate web page clustering