Abstract
Full-text retrieval can make it more efficient to find specific information in massive data sets, but traditional retrieval techniques consume a great deal of system resources. P2P file-sharing software such as Emule and BT can locate files and download them at high speed, but it is weak at parsing Chinese documents in multiple formats and at extracting keywords from them, and its network routing suffers from hot-spot effects. This paper proposes a full-text retrieval system based on a P2P distributed network and describes the system's overall architecture, key techniques, and implementation. Practice shows that the system effectively solves these problems.
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2010, No. 10, pp. 70-72, 77 (4 pages)
Funding
National Natural Science Foundation of China, No. 60573145
Guangzhou Science and Technology Key Project Program, No. 2007J1-C0401
Keywords
full-text retrieval
peer-to-peer (P2P) networks
word segmentation algorithm
two-level routing