Abstract
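This paper presents PDL, a prototype parallel digital library system, together with the data structures it uses for text information retrieval, including an inverted index, a structure index, a RANK index, and a lexicon. On top of these structures, query algorithms based on content and structure are designed and implemented. The algorithms run in a parallel computer-cluster environment. Experiments show that parallel data querying performs well.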
This paper introduces new data structures, including an inverted index, a structure index, a RANK index, and a lexicon, for document retrieval in a digital library. Based on these structures, new query and maintenance algorithms are designed. All of these algorithms run in a parallel-processor environment. To meet maintenance needs, both an inverted index and a forward index are used. Experiments show that the parallel method performs well.
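As a rough illustration of the idea in the abstract, the sketch below builds one small inverted index per document partition and answers a content-and-structure query over all partitions in parallel. It is not the PDL implementation: the function names, the `(doc_id, section)` posting layout standing in for a structure index, and the use of threads in place of cluster nodes are all assumptions made for this example. The RANK index and the lexicon are not modelled.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_partition_index(docs):
    """Build an inverted index for one partition (hypothetical layout).

    docs maps doc_id -> {section_name: text}. Each posting is a
    (doc_id, section) pair, so the section name plays the role of a
    very simple structure index.
    """
    index = defaultdict(set)
    for doc_id, sections in docs.items():
        for section, text in sections.items():
            for term in text.lower().split():
                index[term].add((doc_id, section))
    return index


def search_partition(index, terms, section=None):
    """Return ids of documents in this partition that contain every term.

    If section is given, each term must occur in that section.
    """
    result = None
    for term in terms:
        hits = {
            doc_id
            for doc_id, sec in index.get(term.lower(), ())
            if section is None or sec == section
        }
        result = hits if result is None else result & hits
    return result or set()


def parallel_search(indexes, terms, section=None):
    """Query all partitions concurrently and merge the partial results.

    Threads stand in for cluster nodes here (an assumption of this sketch).
    """
    with ThreadPoolExecutor(max_workers=max(1, len(indexes))) as pool:
        parts = pool.map(lambda idx: search_partition(idx, terms, section), indexes)
        return sorted(set().union(*parts))


# Two hypothetical partitions of a tiny collection.
partitions = [
    {
        1: {"title": "parallel digital library", "body": "inverted index for text retrieval"},
        2: {"title": "text retrieval", "body": "parallel query on a cluster"},
    },
    {
        3: {"title": "inverted index", "body": "parallel index maintenance"},
    },
]
indexes = [build_partition_index(p) for p in partitions]

print(parallel_search(indexes, ["parallel"]))                   # content-only query
print(parallel_search(indexes, ["parallel"], section="title"))  # content + structure
```

Partitioning by document keeps each partial index independent, so a query needs only a final union of the per-partition results. Other partitioning schemes, such as partitioning by term, would need a different merge step.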
Source
《哈尔滨工业大学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2005, No. 7, pp. 1007-1010 (4 pages)
Journal of Harbin Institute of Technology
Keywords
metadata schema
inverted index
parallel text processing
content- and structure-based query
metadata model
inverted index
parallel text system
search based on content and structure