Abstract
To locate relevant data within large volumes of network data, this paper studies improvements to suffix-array-based full-text index structures and designs and implements the weighted directed word graph (WDWG), a full-text index structure that lowers space occupancy while effectively improving indexing speed. Experiments show that, for the same problem size, the WDWG requires less storage space without degrading retrieval efficiency, making it a more efficient full-text index structure.
Source
《吉林大学学报(信息科学版)》
CAS
2013, No. 2, pp. 183-186 (4 pages)
Journal of Jilin University (Information Science Edition)
Funding
Supported by the Science and Technology Development Planning Fund of the Education Department of Jilin Province (2012373)
Keywords
suffix automaton
full-text index structure
weighted directed word graph (WDWG)