期刊文献+

基于后缀数组改进的全文索引结构研究

Improved Suffix Array-Based Full-Text Indexing Structures
下载PDF
导出
摘要 为在网络数据中搜索到所需相关数据,通过对基于后缀数组的全文索引结构的改进研究,设计和实现一种降低空间占用率并有效提高索引速度的全文索引结构———加权有向词图。通过实验证明,加权有向词图在相同问题规模下能降低存储空间,同时不影响检索的效率,是一种更为高效的全文索引结构。 How to search the data needed in the vast network data becomes the dominant Web search technology. Study on effective information retrieval algorithms and data structures becomes an important issue in this article suffix array-based full-text indexing structure. The goal is to design and implement a reduce space occupancy rate and effective full-text indexing speed to improve the index structure- WDWG (Weighted Directed Word Graph). Experiments show that the WDWG with the same size of the problem can reduce the word graph storage space, while not affecting the retrieval efficiency, a more efficient full-text index structure.
作者 刘畅 张猛
出处 《吉林大学学报(信息科学版)》 CAS 2013年第2期183-186,共4页 Journal of Jilin University(Information Science Edition)
基金 吉林省教育厅科技发展规划基金资助项目(2012373)
关键词 后缀自动机 全文索引结构 加权有向词图 suffix automaton full-text index structure weighted directed word graph(WDWG)
  • 相关文献

参考文献8

  • 1ANSELM BLUMER, BLUMER J, DAVID HAUSSLER, et al. The Smallest Automation Recognizing the Subwords of a Text [ J]. Theoretical Computer Science, 1985, 40(1 ) : 31-55.
  • 2CORASICK A M. Efficient String Matching: An Aid to Bibliographic Search [J]. Communications of the ACM, 1975, 18 (6) : 333-343.
  • 3BOYER R, MOORE J. A Fast String Searching Algorithm [ J]. Commun ACM, 1977, 20(10) : 762-772.
  • 4贺龙涛,方滨兴,余翔湛.一种时间复杂度最优的精确串匹配算法[J].软件学报,2005,16(5):676-683. 被引量:25
  • 5刘传汉,王永成,刘德荣,李党林.基于混合策略的单模式匹配算法[J].上海交通大学学报,2007,41(1):36-41. 被引量:3
  • 6UDI MANBER, EUGENE W MYERS. Suffix Arrays: A New Method for On-Line String Searches [ J ]. SIAM Journal on Computing, 1993, 22(5): 935-948.
  • 7PAOLO FERRAGINA, GIOVANNI MANZINI, VELI MAKINEN, et al. An Alphabet-Friendly FM-Index [ C]///Proceedings: llth International Conference, SPIRE 2004. Padova, Italy: [s. n. ], 2004: 150-160.
  • 8ZHANG Meng, HU Liang, LI Qiang. Weighted Directed Word Graph [ C] //Proceedings 16th Annual Symposium, CPM 2005. Jeju Island, Korea: Springer, 2005 : 156-167.

二级参考文献29

  • 1贺龙涛,方滨兴,余翔湛.一种时间复杂度最优的精确串匹配算法[J].软件学报,2005,16(5):676-683. 被引量:25
  • 2Charras C, Lecroq TT. Handbook of Exact String Matching Algorithms. London: King's College London Publications, 2004.
  • 3Knuth DE, Jr. Morris JH, Pratt VR. Fast pattern matching in strings. SIAM Journal on Computing, 1977,6(1):323-350.
  • 4Baeza-Yates RA, Gonnet GH. A new approach to text searching. Communications of the ACM, 1992,35(10):74-82.
  • 5Boyer RS, Moore JS. A fast string searching algorithm. Communications of the ACM, 1977,20(10):762-772.
  • 6Sunday DM. A very fast substring search algorithm. Communications of the ACM, 1990,33(8):132-142.
  • 7Cantone D, Faro S. Fast-Search: A new efficient variant of the Boyer-Moore string matching algorithm. In: Alberto A, Massimo M, eds. Proc. of the 2nd Int'l Workshop on Experimental and Efficient Algorithms (WEA 2003). Lecture Notes in Computer Science 2647, Heidelberg: Springer-Verlag, 2003.47-58.
  • 8Crochemore M, Rytter W. Text Algorithms. Oxford: Oxford University Press, 1994.
  • 9Lecroq T. A variation on the Boyer-Moore algorithm. Theoretical Computer Science, 1992,92(1):119-144.
  • 10Crochemore M, Hancart C. Automata for matching patterns. In: Rosenberg G, Salomaa A, eds. Handbook of Formal Languages,volume 2: Linear Modeling: Background and Application. Heidelberg: Springer-Verlag, 1997. 399-462.

共引文献24

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部