期刊文献+

基于改进哈夫曼编码的全文索引结构压缩算法 被引量:4

Compressed Format Full-Text Index Based on Improved Huffman Code and Its Implement
下载PDF
导出
摘要 为解决全文索引的索引结构压缩问题,提出了文本的基于正规哈夫曼编码小波树形式,并将该结构与后缀数组结合,实现了基于正规哈夫曼编码的小波树和高效构造算法。实验结果表明,在不降低运行效率的前提下,存储空间得到有效的压缩,从而证明了改进方法的有效性。 To solve the problem of index structures compression of full-text indexes, we introduce the canonical Huffman code to encode the BWT ( Burrows-Wheeler Transform) of a text. In the end, we present an efficient construction algorithm for this index, which is on-line and linear. Experimental results show that, without reducing the efficiency, the effective storage space compression is gained, which improves the effectiveness of the method.
作者 阚君满
出处 《吉林大学学报(信息科学版)》 CAS 2011年第5期473-476,共4页 Journal of Jilin University(Information Science Edition)
基金 吉林省教育厅科技发展规划基金资助项目(2008158)
关键词 全文索引 压缩 正规哈夫曼编码 full-text indexes compressed canonical Huffman code
  • 相关文献

参考文献8

  • 1GUSFIELD D. Algorithms on Strings Trees and Sequences [ M ]. New York: Cambridge University Press, 1997.
  • 2BLUMER A, BLUMER J, HAUSSLER D, et al. The Smallest Automation Recognizing the Subwords of a Text [J]. Theoret- ical Computer Science, 1985, 40 (1) : 31-55.
  • 3MANBER U, MYERS G. Suffix Arrays: A New Method for On-Line String Searches [ C] //Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms. PA, USA: Society for Industrial and Applied Mathematics Philadelphia, 1990 : 319-327.
  • 4HE Meng, MUNRO J I, RAO S S. A Categorization Theorem on Suffix Arrays with Applications to Space Efficient Text Inde- xes [ C] ///SIAM Symposium on Discrete Algorithms (SODA). PA, USA: Society for Industrial and Applied Mathematics Philadelphia, 2005: 23-32.
  • 5PAOLO FERRAGINA, GIOVANNI MANZINI, VELI MAKINEN, et al. An Alphabet-Friendly FM-Index [ C ] fflntemational Conference on String Processing and Information Retrieval. Berlin: Springer, 2004: 150-160.
  • 6FERRAGINA P, MANZINI G. Opportunistic Data Structures with Applications [ C ] //Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science. Washington, DC, USA: IEEE Computer Society, 2000: 390-398.
  • 7JACOBSON G. Succinct Static Data Structures [ R]. [ S. l. ] : Department of Computer Science, Carnegie-Mellon Universi- ty, 1989.
  • 8GROSSI R, VITTER J. Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching [C] ///Proceedings of the 32nd Acm Symposium on Theory of Computing. New York: ACM, 2000: 397-406.

同被引文献23

引证文献4

二级引证文献16

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部