Abstract
This paper studies how per-segment byte frequencies are distributed in random files. For 20k segments, most byte frequencies fall within a contiguous range of length 64, and very few points deviate from it. This regularity can be used to compress the frequency table. In addition, a 01-sign method is used to optimize the additional information, which reduces the space needed to store it. Compressed storage of the frequency table and the additional information is of great significance for realizing the overall constant grade compression technique.
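The regularity described in the abstract can be illustrated with a short sketch. The abstract does not give the paper's actual encoding, so everything below is an assumption made for illustration: "20k" is read as 20 × 1024 bytes, the window is chosen as the length-64 range covering the most frequencies, in-window frequencies are stored as offsets from the window's lower bound (each fits in 6 bits), and deviating points are stored separately as (byte, frequency) pairs. All function names are hypothetical.

```python
import random
from collections import Counter

SEGMENT_SIZE = 20 * 1024  # assumption: "20k" means 20 KiB
WINDOW = 64               # length of the contiguous frequency range


def random_segment(seed: int, size: int = SEGMENT_SIZE) -> bytes:
    """Generate one pseudo-random segment (stand-in for a random file)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def byte_frequencies(segment: bytes) -> list[int]:
    """Return the occurrence count of each of the 256 byte values."""
    counts = Counter(segment)
    return [counts.get(b, 0) for b in range(256)]


def best_window(freqs: list[int], width: int = WINDOW):
    """Find the range [low, low + width) that covers the most frequencies.

    Returns the lower bound and the list of (byte, frequency) outliers.
    """
    hist = Counter(freqs)
    best_low, best_cover = 0, -1
    for low in range(max(freqs) + 1):
        cover = sum(hist.get(v, 0) for v in range(low, low + width))
        if cover > best_cover:
            best_low, best_cover = low, cover
    outliers = [(b, f) for b, f in enumerate(freqs)
                if not best_low <= f < best_low + width]
    return best_low, outliers


def encode_table(freqs: list[int], width: int = WINDOW):
    """Hypothetical compact form: window base, small offsets, outlier list."""
    low, outliers = best_window(freqs, width)
    outlier_bytes = {b for b, _ in outliers}
    # Outlier slots hold a placeholder 0; their true values live in `outliers`.
    offsets = [0 if b in outlier_bytes else f - low
               for b, f in enumerate(freqs)]
    return low, offsets, outliers


def decode_table(low: int, offsets: list[int], outliers) -> list[int]:
    """Rebuild the full frequency table from the compact form."""
    freqs = [low + o for o in offsets]
    for b, f in outliers:
        freqs[b] = f
    return freqs


if __name__ == "__main__":
    freqs = byte_frequencies(random_segment(seed=1))
    low, offsets, outliers = encode_table(freqs)
    print("window:", (low, low + WINDOW - 1), "outliers:", len(outliers))
```

With 20480 uniformly random bytes, each byte value is expected about 80 times with a standard deviation of roughly 9, so a range of 64 spans several standard deviations around the mean. That is consistent with the abstract's observation that deviating points are rare.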
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2008, No. 3, pp. 175-177, 185 (4 pages)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (Grant No. 60673110)
Keywords
data compression
permutation and combination
frequency
constant grade compression