We propose a novel, lossless compression algorithm, based on the 2D Discrete Fast Fourier Transform, to approximate the Algorithmic (Kolmogorov) Complexity of Elementary Cellular Automata. Fast Fourier transforms are ...We propose a novel, lossless compression algorithm, based on the 2D Discrete Fast Fourier Transform, to approximate the Algorithmic (Kolmogorov) Complexity of Elementary Cellular Automata. Fast Fourier transforms are widely used in image compression but their lossy nature exclude them as viable candidates for Kolmogorov Complexity approximations. For the first time, we present a way to adapt fourier transforms for lossless image compression. The proposed method has a very strong Pearsons correlation to existing complexity metrics and we further establish its consistency as a complexity metric by confirming its measurements never exceed the complexity of nothingness and randomness (representing the lower and upper limits of complexity). Surprisingly, many of the other methods tested fail this simple sanity check. A final symmetry-based test also demonstrates our method’s superiority over existing lossless compression metrics. All complexity metrics tested, as well as the code used to generate and augment the original dataset, can be found in our github repository: ECA complexity metrics<sup>1</sup>.展开更多
To improve the classical lossless compression of low efficiency,a method of image lossless compression with high efficiency is presented.Its theory and the algorithm implementation are introduced.The basic approach of...To improve the classical lossless compression of low efficiency,a method of image lossless compression with high efficiency is presented.Its theory and the algorithm implementation are introduced.The basic approach of medical image lossless compression is then briefly described.After analyzing and implementing differential plus code modulation(DPCM)in lossless compression,a new method of combining an integer wavelet transform with DPCM to compress medical images is discussed.The analysis and simulation results show that this new method is simpler and useful.Moreover,it has high compression ratio in medical image lossless compression.展开更多
In this document, we present new techniques for near-lossless and lossy compression of SAR imagery saved in PNG and binary formats of magnitude and phase data based on the application of transforms, dimensionality red...In this document, we present new techniques for near-lossless and lossy compression of SAR imagery saved in PNG and binary formats of magnitude and phase data based on the application of transforms, dimensionality reduction methods, and lossless compression. In particular, we discuss the use of blockwise integer to integer transforms, subsequent application of a dimensionality reduction method, and Burrows-Wheeler based lossless compression for the PNG data and the use of high correlation based modeling of sorted transform coefficients for the raw floating point magnitude and phase data. The gains exhibited are substantial over the application of different lossless methods directly on the data and competitive with existing lossy approaches. The methods presented are effective for large scale processing of similar data formats as they are heavily based on techniques which scale well on parallel architectures.展开更多
For protecting the copyright of a text and recovering its original content harmlessly,this paper proposes a novel reversible natural language watermarking method that combines arithmetic coding and synonym substitutio...For protecting the copyright of a text and recovering its original content harmlessly,this paper proposes a novel reversible natural language watermarking method that combines arithmetic coding and synonym substitution operations.By analyzing relative frequencies of synonymous words,synonyms employed for carrying payload are quantized into an unbalanced and redundant binary sequence.The quantized binary sequence is compressed by adaptive binary arithmetic coding losslessly to provide a spare for accommodating additional data.Then,the compressed data appended with the watermark are embedded into the cover text via synonym substitutions in an invertible manner.On the receiver side,the watermark and compressed data can be extracted by decoding the values of synonyms in the watermarked text,as a result of which the original context can be perfectly recovered by decompressing the extracted compressed data and substituting the replaced synonyms with their original synonyms.Experimental results demonstrate that the proposed method can extract the watermark successfully and achieve a lossless recovery of the original text.Additionally,it achieves a high embedding capacity.展开更多
The capacity and the scale of smart substation are expanding constantly,with the characteristics of information digitization and automation,leading to a quantitative trend of data.Aiming at the existing processing sho...The capacity and the scale of smart substation are expanding constantly,with the characteristics of information digitization and automation,leading to a quantitative trend of data.Aiming at the existing processing shortages in the big data processing,the query and analysis of smart substation,a data compression processing method is proposed for analyzing smart substation and Hive.Experimental results show that the compression ratio and query time of RCFile storage format are better than those of TextFile and SequenceFile.The query efficiency is improved for data compressed by Deflate,Gzip and Lzo compression formats.The results verify the correctness of adjacent speedup defined as the index of cluster efficiency.Results also prove that the method has a significant theoretical and practical value for big data processing of smart substation.展开更多
We describe practical improvements for parallel BWT-based lossless compressors frequently utilized in modern day big data applications.We propose a clustering-based data permutation approach for improving compression...We describe practical improvements for parallel BWT-based lossless compressors frequently utilized in modern day big data applications.We propose a clustering-based data permutation approach for improving compression ratio for data with significant alphabet variation along with a faster string sorting approach based on the application of the O(n)complexity counting sort with permutation reindexing.展开更多
文摘We propose a novel, lossless compression algorithm, based on the 2D Discrete Fast Fourier Transform, to approximate the Algorithmic (Kolmogorov) Complexity of Elementary Cellular Automata. Fast Fourier transforms are widely used in image compression but their lossy nature exclude them as viable candidates for Kolmogorov Complexity approximations. For the first time, we present a way to adapt fourier transforms for lossless image compression. The proposed method has a very strong Pearsons correlation to existing complexity metrics and we further establish its consistency as a complexity metric by confirming its measurements never exceed the complexity of nothingness and randomness (representing the lower and upper limits of complexity). Surprisingly, many of the other methods tested fail this simple sanity check. A final symmetry-based test also demonstrates our method’s superiority over existing lossless compression metrics. All complexity metrics tested, as well as the code used to generate and augment the original dataset, can be found in our github repository: ECA complexity metrics<sup>1</sup>.
基金supported by the National Natural Science Foundation of China (Grant No.60475036).
文摘To improve the classical lossless compression of low efficiency,a method of image lossless compression with high efficiency is presented.Its theory and the algorithm implementation are introduced.The basic approach of medical image lossless compression is then briefly described.After analyzing and implementing differential plus code modulation(DPCM)in lossless compression,a new method of combining an integer wavelet transform with DPCM to compress medical images is discussed.The analysis and simulation results show that this new method is simpler and useful.Moreover,it has high compression ratio in medical image lossless compression.
文摘In this document, we present new techniques for near-lossless and lossy compression of SAR imagery saved in PNG and binary formats of magnitude and phase data based on the application of transforms, dimensionality reduction methods, and lossless compression. In particular, we discuss the use of blockwise integer to integer transforms, subsequent application of a dimensionality reduction method, and Burrows-Wheeler based lossless compression for the PNG data and the use of high correlation based modeling of sorted transform coefficients for the raw floating point magnitude and phase data. The gains exhibited are substantial over the application of different lossless methods directly on the data and competitive with existing lossy approaches. The methods presented are effective for large scale processing of similar data formats as they are heavily based on techniques which scale well on parallel architectures.
基金This project is supported by National Natural Science Foundation of China(No.61202439)partly supported by Scientific Research Foundation of Hunan Provincial Education Department of China(No.16A008)partly supported by Hunan Key Laboratory of Smart Roadway and Cooperative Vehicle-Infrastructure Systems(No.2017TP1016).
文摘For protecting the copyright of a text and recovering its original content harmlessly,this paper proposes a novel reversible natural language watermarking method that combines arithmetic coding and synonym substitution operations.By analyzing relative frequencies of synonymous words,synonyms employed for carrying payload are quantized into an unbalanced and redundant binary sequence.The quantized binary sequence is compressed by adaptive binary arithmetic coding losslessly to provide a spare for accommodating additional data.Then,the compressed data appended with the watermark are embedded into the cover text via synonym substitutions in an invertible manner.On the receiver side,the watermark and compressed data can be extracted by decoding the values of synonyms in the watermarked text,as a result of which the original context can be perfectly recovered by decompressing the extracted compressed data and substituting the replaced synonyms with their original synonyms.Experimental results demonstrate that the proposed method can extract the watermark successfully and achieve a lossless recovery of the original text.Additionally,it achieves a high embedding capacity.
基金This work is supported by National Natural Science Foundation of China(No.51267005)Jiangxi Province University Visiting Scholar Special Funds for Young Teacher Development Plan(No.G201415,No.GJJ13350).
文摘The capacity and the scale of smart substation are expanding constantly,with the characteristics of information digitization and automation,leading to a quantitative trend of data.Aiming at the existing processing shortages in the big data processing,the query and analysis of smart substation,a data compression processing method is proposed for analyzing smart substation and Hive.Experimental results show that the compression ratio and query time of RCFile storage format are better than those of TextFile and SequenceFile.The query efficiency is improved for data compressed by Deflate,Gzip and Lzo compression formats.The results verify the correctness of adjacent speedup defined as the index of cluster efficiency.Results also prove that the method has a significant theoretical and practical value for big data processing of smart substation.
文摘We describe practical improvements for parallel BWT-based lossless compressors frequently utilized in modern day big data applications.We propose a clustering-based data permutation approach for improving compression ratio for data with significant alphabet variation along with a faster string sorting approach based on the application of the O(n)complexity counting sort with permutation reindexing.