Abstract: We describe practical improvements for parallel lossless compressors based on the Burrows-Wheeler transform (BWT), which are frequently used in modern big data applications. We propose a clustering-based data permutation approach that improves the compression ratio for data with significant alphabet variation, along with a faster string sorting approach based on the O(n) counting sort with permutation reindexing.
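
To illustrate the general idea of counting sort applied to a permutation (an index array) rather than to the data itself, here is a minimal sketch. It is not the authors' implementation: the function names `counting_sort_pass` and `bwt_by_counting_sort` are hypothetical, the input is assumed to be a byte string with a 256-symbol alphabet, and the rotations are sorted with one stable counting-sort pass per column. Each pass is O(n), but this naive sketch runs n passes, so it is far slower overall than the approach the abstract describes.

```python
def counting_sort_pass(perm, key_of, alphabet_size=256):
    """Stably reorder the index array `perm` by key_of(index) in O(n + alphabet_size)."""
    counts = [0] * (alphabet_size + 1)
    for idx in perm:
        counts[key_of(idx) + 1] += 1
    # Prefix sums: counts[k] becomes the first output slot for key k.
    for k in range(alphabet_size):
        counts[k + 1] += counts[k]
    out = [0] * len(perm)
    for idx in perm:
        k = key_of(idx)
        out[counts[k]] = idx  # reindex the permutation, the data is never moved
        counts[k] += 1
    return out


def bwt_by_counting_sort(data: bytes) -> bytes:
    """Naive BWT: sort all rotations with one counting-sort pass per column.

    Illustration only (assumed setup, not the paper's algorithm).
    """
    n = len(data)
    perm = list(range(n))  # perm[i] = start offset of the i-th rotation
    # Least-significant column first; stability makes the final order lexicographic.
    for col in range(n - 1, -1, -1):
        perm = counting_sort_pass(perm, lambda i, c=col: data[(i + c) % n])
    # The BWT output is the last character of each sorted rotation.
    return bytes(data[(i - 1) % n] for i in perm)


if __name__ == "__main__":
    print(bwt_by_counting_sort(b"banana$"))
```

Only the index array is rewritten on each pass, which is the sense in which the permutation is "reindexed" here; the input bytes stay in place throughout.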