Funding: Supported in part by the Natural Science Foundation of China (No. 41375102), the National Key Basic Research and Development (973) Program of China (No. 2014CB347800), and the National High-Tech Research and Development (863) Program of China (No. 2011AA01A203).
Abstract: Scientific instruments and simulation programs generate large amounts of multidimensional array data. Scientists commonly use queries with value and dimension subsetting conditions to find useful information in large array data, so data storage and indexing methods play an important role in answering such queries efficiently. In this paper, we propose SwiftArray, a new storage layout with indexing techniques that accelerates queries with value and dimension subsetting conditions. In SwiftArray, the multidimensional array is divided into blocks, and each block stores its values in sorted order. Blocks are placed in the order of a Hilbert space-filling curve to improve data locality for dimension subsetting queries. We propose a 2-D-Bin method to build an index over the blocks' value ranges, which efficiently avoids accessing unnecessary blocks for value subsetting queries. Our evaluations show that SwiftArray outperforms the NetCDF-4 format and the FastBit indexing technique for queries on multidimensional arrays.
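The layout described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not SwiftArray's implementation: it handles only a square 2-D array whose block grid is a power of two on each side, and it replaces the 2-D-Bin index (whose details the abstract does not give) with a plain scan over each block's (min, max) value range. The names `hilbert_index`, `build_layout`, and `value_query` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def hilbert_index(n, x, y):
    """Position of cell (x, y) on the Hilbert curve over an n x n grid (n a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def build_layout(array, block):
    """Split a square 2-D list into block x block tiles, sort each tile's
    values, and order the tiles along the Hilbert curve."""
    n_blocks = len(array) // block
    tiles = []
    for bx in range(n_blocks):
        for by in range(n_blocks):
            cells = [
                (array[r][c], r, c)
                for r in range(bx * block, (bx + 1) * block)
                for c in range(by * block, (by + 1) * block)
            ]
            cells.sort()  # values sorted inside the block
            tiles.append({
                "pos": hilbert_index(n_blocks, bx, by),
                "min": cells[0][0],
                "max": cells[-1][0],
                "cells": cells,
            })
    tiles.sort(key=lambda t: t["pos"])
    return tiles


def value_query(tiles, lo, hi):
    """Return (row, col, value) for all cells with lo <= value <= hi,
    plus the number of blocks that actually had to be opened."""
    hits, scanned = [], 0
    for t in tiles:
        # Assumed stand-in for the 2-D-Bin index: skip by (min, max) range.
        if t["max"] < lo or t["min"] > hi:
            continue
        scanned += 1
        values = [v for v, _, _ in t["cells"]]
        start = bisect_left(values, lo)
        stop = bisect_right(values, hi)
        hits.extend((r, c, v) for v, r, c in t["cells"][start:stop])
    return hits, scanned
```

Because values are sorted inside each block, a value condition becomes a binary search within the block, and the per-block range check means blocks whose ranges do not overlap the query are never opened.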
Funding: Supported by RFBR grants 08-01-00115 and 09-01-12058, RFBR/DFG grant 09-01-91332, and the Priority Research Program of Dep. Math. RAS No. 3 and 5.
Abstract: Considering the problem of revealing the mode ranks of a d-dimensional array (tensor) given in canonical form, we propose a fast algorithm based on cross approximation of the Gram matrices of its unfoldings.
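Why Gram matrices of unfoldings are convenient for a tensor in canonical form can be seen from a standard identity. The notation below ($U_k$, $R$, $n_k$, $A_{(k)}$) is assumed for illustration and is not taken from the paper; $\odot$ denotes the Khatri-Rao (column-wise Kronecker) product and $\circledast$ the Hadamard (element-wise) product.

```latex
\begin{aligned}
A &= \sum_{r=1}^{R} u^{(1)}_r \otimes \cdots \otimes u^{(d)}_r,
  \qquad U_k = \bigl[\, u^{(k)}_1, \dots, u^{(k)}_R \,\bigr] \in \mathbb{R}^{n_k \times R}, \\
A_{(k)} &= U_k \Bigl( \bigodot_{j \neq k} U_j \Bigr)^{\!\top}, \\
A_{(k)} A_{(k)}^{\top} &= U_k \Bigl( \mathop{\circledast}_{j \neq k} U_j^{\top} U_j \Bigr) U_k^{\top}, \\
\operatorname{rank} A_{(k)} &= \operatorname{rank}\bigl( A_{(k)} A_{(k)}^{\top} \bigr).
\end{aligned}
```

The $n_k \times n_k$ Gram matrix is thus expressed through the small $R \times R$ matrices $U_j^{\top} U_j$, so its entries can be evaluated without forming the full tensor, and its rank equals the $k$-th mode rank. How the paper applies cross approximation to these matrices is not specified in the abstract.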