How can we efficiently store and mine dynamically generated dense tensors for modeling the behavior of multidimensional dynamic data?Much of the multidimensional dynamic data in the real world is generated in the form...How can we efficiently store and mine dynamically generated dense tensors for modeling the behavior of multidimensional dynamic data?Much of the multidimensional dynamic data in the real world is generated in the form of time-growing tensors.For example,air quality tensor data consists of multiple sensory values gathered from wide locations for a long time.Such data,accumulated over time,is redundant and consumes a lot ofmemory in its raw form.We need a way to efficiently store dynamically generated tensor data that increase over time and to model their behavior on demand between arbitrary time blocks.To this end,we propose a Block IncrementalDense Tucker Decomposition(BID-Tucker)method for efficient storage and on-demand modeling ofmultidimensional spatiotemporal data.Assuming that tensors come in unit blocks where only the time domain changes,our proposed BID-Tucker first slices the blocks into matrices and decomposes them via singular value decomposition(SVD).The SVDs of the time×space sliced matrices are stored instead of the raw tensor blocks to save space.When modeling from data is required at particular time blocks,the SVDs of corresponding time blocks are retrieved and incremented to be used for Tucker decomposition.The factor matrices and core tensor of the decomposed results can then be used for further data analysis.We compared our proposed BID-Tucker with D-Tucker,which our method extends,and vanilla Tucker decomposition.We show that our BID-Tucker is faster than both D-Tucker and vanilla Tucker decomposition and uses less memory for storage with a comparable reconstruction error.We applied our proposed BID-Tucker to model the spatial and temporal trends of air quality data collected in South Korea from 2018 to 2022.We were able to model the spatial and temporal air quality trends.We were also able to verify unusual events,such as chronic ozone alerts and large fire events.展开更多
Nonnegative Tucker3 decomposition(NTD) has attracted lots of attentions for its good performance in 3D data array analysis. However, further research is still necessary to solve the problems of overfitting and slow ...Nonnegative Tucker3 decomposition(NTD) has attracted lots of attentions for its good performance in 3D data array analysis. However, further research is still necessary to solve the problems of overfitting and slow convergence under the anharmonic vibration circumstance occurred in the field of mechanical fault diagnosis. To decompose a large-scale tensor and extract available bispectrum feature, a method of conjugating Choi-Williams kernel function with Gauss-Newton Cartesian product based on nonnegative Tucker3 decomposition(NTD_EDF) is investigated. The complexity of the proposed method is reduced from o(nNlgn) in 3D spaces to o(RiR2nlgn) in 1D vectors due to its low rank form of the Tucker-product convolution. Meanwhile, a simultaneously updating algorithm is given to overcome the overfitting, slow convergence and low efficiency existing in the conventional one-by-one updating algorithm. Furthermore, the technique of spectral phase analysis for quadratic coupling estimation is used to explain the feature spectrum extracted from the gearbox fault data by the proposed method in detail. The simulated and experimental results show that the sparser and more inerratic feature distribution of basis images can be obtained with core tensor by the NTD EDF method compared with the one by the other methods in bispectrum feature extraction, and a legible fault expression can also be performed by power spectral density(PSD) function. Besides, the deviations of successive relative error(DSRE) of NTD_EDF achieves 81.66 dB against 15.17 dB by beta-divergences based on NTD(NTD_Beta) and the time-cost of NTD EDF is only 129.3 s, which is far less than 1 747.9 s by hierarchical alternative least square based on NTD (NTD_HALS). The NTD_EDF method proposed not only avoids the data overfitting and improves the computation efficiency but also can be used to extract more inerratic and sparser bispectrum features of the gearbox fault.展开更多
基金supported by the Institute of Information&Communications Technology Planning&Evaluation (IITP)grant funded by the Korean government (MSIT) (No.2022-0-00369)by the NationalResearch Foundation of Korea Grant funded by the Korean government (2018R1A5A1060031,2022R1F1A1065664).
文摘How can we efficiently store and mine dynamically generated dense tensors for modeling the behavior of multidimensional dynamic data?Much of the multidimensional dynamic data in the real world is generated in the form of time-growing tensors.For example,air quality tensor data consists of multiple sensory values gathered from wide locations for a long time.Such data,accumulated over time,is redundant and consumes a lot ofmemory in its raw form.We need a way to efficiently store dynamically generated tensor data that increase over time and to model their behavior on demand between arbitrary time blocks.To this end,we propose a Block IncrementalDense Tucker Decomposition(BID-Tucker)method for efficient storage and on-demand modeling ofmultidimensional spatiotemporal data.Assuming that tensors come in unit blocks where only the time domain changes,our proposed BID-Tucker first slices the blocks into matrices and decomposes them via singular value decomposition(SVD).The SVDs of the time×space sliced matrices are stored instead of the raw tensor blocks to save space.When modeling from data is required at particular time blocks,the SVDs of corresponding time blocks are retrieved and incremented to be used for Tucker decomposition.The factor matrices and core tensor of the decomposed results can then be used for further data analysis.We compared our proposed BID-Tucker with D-Tucker,which our method extends,and vanilla Tucker decomposition.We show that our BID-Tucker is faster than both D-Tucker and vanilla Tucker decomposition and uses less memory for storage with a comparable reconstruction error.We applied our proposed BID-Tucker to model the spatial and temporal trends of air quality data collected in South Korea from 2018 to 2022.We were able to model the spatial and temporal air quality trends.We were also able to verify unusual events,such as chronic ozone alerts and large fire events.
基金supported by National Natural Science Foundation of China(Grant Nos.50875048,51175079,51075069)
文摘Nonnegative Tucker3 decomposition(NTD) has attracted lots of attentions for its good performance in 3D data array analysis. However, further research is still necessary to solve the problems of overfitting and slow convergence under the anharmonic vibration circumstance occurred in the field of mechanical fault diagnosis. To decompose a large-scale tensor and extract available bispectrum feature, a method of conjugating Choi-Williams kernel function with Gauss-Newton Cartesian product based on nonnegative Tucker3 decomposition(NTD_EDF) is investigated. The complexity of the proposed method is reduced from o(nNlgn) in 3D spaces to o(RiR2nlgn) in 1D vectors due to its low rank form of the Tucker-product convolution. Meanwhile, a simultaneously updating algorithm is given to overcome the overfitting, slow convergence and low efficiency existing in the conventional one-by-one updating algorithm. Furthermore, the technique of spectral phase analysis for quadratic coupling estimation is used to explain the feature spectrum extracted from the gearbox fault data by the proposed method in detail. The simulated and experimental results show that the sparser and more inerratic feature distribution of basis images can be obtained with core tensor by the NTD EDF method compared with the one by the other methods in bispectrum feature extraction, and a legible fault expression can also be performed by power spectral density(PSD) function. Besides, the deviations of successive relative error(DSRE) of NTD_EDF achieves 81.66 dB against 15.17 dB by beta-divergences based on NTD(NTD_Beta) and the time-cost of NTD EDF is only 129.3 s, which is far less than 1 747.9 s by hierarchical alternative least square based on NTD (NTD_HALS). The NTD_EDF method proposed not only avoids the data overfitting and improves the computation efficiency but also can be used to extract more inerratic and sparser bispectrum features of the gearbox fault.