Abstract
Vast amounts of heterogeneous marine observation data have accumulated owing to the rapid development of ocean observation technology. Several state-of-the-art methods have been proposed to manage the emerging Internet of Things (IoT) sensor data. However, an inefficient data management strategy during storage can lead to missing metadata; as a result, part of the sensor data cannot be indexed or utilized (i.e., a ‘data swamp’). Researchers have focused on optimizing storage procedures to prevent this outcome, but few have attempted to restore the missing metadata. In this study, we propose an AI-based algorithm that reconstructs the metadata of heterogeneous marine data in data swamps. First, a MapReduce algorithm preprocesses the raw marine data and extracts its feature tensors in parallel. Second, the feature tensors are loaded into a machine learning algorithm and a clustering operation is performed. The similarities between incoming data and the trained clustering results are then calculated. Finally, metadata reconstruction is performed based on existing marine observation data processing results. Experiments on existing datasets obtained from ocean observing systems verify the effectiveness of the algorithms. The results demonstrate the excellent performance of the proposed algorithm for metadata reconstruction of heterogeneous marine observation data.
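The pipeline outlined in the abstract (feature extraction, grouping, similarity scoring, metadata assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering method or the feature tensors, so nearest-centroid assignment over four summary statistics stands in for both, the map and reduce steps run sequentially rather than in parallel, and every function name, label, and data value below is hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def extract_features(values):
    """Map step (assumed features): summarise one raw series as a vector."""
    return (mean(values), pstdev(values), min(values), max(values))


def build_centroids(labelled):
    """Reduce step: average the feature vectors that share a metadata label."""
    groups = defaultdict(list)
    for label, values in labelled:
        groups[label].append(extract_features(values))
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in groups.items()
    }


def reconstruct_metadata(values, centroids):
    """Return the label whose centroid is nearest in Euclidean distance."""
    features = extract_features(values)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical toy records: (known metadata label, observed values).
labelled = [
    ("sea_temperature_degC", [14.1, 14.3, 14.2, 14.0]),
    ("sea_temperature_degC", [15.0, 15.2, 14.9, 15.1]),
    ("salinity_psu", [34.8, 35.0, 34.9, 35.1]),
    ("salinity_psu", [35.2, 35.1, 35.3, 35.0]),
]
centroids = build_centroids(labelled)

# A record whose metadata was lost; its label is inferred from similarity.
guess = reconstruct_metadata([14.6, 14.5, 14.7, 14.4], centroids)
print(guess)
```

Euclidean distance to a centroid is only one possible similarity measure; real marine series with overlapping value ranges would likely need richer features than these four statistics.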
Funding
Supported by the Shandong Province Natural Science Foundation (No. ZR2020QF028).