Funding: Supported by the Graduate Funded Project (No. JY2022A017).
Abstract: Frequent missing values in radar-derived time-series tracks of aerial targets (RTT-AT) pose significant challenges for downstream data-driven tasks. However, most imputation research focuses on the random missing (RM) pattern, which differs significantly from the missing patterns common in RTT-AT. Methods designed for RM may degrade or fail when applied to RTT-AT imputation, and conventional autoregressive deep learning methods are prone to error accumulation and long-term dependency loss. In this paper, a non-autoregressive imputation model is proposed that addresses missing value imputation for two common missing patterns in RTT-AT. The model consists of two probabilistic sparse diagonal masking self-attention (PSDMSA) units and a weight fusion unit. It learns missing values by combining the representations output by the two units, aiming to minimize the difference between the imputed values and their actual values. The PSDMSA units capture temporal dependencies and attribute correlations between time steps, improving imputation quality. The weight fusion unit automatically updates the weights of the output representations from the two units to obtain a more accurate final representation. The experimental results indicate that, across varying missing rates in the two missing patterns, the model consistently outperforms other methods in imputation performance and exhibits a low frequency of deviations in estimates for specific missing entries. Compared to the state-of-the-art autoregressive deep learning imputation model Bidirectional Recurrent Imputation for Time Series (BRITS), the proposed model reduces mean absolute error (MAE) by 31% to 50%. Additionally, the model trains 4 to 8 times faster than both BRITS and a standard Transformer model on the same dataset. Finally, the ablation experiments demonstrate that the PSDMSA, the weight fusion unit, the cascade network design, and the imputation loss each enhance imputation performance, confirming the efficacy of the design.
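The abstract does not spell out the internals of the PSDMSA unit, so the following is only a minimal sketch of the diagonal masking idea in plain single-head self-attention, written with NumPy. The function name, the projection matrices `w_q`, `w_k`, `w_v`, and the toy shapes are assumptions made for illustration; the probabilistic sparse part and the weight fusion unit are not modelled here.

```python
import numpy as np

def diagonal_masked_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention in which time step t cannot attend to itself.

    x has shape (T, d) with T >= 2; the projections have shape (d, d_k).
    Illustrative sketch only, not the PSDMSA unit from the paper.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Diagonal mask: each step must be estimated from the other steps.
    np.fill_diagonal(scores, -np.inf)
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy input: 5 time steps, 3 attributes, projected to 4 dimensions.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
w_q, w_k, w_v = (rng.normal(size=(3, 4)) for _ in range(3))
out, attn = diagonal_masked_self_attention(x, w_q, w_k, w_v)
```

Because the diagonal of the attention matrix is forced to zero, the representation of each time step is built only from the other time steps, which is what makes the operation usable for estimating a value that is itself missing.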
Funding: Supported by the Ministry of Knowledge Economy, Korea, under the Information Technology Research Center support program supervised by the National IT Industry Promotion Agency (No. NIPA-2011-C1090-1111-0008); the Special Research Program of Chonnam National University, 2009; and the LG Yonam Culture Foundation.
Abstract: Missing values occur in bio-signal processing for various reasons, including technical problems or biological characteristics. These missing values are then either simply excluded or substituted with estimated values for further processing. Electroencephalography (EEG) signals are an example where electrical signals arrive quickly and successively, so estimating their missing values requires rapid processing of high-speed data for immediate decision making. In this study, we propose an incremental expectation maximization principal component analysis (iEMPCA) method that automatically estimates missing values from multivariable EEG time series data without requiring a whole and complete data set. The proposed method avoids the biased model that inevitably results from simply removing incomplete data rather than estimating them, and thus reduces the loss of information by incorporating missing values in real time. By using an incremental approach, the proposed method also minimizes the memory usage and processing time for continuously arriving data. Experimental results show that the proposed method assigns more accurate missing values than previous methods.
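The abstract gives no update rules for iEMPCA, so the sketch below shows only the general EM-style PCA imputation idea in batch form, not the incremental algorithm itself. It alternates between fitting a low-rank PCA to the currently completed data and overwriting the missing entries with the PCA reconstruction. The function name, the `rank` and `n_iter` parameters, and the toy data are assumptions, and every column is assumed to contain at least one observed value.

```python
import numpy as np

def em_pca_impute(x, rank=1, n_iter=50):
    """Fill NaNs by alternating PCA fitting and reconstruction.

    Batch illustration of EM-style PCA imputation; it is not the
    incremental iEMPCA method described in the abstract.
    """
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    # Start from the column means of the observed entries.
    filled = np.where(missing, np.nanmean(x, axis=0), x)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        # Fit a rank-`rank` PCA to the current completed data.
        u, s, vt = np.linalg.svd(filled - mean, full_matrices=False)
        recon = mean + (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Replace only the missing entries; observed values stay fixed.
        filled[missing] = recon[missing]
    return filled

# Toy example: the second channel is twice the first, one value missing.
signal = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, np.nan]])
completed = em_pca_impute(signal, rank=1, n_iter=50)
```

Iterations of this kind can converge slowly, and each pass touches the whole data matrix, which is the cost an incremental formulation aims to avoid for continuously arriving signals.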
Funding: Partially supported by the National High-Tech Research and Development (863) Program of China (Nos. 2009AA11Z206 and 2011AA110401), the National Natural Science Foundation of China (Nos. 60721003 and 60834001), and the Tsinghua University Innovation Research Program (No. 2009THZ0).
Abstract: Complete and reliable field traffic data is vital for the planning, design, and operation of urban traffic management systems. However, traffic data is often very incomplete in many traffic information systems, which hinders effective use of the data. Methods are needed for imputing missing traffic data to minimize the effect of incomplete data on its utilization. This paper presents an improved Local Least Squares (LLS) approach to impute the incomplete data. The LLS is an improved version of the K Nearest Neighbor (KNN) method. First, the missing traffic data is replaced by a row average of the known values. Then, the vector angle and Euclidean distance are used to select the nearest neighbors. Finally, a regression step is used to obtain the weights of the nearest neighbors and the imputation results. Traffic flow volume collected in Beijing was analyzed to compare this approach with the Bayesian Principal Component Analysis (BPCA) imputation approach. Tests show that this approach performs slightly better than BPCA imputation on missing traffic data.
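The three steps named in the abstract (row-average fill, neighbor selection by vector angle and Euclidean distance, regression for neighbor weights) can be sketched as follows. This is a hedged illustration under stated assumptions rather than the paper's implementation: the function name `lls_impute`, the parameter `k`, and in particular the way distance and angle are combined into one ranking score are hypothetical, since the abstract does not specify them. Each row is assumed to have at least one known value.

```python
import numpy as np

def lls_impute(x, k=3):
    """Local-least-squares style imputation of NaNs, row by row (sketch)."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    # Step 1: provisional fill with each row's average of known values.
    row_mean = np.nanmean(x, axis=1, keepdims=True)
    filled = np.where(missing, row_mean, x)
    result = filled.copy()
    for i in np.flatnonzero(missing.any(axis=1)):
        obs = ~missing[i]
        others = np.delete(np.arange(len(x)), i)
        target = filled[i, obs]
        cand = filled[others][:, obs]
        # Step 2: rank candidates by Euclidean distance and vector angle.
        # The combined score below is an assumption made for illustration.
        dist = np.linalg.norm(cand - target, axis=1)
        cos = cand @ target / (
            np.linalg.norm(cand, axis=1) * np.linalg.norm(target) + 1e-12
        )
        nearest = others[np.argsort(dist * (2.0 - cos))[:k]]
        # Step 3: regress the target's known values on the neighbors,
        # then apply the fitted weights to the neighbors' values in the
        # columns where the target is missing.
        a = filled[nearest][:, obs].T  # shape (n_observed, k)
        w, *_ = np.linalg.lstsq(a, target, rcond=None)
        result[i, ~obs] = w @ filled[nearest][:, ~obs]
    return result

# Toy flow matrix: row 0 equals row 1 + row 2 and has one missing value.
flows = np.array([
    [1.0, 1.0, 2.0, np.nan],
    [1.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 1.0, 3.0],
    [10.0, 10.0, 10.0, 10.0],
    [-10.0, 5.0, 20.0, 0.0],
])
imputed = lls_impute(flows, k=2)
```

The regression step is what separates this from plain KNN imputation: rather than averaging the neighbors with fixed or distance-based weights, the weights are fitted so that the neighbors best reproduce the target row's known values.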