Time-series data provide important information in many fields, and their processing and analysis have been the focus of much research. However, detecting anomalies is very difficult due to data imbalance, temporal dependence, and noise. Therefore, methodologies for data augmentation and for converting time-series data into images for analysis have been studied. This paper proposes a fault detection model that uses time-series data augmentation and transformation to address the problems of data imbalance, temporal dependence, and robustness to noise. Data augmentation is performed by adding Gaussian noise, with the noise level set to 0.002, to maximize the generalization performance of the model. In addition, we use the Markov Transition Field (MTF) method to visualize the dynamic transitions of the data while converting the time-series data into images. This enables the identification of patterns in time-series data and helps capture the sequential dependencies of the data. For anomaly detection, the PatchCore model is applied and shows excellent performance, and the detected anomaly areas are represented as heat maps. By applying an anomaly map to the original image, it is possible to locate the areas where anomalies occur. The performance evaluation shows that both F1-score and accuracy are high when time-series data are converted to images. Additionally, processing the data as images rather than as time series significantly reduced both the size of the data and the training time. The proposed method can provide an important springboard for research on anomaly detection using time-series data. It also helps with problems such as analyzing complex patterns in data in a lightweight manner.
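The two preprocessing steps named in this abstract, Gaussian-noise augmentation and the Markov Transition Field, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes that the "noise level" of 0.002 is the standard deviation of the noise, and the function names, the quantile binning, and the bin count of 4 are hypothetical choices made for the example.

```python
import math
import random
from bisect import bisect_right


def add_gaussian_noise(series, sigma=0.002, seed=0):
    """Return a copy of the series with zero-mean Gaussian noise added.

    Assumption: the abstract's noise level 0.002 is used as the standard deviation.
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in series]


def markov_transition_field(series, n_bins=4):
    """Build an n x n Markov Transition Field from a 1-D series."""
    n = len(series)
    ordered = sorted(series)
    # Interior quantile cut points, so each bin holds roughly n / n_bins samples.
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    bins = [bisect_right(edges, x) for x in series]

    # Count first-order transitions between the bins of consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1

    # Row-normalise the counts into transition probabilities.
    trans = []
    for row in counts:
        total = sum(row)
        trans.append([c / total if total else 0.0 for c in row])

    # Entry (i, j) is the transition probability from the bin of sample i
    # to the bin of sample j, which spreads the dynamics over an image.
    return [[trans[bins[i]][bins[j]] for j in range(n)] for i in range(n)]


# Hypothetical input: a short sine wave standing in for sensor data.
signal = [math.sin(0.3 * t) for t in range(40)]
image = markov_transition_field(add_gaussian_noise(signal), n_bins=4)
```

The resulting square matrix can be rendered as a grayscale image and passed to an image-based anomaly detector; that detection step is outside the scope of this sketch.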
With the growing integration of aviation safety and artificial intelligence, research combining risk assessment and artificial intelligence is particularly important in the field of risk management, but finding an efficient and accurate risk assessment algorithm has become a challenge for the civil aviation industry. Because the various supervised deep learning algorithms used in flight safety cannot adequately address the quality of risk-level labels, an improved risk assessment algorithm (PS-AE-LSTM) based on a long short-term memory (LSTM) network with an autoencoder (AE) is proposed. First, based on the normal distribution characteristics of flight data, a probability severity (PS) model is established to enhance the quality of risk assessment labels. Second, an autoencoder is introduced to reconstruct the flight parameter data and improve data quality. Finally, exploiting the time-series nature of flight data, an LSTM network is used to classify the risk level and improve the accuracy of risk assessment. A risk assessment experiment was conducted on a fleet landing-phase dataset using the PS-AE-LSTM algorithm to assess the risk level associated with aircraft hard landing events. The results show that the proposed algorithm achieves an accuracy of 86.45% in a comparison with seven baseline models and has excellent risk assessment capability.
Ocean temperature is an important physical variable in marine ecosystems, and ocean temperature prediction is an important research objective in ocean-related fields. Currently, data-driven methods are among the most commonly used approaches to ocean temperature prediction, but research on them is mostly limited to the sea surface, with few studies on the prediction of internal ocean temperature. Existing graph neural network-based methods usually use predefined graphs or learned static graphs, which cannot capture the dynamic associations among data. In this study, we propose a novel dynamic spatiotemporal graph neural network (DSTGN) to predict three-dimensional ocean temperature (3D-OT). It combines static graph learning and dynamic graph learning to automatically mine two unknown dependencies between sequences from the original 3D-OT data without prior knowledge. Temporal and spatial dependencies in the time series are then captured using temporal and graph convolutions. We also integrate dynamic graph learning, static graph learning, graph convolution, and temporal convolution into an end-to-end framework for 3D-OT prediction using time-series grid data. We conducted prediction experiments using high-resolution 3D-OT from the Copernicus global ocean physical reanalysis, with data covering the vertical variation of temperature from the sea surface to 1000 m below the sea surface. We compared the method with five mainstream models that are commonly used for ocean temperature prediction, and the results showed that it achieved the best prediction results at all prediction scales.
Time series forecasting plays an important role in various fields, such as energy, finance, transport, and weather. Temporal convolutional networks (TCNs) based on dilated causal convolution have been widely used in time series forecasting. However, two problems weaken the performance of TCNs. One is that in dilated causal convolution, causal convolution leads to the receptive fields of outputs being concentrated in the earlier part of the input sequence, whereas recent input information is severely lost. The other is that the distribution shift problem in time series has not been adequately solved. To address the first problem, we propose a subsequence-based dilated convolution method (SDC). By using multiple convolutional filters to convolve elements of neighboring subsequences, the method extracts temporal features from a growing receptive field via a growing subsequence rather than a single element. Ultimately, the receptive field of each output element can cover the whole input sequence. To address the second problem, we propose a difference and compensation method (DCM). The method reduces the discrepancies between and within the input sequences by difference operations and then compensates the outputs for the information lost due to the difference operations. Based on SDC and DCM, we further construct a temporal subsequence-based convolutional network with difference (TSCND) for time series forecasting. The experimental results show that TSCND can reduce prediction mean squared error by 7.3% and save runtime, compared with state-of-the-art models and vanilla TCN.
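The general idea behind differencing and compensation can be illustrated with first-order differences. The sketch below is not the paper's DCM, whose details are not given in the abstract. It only shows the generic pattern: difference the input to remove level shifts, forecast in the differenced domain, then add the last observed level back. All function names are hypothetical, and the mean-difference forecaster is a stand-in for a learned model.

```python
def difference(series):
    """First-order differences: removes the level, keeps the changes."""
    return [b - a for a, b in zip(series, series[1:])]


def naive_diff_forecast(diffs, horizon):
    """Placeholder forecaster (assumption): repeat the mean observed difference."""
    mean_step = sum(diffs) / len(diffs)
    return [mean_step] * horizon


def compensate(last_value, predicted_diffs):
    """Restore the level lost by differencing via a running sum."""
    level = last_value
    restored = []
    for step in predicted_diffs:
        level += step
        restored.append(level)
    return restored


# Hypothetical history with a steady upward drift.
history = [10, 12, 14, 16]
predicted_steps = naive_diff_forecast(difference(history), horizon=3)
forecast = compensate(history[-1], predicted_steps)
```

Because the forecaster only sees differences, two input windows with the same shape but different offsets produce the same differenced input, which is one simple way to reduce discrepancies between input sequences.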
There are errors in multi-source uncertain time series data. Truth discovery methods for time series data are effective in finding more accurate values, but some have limitations in their usability. To tackle this challenge, we propose a new and convenient truth discovery method for time series data. A more accurate sample is closer to the truth and, consequently, to other accurate samples. Because the mutual-confirmation relationship between sensors is very similar to the mutual-citation relationship between web pages, we evaluate sensor reliability based on PageRank and then estimate the truth from sensor reliability. As a result, this method does not rely on smoothness assumptions or prior knowledge of the data. Finally, we validate the effectiveness and efficiency of the proposed method on real-world and synthetic data sets.
The aim of this study is to establish the prevailing conditions of changing climatic trends and change-point dates at four selected meteorological stations, Uyo, Benin, Port Harcourt, and Warri, in the Niger Delta region of Nigeria. Daily (24-hourly) annual maximum series (AMS) data were used, and the Indian Meteorological Department (IMD) and the modified Chowdury Indian Meteorological Department (MCIMD) models were adopted to downscale the time series data. The Mann-Kendall (MK) trend test and Sen's Slope Estimator (SSE) showed a statistically significant trend for Uyo and Benin, while Port Harcourt and Warri showed mild trends. The Sen's slope magnitudes (variation rates) were 21.6, 10.8, 6.00, and 4.4 mm/decade, respectively. The trend change-point analysis showed the initial rainfall change-point dates as 2002, 2005, 1988, and 2000 for Uyo, Benin, Port Harcourt, and Warri, respectively. These results indicate positive changes in climatic conditions for rainfall in the study area. The analysis and design of erosion and flood control facilities in the Niger Delta will require the application of non-stationary IDF modelling.
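The Mann-Kendall test and Sen's slope estimator used in this study are standard non-parametric trend tools and are easy to compute directly. The sketch below is a minimal version that assumes no tied values (so the variance formula has no tie correction) and uses a two-sided normal approximation. The rainfall numbers are made up for illustration and are not the study's data.

```python
import math
from statistics import median


def mann_kendall(values):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumption: no ties in the data, so no tie correction is applied.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return s, z, p_value


def sens_slope(values):
    """Median of all pairwise slopes, assuming equally spaced observations."""
    n = len(values)
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Hypothetical annual maximum rainfall (mm), one value per year.
rainfall = [80.0, 84.5, 83.0, 90.2, 95.1, 93.7, 101.4, 104.0, 110.3, 108.8]
s_stat, z_stat, p_val = mann_kendall(rainfall)
slope_per_year = sens_slope(rainfall)
```

A slope expressed per year can be multiplied by 10 to obtain a per-decade rate, which is the unit reported in the abstract.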
The COVID-19 pandemic continues to impact daily life worldwide. It would be valuable to obtain valid information from the COVID-19 pandemic sequential data itself for characterizing the pandemic. Here, we aim to demonstrate that it is feasible to analyze the patterns of the pandemic using a time-series clustering approach. In this work, we use dynamic time warping distance and hierarchical clustering to cluster time series of daily new cases and deaths from different countries into four patterns. We find that geographic factors have a large but not decisive influence on the pattern of pandemic development. Moreover, the age structure of the population may also influence the formation of cluster patterns. Our validated method may provide a different and useful perspective for other scholars and researchers.
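The combination of dynamic time warping (DTW) and hierarchical clustering described above can be sketched without external libraries. The abstract does not state which linkage rule or normalisation was used, so average linkage is an assumption here, and the four short series are toy data rather than case counts.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerative(series_list, n_clusters):
    """Bottom-up clustering on DTW distances (average linkage is an assumption)."""
    n = len(series_list)
    dist = {}
    for i in range(n):
        for j in range(i + 1, n):
            dist[(i, j)] = dtw_distance(series_list[i], series_list[j])

    def linkage(c1, c2):
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Toy curves: two rising and two falling, with different lengths.
curves = [[0, 1, 2, 3], [0, 0, 1, 2, 3], [5, 4, 3, 2], [5, 5, 4, 3, 2]]
groups = agglomerative(curves, n_clusters=2)
```

DTW tolerates the time shift between the first two curves, so they end up in the same group even though their lengths differ.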
In the field of global change, the relationship between plant phenology and climate, which reflects the response of terrestrial ecosystems to global climate change, has become a key subject of great concern. Using moderate-resolution imaging spectroradiometer (MODIS) enhanced vegetation index (EVI) data collected every eight days during January-July from 2005 to 2008 and the corresponding remote sensing data as experimental materials, we constructed cloud-free images via harmonic analysis of time series (HANTS). The cloud-free images were then processed with a dynamic threshold method to obtain the vegetation phenology in the green-up period and its distribution pattern. The distribution patterns of a freezing-disaster year and a normal year were compared to reveal the effect of the freezing disaster on vegetation phenology in the experimental plot. The results showed that the processed EVI data performed well in monitoring the effect of the freezing disaster on vegetation phenology, accurately reflecting the regions that suffered from the freezing disaster. This suggests that processing remote sensing data with the HANTS method can effectively monitor the ecological characteristics of vegetation.
Time series forecasting and analysis are widely used in many fields and application scenarios. Historical time series data reflect patterns and trends of change, which can to a certain extent support applications and decisions in each scenario. In this paper, we select the time series prediction problem in the atmospheric environment scenario as the starting point for applied research. For data support, we obtained data on nearly 3500 vehicles in some cities in China from Runwoda Research Institute, focusing on the major pollutant emission data of non-road mobile machinery and high-emission vehicles in Beijing and Bozhou, Anhui Province, to build the dataset and conduct time series prediction experiments on it. This paper proposes a P-gLSTNet model and uses the autoregressive integrated moving average (ARIMA) model, long short-term memory (LSTM), and Prophet to predict and compare emissions in the future period. The experiments are validated on four public data sets and one self-collected data set, and the mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) are selected as the evaluation metrics. The experimental results show that the proposed P-gLSTNet fusion model yields lower prediction error, outperforms the backbone method, and is more suitable for the prediction of time-series data in this scenario.
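The three evaluation metrics named in this abstract have standard definitions, shown here as a small reference sketch. The observed and predicted values are hypothetical; MAPE is reported in percent and is only defined when no observed value is zero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical emission readings and model predictions.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
scores = {
    "MAE": mae(observed, forecast),
    "RMSE": rmse(observed, forecast),
    "MAPE": mape(observed, forecast),
}
```

RMSE penalises large individual errors more heavily than MAE, while MAPE is scale-free, which is why the three are often reported together.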
In this paper, we present a cluster-based algorithm for time series outlier mining. We use the discrete Fourier transform (DFT) to transform time series from the time domain to the frequency domain, so that each time series can be mapped to a point in k-dimensional space. A cluster-based algorithm is then developed to mine outliers from these points. The algorithm first partitions the input points into disjoint clusters and then prunes the clusters that are judged unable to contain outliers. Our algorithm has been run on the electrical load time series of a steel enterprise and proved to be effective.
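The mapping from a time series to a point in k-dimensional space can be sketched with a direct DFT. The abstract does not say which coefficients or which representation (real and imaginary parts, or magnitudes) the authors keep, so using the normalised magnitudes of the first k coefficients is an assumption, as is the function name.

```python
import cmath
import math


def dft_features(series, k):
    """Map a series to k features: magnitudes of its first k DFT coefficients.

    Assumption: magnitudes divided by the series length are used as coordinates.
    """
    n = len(series)
    features = []
    for freq in range(k):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq * t / n)
            for t, x in enumerate(series)
        )
        features.append(abs(coeff) / n)
    return features


# Hypothetical load curve: one full cosine cycle over 8 samples.
load = [math.cos(2 * math.pi * t / 8) for t in range(8)]
point = dft_features(load, k=3)
```

Each series becomes one such point, after which any clustering or distance-based outlier procedure can operate on the points instead of the raw series.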
On the assumption that random interruptions in the observation process are modeled by a sequence of independent Bernoulli random variables, we first generalize two kinds of nonlinear filtering methods with random interruption failures in the observation, based on the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), which are abbreviated as GEKF and GUKF in this paper, respectively. Then the nonlinear filtering model is established by using the radial basis function neural network (RBFNN) prototypes and the network weights as the state equation and the output of the RBFNN to represent the observation equation. Finally, we treat the filtering problem under missing observed data as a special case of nonlinear filtering with random intermittent failures by setting each missing value to zero, without needing to pre-estimate the missing data, and use the GEKF-based RBFNN and the GUKF-based RBFNN to predict the ground radioactivity time series with missing data. Experimental results demonstrate that the predictions of the GUKF-based RBFNN agree well with the real ground radioactivity time series, while the predictions of the GEKF-based RBFNN diverge.
Data Mining (DM) methods are increasingly used in prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction, an area that has been attracting a great deal of attention from researchers in the field. The main contribution of this paper is to provide an outline of the use of DM with time series data, mainly using examples related to short-term stock prediction, which is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
Clustering is used to gain intuition about the structures in data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure onto the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated on several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
By employing the unique phenological feature of winter wheat extracted from the peak before winter (PBW) and the advantages of moderate resolution imaging spectroradiometer (MODIS) data, with their high temporal resolution and intermediate spatial resolution, a remote sensing-based model for mapping winter wheat on the North China Plain was built through integration with Landsat images and land-use data. First, a phenological window, the PBW, was extracted from time-series MODIS data. Next, feature extraction was performed on the PBW to reduce the feature dimension and enhance its information. Finally, a regression model was built to model the relationship between the phenological feature and the sample data. The amount of information in the PBW was evaluated and compared with that of the main peak (MP). The relative precision of the mapping reached up to 92% in comparison with the Landsat sample data, and ranged between 87% and 96% in comparison with the statistical data. These results were sufficient to satisfy the accuracy requirements for winter wheat mapping at a large scale. Moreover, the proposed method can obtain the distribution information for winter wheat earlier in the season than previous studies. This study could shed light on the monitoring of winter wheat in China using its unique phenological feature.
Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not use a simulation to generate pseudo-missing data, but instead performs imputation on actual missing data and measures the performance of the forecasting model created from the result. In the experiment, several time series forecasting models are trained using different training datasets, each prepared with one imputation method. Subsequently, the performance of the imputation methods is evaluated by comparing the accuracy of the forecasting models. The results obtained from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with the other imputation methods.
Underground coal fires are one of the most common and serious geohazards in most coal-producing countries in the world. Monitoring their spatio-temporal changes plays an important role in controlling and preventing the effects of coal fires and their environmental impact. In this study, the spatio-temporal changes of underground coal fires in the Khanh Hoa coal field (North-East of Viet Nam) were analyzed using Landsat time-series data for the 2008-2016 period. Based on land surface temperatures retrieved from Landsat thermal data, underground coal fires related to thermal anomalies were identified using the MEDIAN + 1.5×IQR (IQR: interquartile range) threshold technique. The locations of underground coal fires were validated using a coal fire map produced from field survey data and cross-validated using daytime ASTER thermal infrared (TIR) imagery. Based on the fires extracted from seven Landsat thermal images, the spatio-temporal changes of underground coal fire areas were analyzed. The results showed that the thermally anomalous zones were correlated with known coal fires. Cross-validation of coal fires using ASTER TIR data showed a high consistency of 79.3%. The largest coal fire area, 184.6 hectares, was detected in 2010, followed by 2014 (181.1 hectares) and 2016 (178.5 hectares). Smaller coal fire areas of 133.6 and 152.5 hectares were extracted in 2011 and 2009, respectively. Underground coal fires were mainly detected in the northern and southern parts of the coal field and tend to spread to the north-west.
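The MEDIAN + 1.5×IQR rule mentioned above flags pixels whose land surface temperature lies well above the bulk of the scene. A minimal sketch follows. The temperature values are invented for illustration, and the quartile interpolation method (the default of Python's `statistics.quantiles`) is an assumption, since the abstract does not specify how quartiles were computed.

```python
from statistics import median, quantiles


def thermal_anomaly_threshold(temperatures):
    """Threshold = median + 1.5 * IQR of the land surface temperatures."""
    q1, _, q3 = quantiles(temperatures, n=4)  # quartile cut points
    return median(temperatures) + 1.5 * (q3 - q1)


def thermal_anomalies(temperatures):
    """Return the values that exceed the median + 1.5 * IQR threshold."""
    threshold = thermal_anomaly_threshold(temperatures)
    return [t for t in temperatures if t > threshold]


# Hypothetical land surface temperatures (degrees Celsius) for a few pixels.
lst_pixels = [20, 21, 22, 23, 24, 25, 26, 40]
hot_pixels = thermal_anomalies(lst_pixels)
```

Because both the median and the IQR are robust statistics, a handful of very hot pixels does not raise the threshold much, which keeps the rule usable when fires occupy only a small part of the image.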
In the era of big data, the general public has more access to big data but is reluctant to analyze it. Traditional data visualization, which requires a certain level of expertise, is therefore not easily accepted by a general public living at a fast pace. Against this background, a new general visualization method for dynamic time series data has emerged. Time series data visualization organizes abstract and hard-to-understand data into a form that is easily understood by the public. The method integrates data visualization into short videos, which is more in line with the way people get information in modern fast-paced lifestyles. Its modular approach also facilitates public participation in production. This paper summarizes the dynamic visualization methods for time series data ranking, reviews the relevant literature, discusses their value and existing problems, and gives corresponding suggestions and future research prospects.
Based on 16-day composite MODIS (moderate resolution imaging spectroradiometer) NDVI (normalized difference vegetation index) time-series data from 2004, vegetation in the North Tibet Plateau was classified, and seasonal variations at pixels selected from different vegetation types were analyzed. The Savitzky-Golay filtering algorithm was applied to filter the MODIS-NDVI time-series data. The processed time-series curves reflect the real variation trend of vegetation growth. The NDVI time-series curves of coniferous forest, high-cold meadow, high-cold meadow steppe, and high-cold steppe all show a single-peak pattern during vegetation growth, with the maximum occurring in August. A decision-tree classification model was established from the NDVI time-series data and land surface temperature data, and vegetation was then classified with the model based on the NDVI time-series curves. An accuracy test shows that the classification results are highly accurate and credible, and that the model is useful for studying climate variation and estimating vegetation production at regional and even global scales.
In this study, we developed software for vehicle big data analysis to analyze the time-series data of connected vehicles. We designed two software modules: the first derives Pearson correlation coefficients to analyze the collected data, and the second conducts exploratory data analysis of the collected vehicle data. In particular, we analyzed the dangerous driving patterns of motorists based on the safety standards of the Korea Transportation Safety Authority. We also analyzed seasonal fuel efficiency (four seasons) and vehicle mileage, and identified rapid acceleration, rapid deceleration, sudden stopping (harsh braking), quick starting, sudden left turn, sudden right turn, and sudden U-turn driving patterns. We implemented the density-based spatial clustering of applications with noise (DBSCAN) algorithm for trajectory analysis based on GPS (Global Positioning System) data, and designed a long short-term memory algorithm and an auto-regressive integrated moving average model for time-series data analysis. In this paper, we mainly describe the development environment of the analysis software, the structure and data flow of the overall analysis platform, the configuration of the collected vehicle data, and the various algorithms used in the analysis. Finally, we present illustrative results of our analysis, such as the dangerous driving patterns that were detected.
The mining of rules from electrical load time series data collected from the EMS (Energy Management System) is discussed. The data from the EMS are too large and complex to be understood and used directly by power system engineers, yet useful information is hidden in the electrical load data. The authors discuss the use of fuzzy linguistic summaries as a data mining method to induce rules from the electrical load time series. Data preprocessing techniques are also discussed in the paper.
Funding: This research was financially supported by the Ministry of Trade, Industry, and Energy (MOTIE), Korea, under the "Project for Research and Development with Middle Markets Enterprises and DNA (Data, Network, AI) Universities" (AI-based Safety Assessment and Management System for Concrete Structures) (Reference Number P0024559), supervised by the Korea Institute for Advancement of Technology (KIAT).
Funding: The National Natural Science Foundation of China (U2033213) and the Fundamental Research Funds for the Central Universities (FZ2021ZZ01, FZ2022ZX50).
Funding: The National Key R&D Program of China under contract No. 2021YFC3101603.
Funding: Supported by the National Key Research and Development Program of China (No. 2018YFB2101300), the National Natural Science Foundation of China (Grant No. 61871186), and the Dean's Fund of Engineering Research Center of Software/Hardware Co-Design Technology and Application, Ministry of Education (East China Normal University).
Abstract: Time series forecasting plays an important role in various fields, such as energy, finance, transport, and weather. Temporal convolutional networks (TCNs) based on dilated causal convolution have been widely used in time series forecasting. However, two problems weaken the performance of TCNs. One is that in dilated causal convolution, causal convolution leads to the receptive fields of outputs being concentrated in the earlier part of the input sequence, whereas recent input information is severely lost. The other is that the distribution shift problem in time series has not been adequately solved. To address the first problem, we propose a subsequence-based dilated convolution method (SDC). By using multiple convolutional filters to convolve elements of neighboring subsequences, the method extracts temporal features from a growing receptive field via a growing subsequence rather than a single element. Ultimately, the receptive field of each output element can cover the whole input sequence. To address the second problem, we propose a difference and compensation method (DCM). The method reduces the discrepancies between and within the input sequences by difference operations and then compensates the outputs for the information lost due to difference operations. Based on SDC and DCM, we further construct a temporal subsequence-based convolutional network with difference (TSCND) for time series forecasting. The experimental results show that TSCND can reduce prediction mean squared error by 7.3% and save runtime, compared with state-of-the-art models and vanilla TCN.
Funding: National Natural Science Foundation of China (No. 62002131); Shuangchuang Ph.D Award (from World Prestigious Universities) of Jiangsu Province, China (No. JSSCBS20211179).
Abstract: Multi-source uncertain time series data contain errors. Truth discovery methods for time series data are effective in finding more accurate values, but some have limitations in their usability. To tackle this challenge, we propose a new and convenient truth discovery method to handle time series data. A more accurate sample is closer to the truth and, consequently, to other accurate samples. Because the mutual-confirm relationship between sensors is very similar to the mutual-quote relationship between web pages, we evaluate sensor reliability based on PageRank and then estimate the truth by sensor reliability. Therefore, this method does not rely on smoothness assumptions or prior knowledge of the data. Finally, we validate the effectiveness and efficiency of the proposed method on real-world and synthetic data sets, respectively.
Abstract: The aim of this study is to establish the prevailing conditions of changing climatic trends and change point dates in four selected meteorological stations of Uyo, Benin, Port Harcourt, and Warri in the Niger Delta region of Nigeria. Daily (24-hourly) annual maximum series (AMS) data were used, and the Indian Meteorological Department (IMD) and the modified Chowdury Indian Meteorological Department (MCIMD) models were adopted to downscale the time series data. The Mann-Kendall (MK) trend and Sen's Slope Estimator (SSE) tests showed a statistically significant trend for Uyo and Benin, while Port Harcourt and Warri showed mild trends. The Sen's Slope magnitude and variation rate were 21.6, 10.8, 6.00 and 4.4 mm/decade, respectively. The trend change-point analysis showed the initial rainfall change-point dates as 2002, 2005, 1988, and 2000 for Uyo, Benin, Port Harcourt, and Warri, respectively. These results confirm a positive change in the climatic conditions for rainfall in the study area. The analysis and design of erosion and flood control facilities in the Niger Delta will require the application of non-stationary IDF modelling.
Funding: Jointly supported by the National Natural Science Foundation of China (Grant Nos. 11971074, 61671005).
Abstract: The COVID-19 pandemic continues to impact daily life worldwide. It would be helpful and valuable to obtain valid information from the COVID-19 pandemic sequential data itself for characterizing the pandemic. Here, we aim to demonstrate that it is feasible to analyze the patterns of the pandemic using a time-series clustering approach. In this work, we use dynamic time warping distance and hierarchical clustering to cluster time series of daily new cases and deaths from different countries into four patterns. We find that geographic factors have a large but not decisive influence on the pattern of pandemic development. Moreover, the age structure of the population may also influence the formation of cluster patterns. Our validated method may provide a different but very useful perspective for other scholars and researchers.
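The dynamic time warping distance mentioned in the abstract above can be illustrated with a minimal sketch. This is the textbook dynamic-programming formulation with an absolute-difference local cost, not necessarily the variant the authors used; the function name `dtw_distance` is hypothetical, and the hierarchical clustering step is omitted.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping distance.

    Assumption: the local cost is the absolute difference |a_i - b_j|;
    the abstract does not say which local cost or window was used.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A pairwise matrix of such distances between per-country series could then be passed to any hierarchical clustering routine; unlike a pointwise distance, DTW tolerates curves that share a shape but are shifted or stretched in time.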
Abstract: In the field of global change, the relationship between plant phenology and climate, which reflects the response of terrestrial ecosystems to global climate change, has become a key subject of great concern. Using the moderate-resolution imaging spectroradiometer (MODIS) enhanced vegetation index (EVI) collected every eight days during January-July from 2005 to 2008 and the corresponding remote sensing data as experimental materials, we constructed cloud-free images via harmonic analysis of time series (HANTS). The cloud-free images were then processed with a dynamic threshold method to obtain the vegetation phenology in the green-up period and its distribution pattern. The distribution patterns of the freezing-disaster year and a normal year were then compared to reveal the effect of the freezing disaster on vegetation phenology in the experimental plot. The results showed that the processed EVI data performed well in monitoring the effect of the freezing disaster on vegetation phenology, accurately reflecting the regions that suffered from the freezing disaster. This result suggests that processing remote sensing data with the HANTS method can effectively monitor the ecological characteristics of vegetation.
Funding: The Beijing Chaoyang District Collaborative Innovation Project (No. CYXT2013); the subject support of Beijing Municipal Science and Technology Key R&D Program-Capital Blue Sky Action Cultivation Project (Z19110900910000); "Research and Demonstration of High Emission Vehicle Monitoring Equipment System Based on Sensor Integration Technology" (Z19110000911003). This work was supported by the Academic Research Projects of Beijing Union University (No. ZK80202103).
Abstract: Time series forecasting and analysis are widely used in many fields and application scenarios. Historical time series data reflect patterns and trends of change, which can to some extent support applications and decisions in each scenario. In this paper, we take time series prediction in the atmospheric environment scenario as the starting point for applied research. For data support, we obtained data on nearly 3500 vehicles in some cities in China from Runwoda Research Institute, focusing on the major pollutant emission data of non-road mobile machinery and high emission vehicles in Beijing and Bozhou, Anhui Province, to build the dataset and conduct time series prediction experiments on it. This paper proposes a P-gLSTNet model and uses the autoregressive integrated moving average model (ARIMA), long short-term memory (LSTM), and Prophet to predict and compare the emissions in a future period. The experiments are validated on four public data sets and one self-collected data set, and the mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) are selected as the evaluation metrics. The experimental results show that the proposed P-gLSTNet fusion model yields lower prediction error, outperforms the backbone method, and is more suitable for the prediction of time-series data in this scenario.
Abstract: In this paper, we present a cluster-based algorithm for time series outlier mining. We use the discrete Fourier transform (DFT) to transform time series from the time domain to the frequency domain, so that each time series can be mapped to a point in k-dimensional space. A cluster-based algorithm is then developed to mine the outliers from these points. The algorithm first partitions the input points into disjoint clusters and then prunes the clusters that are judged not to contain outliers. Our algorithm was run on the electrical load time series of a steel enterprise and proved to be effective.
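The DFT mapping described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: it keeps the normalized magnitudes of the first k DFT coefficients as the k-dimensional point, whereas the abstract does not say which coefficients or which representation (magnitude, or real and imaginary parts) the authors used. The function name `dft_features` is hypothetical, and the clustering and pruning steps are not shown.

```python
import cmath
import math


def dft_features(series, k):
    """Map a time series to a k-dimensional point of DFT magnitudes.

    Assumption: the point consists of |X_f| / n for f = 0 .. k-1.
    """
    n = len(series)
    feats = []
    for f in range(k):
        # Direct evaluation of the f-th DFT coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n)
                    for t, x in enumerate(series))
        feats.append(abs(coeff) / n)
    return feats
```

For example, `dft_features(load_curve, 4)` would turn one load curve into a 4-dimensional point, where `load_curve` is a placeholder for any list of numbers. The direct sum costs O(n·k) per series, which is adequate for small k; an FFT routine would be the usual choice for long series.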
Funding: Project supported by the State Key Program of the National Natural Science Foundation of China (Grant No. 60835004), the Natural Science Foundation of Jiangsu Province of China (Grant No. BK2009727), the Natural Science Foundation of Higher Education Institutions of Jiangsu Province of China (Grant No. 10KJB510004), and the National Natural Science Foundation of China (Grant No. 61075028).
Abstract: On the assumption that random interruptions in the observation process are modeled by a sequence of independent Bernoulli random variables, we first generalize two nonlinear filtering methods to handle random interruption failures in the observation, based on extended Kalman filtering (EKF) and unscented Kalman filtering (UKF); these are abbreviated as GEKF and GUKF in this paper, respectively. The nonlinear filtering model is then established by using the radial basis function neural network (RBFNN) prototypes and the network weights as the state equation and the output of the RBFNN to represent the observation equation. Finally, we treat the filtering problem under missing observed data as a special case of nonlinear filtering with random intermittent failures by setting each missing datum to zero, without needing to pre-estimate the missing data, and use the GEKF-based RBFNN and the GUKF-based RBFNN to predict the ground radioactivity time series with missing data. Experimental results demonstrate that the predictions of the GUKF-based RBFNN accord well with the real ground radioactivity time series, while the predictions of the GEKF-based RBFNN diverge.
Abstract: Data Mining (DM) methods are being increasingly used in prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction. This is an area that has been attracting a great deal of attention from researchers in the field. The main contribution of this paper is to provide an outline of the use of DM with time series data, using mainly examples related to short-term stock prediction. This is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
Abstract: Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure on the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce a meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated on several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Funding: Supported by the open research fund of the Key Laboratory of Agri-informatics, Ministry of Agriculture, and the fund of Outstanding Agricultural Researcher, Ministry of Agriculture, China.
Abstract: By employing the unique phenological feature of winter wheat extracted from the peak before winter (PBW) and the advantages of moderate resolution imaging spectroradiometer (MODIS) data with high temporal resolution and intermediate spatial resolution, a remote sensing-based model for mapping winter wheat on the North China Plain was built through integration with Landsat images and land-use data. First, a phenological window, the PBW, was drawn from time-series MODIS data. Next, feature extraction was performed for the PBW to reduce the feature dimension and enhance its information. Finally, a regression model was built to model the relationship between the phenological feature and the sample data. The amount of information in the PBW was evaluated and compared with that of the main peak (MP). The relative precision of the mapping reached up to 92% in comparison to the Landsat sample data, and ranged between 87% and 96% in comparison to the statistical data. These results were sufficient to satisfy the accuracy requirements for winter wheat mapping at a large scale. Moreover, the proposed method can obtain the distribution information for winter wheat earlier in the season than previous studies. This study could throw light on the monitoring of winter wheat in China by using its unique phenological feature.
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (Grant Number 2020R1A6A1A03040583).
Abstract: Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not include a simulation to generate pseudo-missing data, but instead performs imputation on actual missing data and measures the performance of the forecasting model created therefrom. In the experiments, therefore, several time series forecasting models are trained using different training datasets prepared using each imputation method. Subsequently, the performance of the imputation methods is evaluated by comparing the accuracy of the forecasting models. The results obtained from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with other imputation methods.
Funding: Funded by the Ministry-level Scientific and Technological Key Programs of Ministry of Natural Resources and Environment of Viet Nam, "Application of thermal infrared remote sensing and GIS for mapping underground coal fires in Quang Ninh coal basin" (Grant No. TNMT.2017.08.06).
Abstract: Underground coal fires are one of the most common and serious geohazards in most coal-producing countries in the world. Monitoring their spatio-temporal changes plays an important role in controlling and preventing the effects of coal fires and their environmental impact. In this study, the spatio-temporal changes of underground coal fires in the Khanh Hoa coal field (North-East of Viet Nam) were analyzed using Landsat time-series data from the 2008-2016 period. Based on land surface temperatures retrieved from Landsat thermal data, thermal anomalies related to underground coal fires were identified using the MEDIAN + 1.5 × IQR (IQR: interquartile range) threshold technique. The locations of underground coal fires were validated using a coal fire map produced from field survey data and cross-validated using daytime ASTER thermal infrared (TIR) imagery. Based on the fires extracted from seven Landsat thermal images, the spatio-temporal changes of underground coal fire areas were analyzed. The results showed that the thermally anomalous zones correlated with known coal fires. Cross-validation of coal fires using ASTER TIR data showed a high consistency of 79.3%. The largest coal fire area, 184.6 hectares, was detected in 2010, followed by 2014 (181.1 hectares) and 2016 (178.5 hectares). Smaller coal fire areas of 133.6 and 152.5 hectares were extracted in 2011 and 2009, respectively. Underground coal fires were mainly detected in the northern and southern parts of the coal field and tend to spread to its north-west.
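The MEDIAN + 1.5 × IQR threshold named in the abstract above can be sketched in a few lines. This is a minimal illustration over a flat list of land surface temperatures; the function names and the choice of the "inclusive" quartile method are assumptions, since the abstract does not specify how the quartiles were computed or how the rule was applied per scene.

```python
from statistics import median, quantiles


def thermal_anomaly_threshold(temps):
    """Return median + 1.5 * IQR for a list of temperature values.

    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, _, q3 = quantiles(temps, n=4, method="inclusive")
    return median(temps) + 1.5 * (q3 - q1)


def flag_anomalies(temps):
    """Mark each value that exceeds the median + 1.5 * IQR threshold."""
    threshold = thermal_anomaly_threshold(temps)
    return [t > threshold for t in temps]
```

Because the threshold is built from the median and the interquartile range rather than the mean and standard deviation, a few very hot pixels do not inflate it, which suits scenes where the anomalies themselves are the extreme values.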
Funding: This research is funded by the Open Foundation for the University Innovation Platform in the Hunan Province, Grant No. 18K103; the Hunan Provincial Natural Science Foundation of China, Grant No. 2017JJ2016; and the 2016 Science Research Project of Hunan Provincial Department of Education, Grant No. 16C0269. This research work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property, Universities of Hunan Province, Open project, Grant Nos. 20181901CRP03, 20181901CRP04, 20181901CRP05; and the National Social Science Fund Project: Research on the Impact Mechanism of China's Capital Space Flow on Regional Economic Development (Project No. 14BJL086).
Abstract: In the era of big data, the general public has easier access to big data but is reluctant to analyze it. Traditional data visualization, which requires a certain level of expertise, is therefore not readily accepted by a general public living at a fast pace. Against this background, a new general visualization method for dynamic time series data has emerged. Time series data visualization organizes abstract and hard-to-understand data into a form that is easily understood by the public. This method integrates data visualization into short videos, which is more in line with the way people get information in modern fast-paced lifestyles. The modular approach also facilitates public participation in production. This paper summarizes the dynamic visualization methods for time series data ranking, reviews the relevant literature, shows their value and existing problems, and gives corresponding suggestions and future research prospects.
Funding: The Frontier Program of the Knowledge Innovation Program of Chinese Academy of Sciences.
Abstract: Based on the 16-day composite MODIS (moderate resolution imaging spectroradiometer) NDVI (normalized difference vegetation index) time-series data for 2004, vegetation in the North Tibet Plateau was classified, and seasonal variations at pixels selected from different vegetation types were analyzed. The Savitzky-Golay filtering algorithm was applied to filter the MODIS-NDVI time-series data. The processed time-series curves reflect the real variation trend of vegetation growth. The NDVI time-series curves of coniferous forest, high-cold meadow, high-cold meadow steppe and high-cold steppe all show a single-peak pattern during vegetation growth, with the maximum occurring in August. A decision-tree classification model was established according to either NDVI time-series data or land surface temperature data. Vegetation was then classified and processed with the model based on the NDVI time-series curves. An accuracy test shows that the classification results are of high accuracy and credibility, and that the model is useful for studying climate variation and estimating vegetation production at regional and even global scales.
Funding: Supported by the Technology Innovation Program (10083633, Development on Big Data Analysis Technology and Business Service for Connected Vehicles) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).
Abstract: In this study, we developed software for vehicle big data analysis to analyze the time-series data of connected vehicles. We designed two software modules: the first derives the Pearson correlation coefficients to analyze the collected data, and the second conducts exploratory data analysis of the collected vehicle data. In particular, we analyzed the dangerous driving patterns of motorists based on the safety standards of the Korea Transportation Safety Authority. We also analyzed seasonal fuel efficiency (four seasons) and mileage of vehicles, and identified rapid acceleration, rapid deceleration, sudden stopping (harsh braking), quick starting, sudden left turn, sudden right turn and sudden U-turn driving patterns of vehicles. We implemented the density-based spatial clustering of applications with noise (DBSCAN) algorithm for trajectory analysis based on GPS (Global Positioning System) data and designed a long short-term memory algorithm and an auto-regressive integrated moving average model for time-series data analysis. In this paper, we mainly describe the development environment of the analysis software, the structure and data flow of the overall analysis platform, the configuration of the collected vehicle data, and the various algorithms used in the analysis. Finally, we present illustrative results of our analysis, such as the dangerous driving patterns that were detected.
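The Pearson correlation coefficient used by the first module in the abstract above can be sketched as follows. This is a generic implementation of the standard formula, not the authors' software; the function name `pearson_r` and the error handling are illustrative assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)
```

Applied to two vehicle signals sampled at the same timestamps, a value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates none; which signal pairs the authors actually examined is not stated in the abstract.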
Abstract: This paper discusses the mining of rules from electrical load time series data collected by the EMS (Energy Management System). The data from the EMS are too large and complex to be understood and used by power system engineers, yet useful information is hidden in the electrical load data. The authors discuss the use of fuzzy linguistic summaries as a data mining method to induce rules from the electrical load time series. Data preprocessing techniques are also discussed.