Based on 16-day composite MODIS (Moderate Resolution Imaging Spectroradiometer) NDVI (normalized difference vegetation index) time-series data from 2004, vegetation on the North Tibetan Plateau was classified, and seasonal variations at pixels selected from different vegetation types were analyzed. The Savitzky-Golay filtering algorithm was applied to smooth the MODIS-NDVI time-series data. The filtered time-series curves reflect the real trend of vegetation growth. The NDVI time-series curves of coniferous forest, high-cold meadow, high-cold meadow steppe, and high-cold steppe all show a single-peak pattern during the growing season, with the maximum occurring in August. A decision-tree classification model was established according to either NDVI time-series data or land surface temperature data. Vegetation was then classified and processed with the model based on NDVI time-series curves. An accuracy test shows that the classification results are highly accurate and credible, and that the model is useful for studying climate variation and estimating vegetation production at regional and even global scales.
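The Savitzky-Golay smoothing step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 5-point window with a quadratic fit (the standard tabulated coefficients -3, 12, 17, 12, -3 over 35), and the NDVI values are hypothetical. The abstract does not state the window length or polynomial order actually used.

```python
# 5-point quadratic Savitzky-Golay smoothing coefficients (standard tabulated values).
# The window length and polynomial order are assumptions; the abstract does not give them.
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol5(values):
    """Smooth a sequence with a 5-point Savitzky-Golay filter.

    The first two and last two samples are left unchanged because a full
    window is not available there.
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# Hypothetical 16-day NDVI composites for one pixel (23 per year).
# Index 12 carries an artificial dip, standing in for cloud contamination.
ndvi = [0.10, 0.10, 0.11, 0.11, 0.12, 0.13, 0.15, 0.18, 0.22, 0.28, 0.35, 0.42,
        0.30, 0.52, 0.55, 0.50, 0.42, 0.33, 0.25, 0.18, 0.14, 0.11, 0.10]

smooth = savgol5(ndvi)

# Locate the single seasonal peak; the day of year assumes composites start on day 1.
peak_index = max(range(len(smooth)), key=smooth.__getitem__)
peak_start_day = 1 + 16 * peak_index
```

The filter reproduces any quadratic trend exactly while damping isolated dips, which is why it is a common choice for NDVI curves. A library routine such as SciPy's `savgol_filter` would normally replace the hand-written loop.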
Clustering is used to gain an intuition of the structures in the data. Most current clustering algorithms produce a clustering structure even on data that do not possess such structure. In these cases, the algorithms force a structure onto the data instead of discovering one. To avoid false structures in the relations of data, a novel clusterability assessment method called the density-based clusterability measure is proposed in this paper. It measures the prominence of clustering structure in the data to evaluate whether a cluster analysis could produce meaningful insight into the relationships in the data. This is especially useful for time-series data, since visualizing the structure in time-series data is hard. The performance of the clusterability measure is evaluated against several synthetic data sets and time-series data sets, which illustrate that the density-based clusterability measure can successfully indicate the clustering structure of time-series data.
Underground coal fires are one of the most common and serious geohazards in most coal-producing countries in the world. Monitoring their spatio-temporal changes plays an important role in controlling and preventing the effects of coal fires and their environmental impact. In this study, the spatio-temporal changes of underground coal fires in the Khanh Hoa coal field (North-East of Viet Nam) were analyzed using Landsat time-series data for the 2008-2016 period. Based on land surface temperatures retrieved from Landsat thermal data, underground coal fires related to thermal anomalies were identified using the MEDIAN + 1.5 × IQR (IQR: interquartile range) threshold technique. The locations of underground coal fires were validated using a coal fire map produced from field survey data and cross-validated using daytime ASTER thermal infrared (TIR) imagery. Based on the fires extracted from seven Landsat thermal images, the spatio-temporal changes of underground coal fire areas were analyzed. The results showed that the thermal-anomalous zones were correlated with known coal fires. Cross-validation of coal fires using ASTER TIR data showed a high consistency of 79.3%. The largest coal fire area, 184.6 hectares, was detected in 2010, followed by 2014 (181.1 hectares) and 2016 (178.5 hectares). Smaller coal fire areas of 133.6 and 152.5 hectares were extracted in 2011 and 2009, respectively. Underground coal fires were mainly detected in the northern and southern parts of the coal field and tend to spread to the north-west.
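The MEDIAN + 1.5 × IQR threshold named in this abstract can be sketched as below. This is an illustration under assumptions: the temperatures are hypothetical, and the quartile convention (Python's default "exclusive" method) is a choice made here, since the abstract does not say how the quartiles were computed.

```python
import statistics


def thermal_anomaly_threshold(temps):
    """Return median + 1.5 * IQR for a sequence of land surface temperatures."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(temps, n=4)
    return statistics.median(temps) + 1.5 * (q3 - q1)


# Hypothetical land surface temperatures (degrees Celsius) for eight pixels.
lst = [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 40.0]

threshold = thermal_anomaly_threshold(lst)

# Pixels hotter than the threshold are flagged as candidate thermal anomalies.
anomalies = [t for t in lst if t > threshold]
```

Because the threshold is built from the median and the interquartile range, a few very hot pixels do not drag it upward the way a mean-plus-standard-deviation rule would.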
Electric kickboards have been popularized and promoted primarily because they are clean and efficient, and their penetration rate is increasing. They are gradually growing in popularity in tourist and education-centric localities. With the upcoming arrival of electric kickboard vehicles, deploying a customer rental service is essential. Owing to its free-floating nature, the shared electric kickboard is a common and practical means of transportation. Relocation plans for shared electric kickboards are required to increase the quality of service, and forecasting demand for their use in a specific region is crucial. Predicting demand accurately with small data is difficult, because extensive data is necessary for training machine learning algorithms for effective prediction. Data generation is a method for expanding the amount of data available for training. In this work, we propose a model that takes time-series customer electric kickboard demand data as input, pre-processes it, and generates synthetic data according to the original data distribution using generative adversarial networks (GAN). The electric kickboard mobility demand prediction error was reduced when we combined synthetic data with the original data. We propose Tabular-GAN-Modified-WGAN-GP for generating synthetic data for better prediction results. We modified the Wasserstein GAN with gradient penalty (GP) to use the RMSprop optimizer and then employed spectral normalization (SN) to improve training stability and speed up convergence. Finally, we applied a regression-based blending ensemble technique to improve the performance of demand prediction. We used various evaluation criteria and visual representations to compare our proposed model's performance. The synthetic data generated by our proposed GAN model is also evaluated. The TGAN-Modified-WGAN-GP model mitigates the overfitting and mode collapse problems, and it also converges faster than previous GAN models for synthetic data creation. The presented model's performance is compared with existing ensemble and baseline models. The experimental findings imply that combining synthetic and actual data can significantly reduce prediction error rates (mean absolute percentage error, MAPE, of 4.476) and increase prediction accuracy.
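The mean absolute percentage error (MAPE) reported in this abstract is a standard metric, and a minimal sketch of its computation is given below. The demand figures are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical hourly rental demand and a model's forecasts for the same hours.
actual_demand = [100, 200, 400]
predicted_demand = [110, 190, 400]

error_percent = mape(actual_demand, predicted_demand)
```

For these values the individual errors are 10%, 5%, and 0%, so the MAPE is about 5%. MAPE is scale-free, which makes it convenient for comparing models across regions with different demand levels, but it is undefined when the actual demand is zero.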
In this study, we developed software for vehicle big data analysis to analyze the time-series data of connected vehicles. We designed two software modules: the first derives Pearson correlation coefficients to analyze the collected data, and the second conducts exploratory data analysis of the collected vehicle data. In particular, we analyzed the dangerous driving patterns of motorists based on the safety standards of the Korea Transportation Safety Authority. We also analyzed the seasonal fuel efficiency (four seasons) and mileage of vehicles, and identified rapid acceleration, rapid deceleration, sudden stopping (harsh braking), quick starting, sudden left turn, sudden right turn, and sudden U-turn driving patterns. We implemented the density-based spatial clustering of applications with noise (DBSCAN) algorithm for trajectory analysis based on GPS (Global Positioning System) data, and designed a long short-term memory algorithm and an auto-regressive integrated moving average model for time-series data analysis. In this paper, we mainly describe the development environment of the analysis software, the structure and data flow of the overall analysis platform, the configuration of the collected vehicle data, and the various algorithms used in the analysis. Finally, we present illustrative results of our analysis, such as the dangerous driving patterns that were detected.
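The Pearson correlation coefficient used by the first module can be sketched as follows. The two signals are hypothetical stand-ins for collected vehicle data; the abstract does not list the actual variables that were correlated.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples: vehicle speed (km/h) and fuel consumption rate (L/h).
speed_kmh = [20.0, 40.0, 60.0, 80.0]
fuel_rate = [1.0, 2.0, 3.0, 4.0]

r = pearson(speed_kmh, fuel_rate)
```

The two hypothetical series are perfectly proportional, so `r` comes out at 1 (up to floating-point rounding). Real vehicle signals would give values strictly between -1 and 1.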
Accurate mapping and timely monitoring of urban redevelopment are pivotal for urban studies and decision-makers to foster sustainable urban development. Traditional mapping methods heavily depend on field surveys and subjective questionnaires, yielding less objective, reliable, and timely data. Recent advancements in Geographic Information Systems (GIS) and remote-sensing technologies have improved the identification and mapping of urban redevelopment through quantitative analysis using satellite-based observations. Nonetheless, challenges persist, particularly concerning accuracy and significant temporal delays. This study introduces a novel approach to modeling urban redevelopment, leveraging machine learning algorithms and remote-sensing data. This methodology can facilitate the accurate and timely identification of urban redevelopment activities. The study's machine learning model can analyze time-series remote-sensing data to identify spatio-temporal and spectral patterns related to urban redevelopment. The model is thoroughly evaluated, and the results indicate that it can accurately capture the time-series patterns of urban redevelopment. This research's findings are useful for evaluating urban demographic and economic changes, informing policymaking and urban planning, and contributing to sustainable urban development. The model can also serve as a foundation for future research on early-stage urban redevelopment detection and evaluation of the causes and impacts of urban redevelopment.
A simple data assimilation method is proposed for improving the estimation of Moderate Resolution Imaging Spectroradiometer (MODIS) leaf area index (LAI) time-series data products, based on the gradient inverse weighted filter and object analysis. The properties and quality control (QC) of MODIS LAI data products are introduced, and the gradient inverse weighted filter and object analysis are analyzed. An experiment based on the simple data assimilation method is performed using MODIS LAI data sets from 2000 to 2005 for Guizhou Province in China.
A tremendous amount of data is generated by global financial markets every day, and such time-series data needs to be analyzed in real time to explore its potential value. In recent years, we have witnessed the successful adoption of machine learning models on financial data, where the importance of accuracy and timeliness demands highly effective computing frameworks. However, traditional financial time-series data processing frameworks have shown performance degradation and adaptation issues, such as outlier handling for stock suspensions in Pandas and TA-Lib. In this paper, we propose HXPY, a high-performance data processing package with a C++/Python interface for financial time-series data. HXPY supports miscellaneous acceleration techniques such as streaming algorithms, vectorization instruction sets, and memory optimization, together with various functions such as time window functions, group operations, down-sampling operations, cross-section operations, row-wise or column-wise operations, shape transformations, and alignment functions. The results of benchmark and incremental analysis demonstrate the superior performance of HXPY compared with its counterparts. On data ranging from MiBs to GiBs, HXPY significantly outperforms other in-memory dataframe computing rivals, in some cases by up to hundreds of times.
By employing the unique phenological feature of winter wheat extracted from the peak before winter (PBW) and the advantages of Moderate Resolution Imaging Spectroradiometer (MODIS) data with high temporal resolution and intermediate spatial resolution, a remote sensing-based model for mapping winter wheat on the North China Plain was built through integration with Landsat images and land-use data. First, a phenological window, the PBW, was extracted from time-series MODIS data. Next, feature extraction was performed on the PBW to reduce the feature dimension and enhance its information. Finally, a regression model was built to model the relationship between the phenological feature and the sample data. The amount of information in the PBW was evaluated and compared with that of the main peak (MP). The relative precision of the mapping reached up to 92% in comparison with the Landsat sample data, and ranged between 87% and 96% in comparison with the statistical data. These results were sufficient to satisfy the accuracy requirements for winter wheat mapping at a large scale. Moreover, the proposed method can obtain the distribution information for winter wheat earlier in the season than previous studies. This study could shed light on the monitoring of winter wheat in China by using the unique phenological feature of winter wheat.
Fault diagnosis is important for maintaining the safety and effectiveness of chemical processes. Considering the multivariate, nonlinear, and dynamic characteristics of chemical processes, many time-series-based data-driven fault diagnosis methods have been developed in recent years. However, the existing methods suffer from the long-term dependency problem and are difficult to train because of their sequential way of training. To overcome these problems, a novel fault diagnosis method based on time series and hierarchical multihead self-attention (HMSAN) is proposed for chemical processes. First, a sliding window strategy is adopted to construct the normalized time-series dataset. Second, the HMSAN is developed to extract the time-relevant features from the time-series process data. It improves the basic self-attention model in both width and depth. With the multihead structure, the HMSAN can pay attention to different aspects of the complicated chemical process and obtain the global dynamic features. However, the multiple heads in parallel lead to redundant information, which cannot improve the diagnosis performance. With the hierarchical structure, the redundant information is reduced and the deep local time-related features are further extracted. In addition, a novel many-to-one training strategy is introduced for the HMSAN to simplify the training procedure and capture the long-term dependency. Finally, the effectiveness of the proposed method is demonstrated on two chemical cases. The experimental results show that the proposed method achieves strong performance on time-series industrial data and outperforms the state-of-the-art approaches.
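The sliding window strategy in this abstract, combined with a many-to-one labeling, can be sketched as below. This is a hedged illustration only: the function name `sliding_windows`, the window width, and the choice to label each window with the class of its final time step are assumptions made here, and the sample values are hypothetical and assumed to be already normalized.

```python
def sliding_windows(rows, labels, width, step=1):
    """Cut a multivariate time series into overlapping windows.

    Each window of `width` consecutive rows is paired with the label of its
    last time step (an assumed many-to-one convention).
    """
    samples = []
    for start in range(0, len(rows) - width + 1, step):
        window = rows[start:start + width]
        target = labels[start + width - 1]
        samples.append((window, target))
    return samples


# Hypothetical normalized readings of two process variables over five time steps.
rows = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.6], [0.9, 0.1], [0.8, 0.2]]
labels = ["normal", "normal", "normal", "fault", "fault"]

dataset = sliding_windows(rows, labels, width=3)
```

With five time steps and a width of three, this yields three samples. Each sample keeps the full window as input and a single label as output, so a model sees the recent history when classifying the current state.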
The frequent missing values in radar-derived time-series tracks of aerial targets (RTT-AT) lead to significant challenges in subsequent data-driven tasks. However, the majority of imputation research focuses on random missing (RM) patterns, which differ significantly from the common missing patterns of RTT-AT. Methods designed for RM may experience performance degradation or failure when applied to RTT-AT imputation. Conventional autoregressive deep learning methods are prone to error accumulation and long-term dependency loss. In this paper, a non-autoregressive imputation model that addresses missing value imputation for two common missing patterns in RTT-AT is proposed. Our model consists of two probabilistic sparse diagonal masking self-attention (PSDMSA) units and a weight fusion unit. It learns missing values by combining the representations output by the two units, aiming to minimize the difference between the imputed values and their actual values. The PSDMSA units effectively capture temporal dependencies and attribute correlations between time steps, improving imputation quality. The weight fusion unit automatically updates the weights of the output representations from the two units to obtain a more accurate final representation. The experimental results indicate that, despite varying missing rates in the two missing patterns, our model consistently outperforms other methods in imputation performance and exhibits a low frequency of deviations in estimates for specific missing entries. Compared with the state-of-the-art autoregressive deep learning imputation model Bidirectional Recurrent Imputation for Time Series (BRITS), our proposed model reduces the mean absolute error (MAE) by 31% to 50%. Additionally, the model attains a training speed that is 4 to 8 times faster than both BRITS and a standard Transformer model when trained on the same dataset. Finally, the findings from the ablation experiments demonstrate that the PSDMSA, the weight fusion unit, the cascade network design, and the imputation loss enhance imputation performance and confirm the efficacy of our design.
Accurate information about phenological stages is essential for canola field management practices such as irrigation, fertilization, and harvesting. Previous studies in canola phenology monitoring focused mainly on the flowering stage, using its apparent structural features and colors. Additional phenological stages have been largely overlooked. The objective of this study was to improve a shape-model method (SMM) for extracting winter canola phenological stages from time-series top-of-canopy reflectance images collected by an unmanned aerial vehicle (UAV). The transformation equation of the SMM was refined to account for the multi-peak features of the temporal dynamics of three vegetation indices (VIs): NDVI, EVI, and CI. An experiment with various seeding scenarios was conducted, including four different seeding dates and three seeding densities. Three mathematical functions, the asymmetric Gaussian function (AGF), the Fourier function, and the double logistic function, were employed to fit the time-series vegetation indices and extract information about phenological stages. The refined SMM effectively estimated the phenological stages of canola, with a minimum root mean square error (RMSE) of 3.7 days for all phenological stages. The AGF provided the best fitting performance, as it captured multiple peaks in the growth dynamics for all seeding date scenarios using four scaling parameters. Among the three selected VIs, CIred-edge achieved the greatest accuracy in estimating the phenological stage dates. This study demonstrates the high potential of the refined SMM for estimating winter canola phenology.
Avoiding, reducing, and reversing land degradation and restoring degraded land is an urgent priority to protect the biodiversity and ecosystem services that are vital to life on Earth. To halt and reverse the current trends in land degradation, there is an immediate need to enhance national capacities to undertake quantitative assessments and mapping of degraded lands, as required by the Sustainable Development Goals (SDGs), in particular SDG indicator 15.3.1 ("proportion of land that is degraded over total land area"). Earth Observations (EO) can play an important role both in generating this indicator and in complementing or enhancing national official data sources. Implementations such as Trends.Earth, which monitor land degradation in accordance with SDG 15.3.1, rely on default datasets of coarse spatial resolution provided by MODIS or AVHRR. Consequently, there is a need to develop methodologies that benefit from medium- to high-resolution satellite EO data (e.g. Landsat or Sentinels). In response to this issue, this paper presents an initial overview of an innovative approach to monitoring land degradation at the national scale in compliance with the SDG 15.3.1 indicator, using Landsat observations in a data cube. Further work is required to improve the calculation of the three sub-indicators.
Earth observation data are typically compressed using general-purpose single-threaded compression algorithms that operate at a fraction of the bandwidth of modern storage and processing systems. We present evidence that recently developed multi-threaded compression codecs offer substantial benefits over widely used single-threaded codecs in terms of compression efficiency when applied to a selection of Moderate Resolution Imaging Spectroradiometer (MODIS) datasets stored in the HDF5 format. Compression codecs from the LZ77 and Rice families are shown to vary in efficacy when applied to different MODIS data products, highlighting the need for compression strategies to be tailored to different classes of data. We also introduce LPC-Rice, a new multi-threaded codec that performs particularly well when applied to time-series data.
Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters, based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
The goal of this study was to map rainfed and irrigated rice-fallow cropland areas across South Asia using MODIS 250 m time-series data, and to identify where the farming system may be intensified by the inclusion of a short-season crop during the fallow period. Rice-fallow cropland areas are those areas where rice is grown during the kharif growing season (June–October), followed by a fallow during the rabi season (November–February). These cropland areas are not suitable for growing rabi-season rice because of its high water needs, but are suitable for short-season (≤3 months), low water-consuming grain legumes such as chickpea (Cicer arietinum L.), black gram, green gram, and lentils. Intensification (double-cropping) in this manner can improve smallholder farmers' incomes and soil health via nitrogen-fixing legume crops, and can address the food security challenges of growing populations without having to expand croplands. Several grain legumes, primarily chickpea, are increasingly grown across Asia as a source of income for smallholder farmers, while at the same time providing a rich and cheap source of protein that can improve the nutritional quality of diets in the region. The suitability of rainfed and irrigated rice-fallow croplands for grain legume cultivation across South Asia was defined by these criteria: (a) a rice crop is grown during the primary (kharif) crop growing season or during the north-west monsoon season (June–October); (b) the same croplands are left fallow during the second (rabi) season or during the south-east monsoon season (November–February); and (c) the croplands are able to support low water-consuming, short-growing-season (≤3 months) grain legumes (chickpea, black gram, green gram, and lentils) during the rabi season. Existing irrigated or rainfed crops such as rice or wheat that were grown during kharif were not considered suitable for growing during the rabi season, because the moisture/water demand of these crops is too high. The study established cropland classes based on the 16-day, 250 m normalized difference vegetation index (NDVI) time series for one year (June 2010–May 2011) of Moderate Resolution Imaging Spectroradiometer (MODIS) data, using spectral matching techniques (SMTs) and extensive field knowledge. Map accuracy was evaluated against independent ground survey data and compared with available sub-national statistics. The producer's and user's accuracies of the cropland fallow classes were between 75% and 82%. The overall accuracy and the kappa coefficient estimated for rice classes were 82% and 0.79, respectively. The analysis estimated approximately 22.3 Mha of suitable rice-fallow areas in South Asia, with 88.3% in India, 0.5% in Pakistan, 1.1% in Sri Lanka, 8.7% in Bangladesh, 1.4% in Nepal, and 0.02% in Bhutan. Decision-makers can target these areas for sustainable intensification with short-duration grain legumes.
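The accuracy measures quoted in this abstract (producer's accuracy, user's accuracy, overall accuracy, and the kappa coefficient) are all derived from a confusion matrix. A minimal sketch is given below, assuming rows hold the reference (ground survey) classes and columns hold the mapped classes; the counts are hypothetical and unrelated to the study's data.

```python
def accuracy_metrics(matrix):
    """Compute overall accuracy, Cohen's kappa, and per-class accuracies.

    Assumes matrix[i][j] counts samples of reference class i mapped as class j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    producers = [d / r for d, r in zip(diagonal, row_totals)]  # omission side
    users = [d / c for d, c in zip(diagonal, col_totals)]      # commission side
    return overall, kappa, producers, users


# Hypothetical two-class confusion matrix (rice-fallow vs. other cropland).
confusion = [[40, 10],
             [5, 45]]

overall, kappa, producers, users = accuracy_metrics(confusion)
```

Kappa is lower than overall accuracy because it discounts the agreement that would occur by chance given the class proportions.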
There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need to create a framework for guidance on how to better utilize synthetic data as a practical application in this research.
Objective: To investigate the association between temperature and daily mortality in Shanghai from June 1, 2000 to December 31, 2001. Methods: A time-series approach was used to estimate the effect of temperature on daily total and cause-specific mortality. We fitted generalized additive Poisson regression using non-parametric smooth functions to control for long-term time trend, season, and other variables. We also controlled for day of the week. Results: A gently sloping V-like relationship between total mortality and temperature was found, with an optimum temperature (i.e., the temperature with the lowest mortality risk) of 26.7°C in Shanghai. For temperatures above the optimum value, total mortality increased by 0.73% for each degree Celsius increase, while for temperatures below the optimum value, total mortality decreased by 1.21% for each degree Celsius increase. Conclusions: Our findings indicate that temperature has an effect on daily mortality in Shanghai, and that the time-series approach is a useful tool for studying the temperature-mortality association.
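The V-like relationship reported in this abstract can be turned into a small worked example. The sketch below uses the three figures from the abstract (optimum 26.7°C, +0.73% per degree above it, -1.21% per degree of warming below it). Compounding the percentage change per degree is an assumption made here for illustration, not the fitted model from the paper.

```python
OPTIMUM_C = 26.7          # temperature with the lowest mortality risk (from the abstract)
RISE_ABOVE = 0.0073       # +0.73% mortality per degree C increase above the optimum
DROP_BELOW = 0.0121       # -1.21% mortality per degree C increase below the optimum


def relative_mortality(temp_c):
    """Mortality relative to the optimum temperature (1.0 at the optimum).

    Assumes the per-degree percentage change compounds multiplicatively.
    """
    if temp_c >= OPTIMUM_C:
        return (1 + RISE_ABOVE) ** (temp_c - OPTIMUM_C)
    # Below the optimum, each degree of warming lowers mortality by 1.21%,
    # so each degree of cooling raises it by the inverse factor.
    return (1 - DROP_BELOW) ** -(OPTIMUM_C - temp_c)


# Relative mortality on a hot day and on a cold day (illustrative temperatures).
hot_day = relative_mortality(31.7)
cold_day = relative_mortality(6.7)
```

Under this assumption, both sides of the V rise away from the optimum, with the cold side steeper per degree than the hot side.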
In the past 30 years, the small baseline subset (SBAS) InSAR time-series technique has emerged as an essential tool for measuring slow surface displacement and estimating geophysical parameters. Because of its ability to monitor large-scale deformation with millimeter accuracy, the SBAS method has been widely used in various geodetic fields, such as ground subsidence, landslides, and seismic activity. The long-term cumulative deformation time series it produces is vital for studying the deformation mechanism. This article reviews the algorithms, applications, and challenges of the SBAS method. First, we recall the fundamental principle and analyze the shortcomings of the traditional SBAS algorithm, which provides a basic framework for the improved time-series methods that followed. Second, we classify the current improved SBAS techniques from different perspectives: solving the ill-posed equation, increasing the density of high-coherence points, improving the accuracy of deformation monitoring, and measuring multi-dimensional deformation. Third, we summarize the application of the SBAS method in monitoring ground subsidence, permafrost degradation, glacier movement, volcanic activity, landslides, and seismic activity. Finally, we discuss the difficulties faced by the SBAS method and explore its future development direction.
Funding: The Frontier Program of the Knowledge Innovation Program of the Chinese Academy of Sciences
Funding: Funded by the Ministry-level Scientific and Technological Key Programs of the Ministry of Natural Resources and Environment of Viet Nam, "Application of thermal infrared remote sensing and GIS for mapping underground coal fires in Quang Ninh coal basin" (Grant No. TNMT.2017.08.06).
Abstract: Underground coal fires are one of the most common and serious geohazards in most coal-producing countries. Monitoring their spatio-temporal changes plays an important role in controlling and preventing the effects of coal fires and their environmental impact. In this study, the spatio-temporal changes of underground coal fires in the Khanh Hoa coal field (North-East Viet Nam) were analyzed using Landsat time-series data for the 2008-2016 period. Based on land surface temperatures retrieved from Landsat thermal data, underground coal fires related to thermal anomalies were identified using the MEDIAN + 1.5 × IQR (IQR: interquartile range) threshold technique. The locations of underground coal fires were validated using a coal fire map produced from field survey data and cross-validated using daytime ASTER thermal infrared (TIR) imagery. Based on the fires extracted from seven Landsat thermal images, the spatio-temporal changes of underground coal fire areas were analyzed. The results showed that the thermally anomalous zones correlate with known coal fires. Cross-validation of coal fires using ASTER TIR data showed a high consistency of 79.3%. The largest coal fire area, 184.6 hectares, was detected in 2010, followed by 2014 (181.1 hectares) and 2016 (178.5 hectares). Smaller coal fire areas of 133.6 and 152.5 hectares were extracted in 2011 and 2009, respectively. Underground coal fires were mainly detected in the northern and southern parts of the coal field and tend to spread to its north-west.
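The MEDIAN + 1.5 × IQR threshold named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and temperature values are hypothetical, and the quartile convention (Python's default "exclusive" method) is an assumption, since the abstract does not say how quartiles were computed.

```python
import statistics


def anomaly_threshold(lst_values):
    """Return median + 1.5 * IQR for a list of land surface temperatures."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(lst_values, n=4)
    return statistics.median(lst_values) + 1.5 * (q3 - q1)


def flag_thermal_anomalies(lst_values):
    """Mark each pixel whose temperature exceeds the scene-wide threshold."""
    threshold = anomaly_threshold(lst_values)
    return [value > threshold for value in lst_values]


# Hypothetical land surface temperatures (degrees Celsius) for ten pixels.
pixels = [20, 21, 22, 23, 24, 25, 26, 27, 28, 45]
flags = flag_thermal_anomalies(pixels)
```

Because the threshold is derived from each scene's own median and spread, it adapts to seasonal differences between acquisition dates, which matters when comparing images across several years.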
Funding: This work was supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0016977, The Establishment Project of Industry-University Fusion District).
Abstract: Electric kickboards have been popularized and promoted primarily because they are clean and efficient, and they are steadily growing in popularity in tourist and education-centric localities. As electric kickboards become more widespread, deploying a customer rental service is essential. Because of its free-floating nature, the shared electric kickboard is a common and practical means of transportation. Relocation plans for shared electric kickboards are required to increase the quality of service, and forecasting demand for their use in a specific region is crucial. Predicting demand accurately with small data is difficult, since extensive data is necessary to train machine learning algorithms for effective prediction. Data generation is a method for expanding the amount of data available for training. In this work, we propose a model that takes time-series customer demand data for electric kickboards as input, pre-processes it, and generates synthetic data following the original data distribution using generative adversarial networks (GAN). The demand prediction error was reduced when we combined synthetic data with the original data. We propose Tabular-GAN-Modified-WGAN-GP for generating synthetic data for better prediction results. We modified the Wasserstein GAN with gradient penalty (WGAN-GP) to use the RMSprop optimizer and then employed Spectral Normalization (SN) to improve training stability and speed up convergence. Finally, we applied a regression-based blending ensemble technique to improve demand prediction performance. We used various evaluation criteria and visual representations to compare our proposed model's performance, and the synthetic data generated by the proposed GAN model is also evaluated. The TGAN-Modified-WGAN-GP model mitigates the overfitting and mode collapse problems, and it also converges faster than previous GAN models for synthetic data creation. The presented model's performance is compared to existing ensemble and baseline models. The experimental findings imply that combining synthetic and actual data can significantly reduce prediction error, with a mean absolute percentage error (MAPE) of 4.476, and increase prediction accuracy.
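The abstract reports its result as a mean absolute percentage error (MAPE). For readers unfamiliar with the metric, a minimal sketch of the standard definition follows. The function name and the demand figures are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by
    the magnitude of the corresponding actual value.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical hourly rental counts and model forecasts.
observed = [100, 200, 150, 80]
forecast = [110, 180, 150, 76]
error_pct = mape(observed, forecast)
```

MAPE is scale-free, which makes it convenient for comparing models across regions with different demand levels, but it becomes unstable when actual demand is near zero.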
Funding: Supported by the Technology Innovation Program (10083633, Development on Big Data Analysis Technology and Business Service for Connected Vehicles) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).
Abstract: In this study, we developed software for vehicle big data analysis to analyze the time-series data of connected vehicles. We designed two software modules: the first derives Pearson correlation coefficients to analyze the collected data, and the second conducts exploratory data analysis of the collected vehicle data. In particular, we analyzed the dangerous driving patterns of motorists based on the safety standards of the Korea Transportation Safety Authority. We also analyzed seasonal fuel efficiency (four seasons) and vehicle mileage, and identified rapid acceleration, rapid deceleration, sudden stopping (harsh braking), quick starting, sudden left turn, sudden right turn, and sudden U-turn driving patterns. We implemented the density-based spatial clustering of applications with noise (DBSCAN) algorithm for trajectory analysis based on GPS (Global Positioning System) data, and designed a long short-term memory algorithm and an auto-regressive integrated moving average model for time-series data analysis. In this paper, we mainly describe the development environment of the analysis software, the structure and data flow of the overall analysis platform, the configuration of the collected vehicle data, and the various algorithms used in the analysis. Finally, we present illustrative results of our analysis, such as the dangerous driving patterns that were detected.
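The first module described in this abstract derives Pearson correlation coefficients. The sketch below shows the standard formula in plain Python. It is an illustration only: the function name and the speed and fuel-consumption readings are hypothetical, and nothing here reflects the authors' actual software.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations, and the sums of squared deviations of each series.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical vehicle speed (km/h) and fuel consumption (L/100 km) samples.
speed = [40, 60, 80, 100, 120]
fuel = [5.1, 5.6, 6.4, 7.5, 9.0]
r = pearson(speed, fuel)
```

A coefficient near +1 or -1 indicates a strong linear relationship between two logged signals, while a value near 0 says only that no linear relationship is visible.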
Abstract: Accurate mapping and timely monitoring of urban redevelopment are pivotal for urban studies and for decision-makers seeking to foster sustainable urban development. Traditional mapping methods depend heavily on field surveys and subjective questionnaires, yielding less objective, reliable, and timely data. Recent advancements in Geographic Information Systems (GIS) and remote-sensing technologies have improved the identification and mapping of urban redevelopment through quantitative analysis using satellite-based observations. Nonetheless, challenges persist, particularly concerning accuracy and significant temporal delays. This study introduces a novel approach to modeling urban redevelopment, leveraging machine learning algorithms and remote-sensing data. This methodology can facilitate the accurate and timely identification of urban redevelopment activities. The study's machine learning model can analyze time-series remote-sensing data to identify spatio-temporal and spectral patterns related to urban redevelopment. The model is thoroughly evaluated, and the results indicate that it can accurately capture the time-series patterns of urban redevelopment. This research's findings are useful for evaluating urban demographic and economic changes, informing policymaking and urban planning, and contributing to sustainable urban development. The model can also serve as a foundation for future research on early-stage urban redevelopment detection and on evaluating the causes and impacts of urban redevelopment.
Funding: This work was supported by the China Postdoctoral Science Foundation (No. 20060390326) and the key international S&T cooperation project of China (No. 2004DFA06300).
Abstract: A simple data assimilation method is proposed for improving the estimation of Moderate Resolution Imaging Spectroradiometer (MODIS) leaf area index (LAI) time-series data products, based on the gradient inverse weighted filter and object analysis. The properties and quality control (QC) of MODIS LAI data products are introduced, and the gradient inverse weighted filter and object analysis are analyzed. An experiment based on the simple data assimilation method is performed using MODIS LAI data sets from 2000 to 2005 for Guizhou Province, China.
Abstract: A tremendous amount of data is generated by global financial markets every day, and such time-series data needs to be analyzed in real time to explore its potential value. In recent years, we have witnessed the successful adoption of machine learning models on financial data, where the importance of accuracy and timeliness demands highly effective computing frameworks. However, traditional financial time-series data processing frameworks have shown performance degradation and adaptation issues, such as the handling of outliers caused by stock suspension in Pandas and TA-Lib. In this paper, we propose HXPY, a high-performance data processing package with a C++/Python interface for financial time-series data. HXPY supports miscellaneous acceleration techniques such as streaming algorithms, vectorized instruction sets, and memory optimization, together with various functions such as time window functions, group operations, down-sampling operations, cross-section operations, row-wise or column-wise operations, shape transformations, and alignment functions. The results of benchmark and incremental analysis demonstrate the superior performance of HXPY compared with its counterparts. On data from MiBs to GiBs, HXPY significantly outperforms other in-memory dataframe computing rivals, in some cases by up to hundreds of times.
Funding: Supported by the open research fund of the Key Laboratory of Agri-informatics, Ministry of Agriculture, and the fund of Outstanding Agricultural Researcher, Ministry of Agriculture, China.
Abstract: By employing the unique phenological feature of winter wheat extracted from the peak before winter (PBW) and the advantages of Moderate Resolution Imaging Spectroradiometer (MODIS) data, with its high temporal resolution and intermediate spatial resolution, a remote sensing-based model for mapping winter wheat on the North China Plain was built through integration with Landsat images and land-use data. First, a phenological window, the PBW, was drawn from time-series MODIS data. Next, feature extraction was performed on the PBW to reduce feature dimension and enhance its information. Finally, a regression model was built to model the relationship between the phenological feature and the sample data. The amount of information in the PBW was evaluated and compared with that of the main peak (MP). The relative precision of the mapping reached up to 92% in comparison with the Landsat sample data, and ranged between 87% and 96% in comparison with the statistical data. These results were sufficient to satisfy the accuracy requirements for winter wheat mapping at a large scale. Moreover, the proposed method can obtain the distribution of winter wheat earlier in the season than previous studies. This study could shed light on the monitoring of winter wheat in China using its unique phenological feature.
Funding: Supported by the National Natural Science Foundation of China (62073140, 62073141) and the Shanghai Rising-Star Program (21QA1401800).
Abstract: Fault diagnosis is important for maintaining the safety and effectiveness of chemical processes. Considering the multivariate, nonlinear, and dynamic characteristics of chemical processes, many time-series-based data-driven fault diagnosis methods have been developed in recent years. However, the existing methods suffer from the long-term dependency problem and are difficult to train because of their sequential way of training. To overcome these problems, a novel fault diagnosis method based on time series and hierarchical multihead self-attention (HMSAN) is proposed for chemical processes. First, a sliding window strategy is adopted to construct the normalized time-series dataset. Second, the HMSAN is developed to extract time-relevant features from the time-series process data. It improves the basic self-attention model in both width and depth. With the multihead structure, the HMSAN can attend to different aspects of the complicated chemical process and obtain the global dynamic features. However, the multiple heads in parallel lead to redundant information, which cannot improve the diagnosis performance. With the hierarchical structure, the redundant information is reduced and the deep local time-related features are further extracted. In addition, a novel many-to-one training strategy is introduced for the HMSAN to simplify the training procedure and capture the long-term dependency. Finally, the effectiveness of the proposed method is demonstrated on two chemical cases. The experimental results show that the proposed method performs well on time-series industrial data and outperforms the state-of-the-art approaches.
Funding: Supported by the Graduate Funded Project (No. JY2022A017).
Abstract: The frequent missing values in radar-derived time-series tracks of aerial targets (RTT-AT) lead to significant challenges in subsequent data-driven tasks. However, the majority of imputation research focuses on random missing (RM), which differs significantly from the common missing patterns of RTT-AT. Methods designed for RM may suffer performance degradation or failure when applied to RTT-AT imputation. Conventional autoregressive deep learning methods are prone to error accumulation and loss of long-term dependency. In this paper, a non-autoregressive imputation model is proposed that addresses missing value imputation for two common missing patterns in RTT-AT. Our model consists of two probabilistic sparse diagonal masking self-attention (PSDMSA) units and a weight fusion unit. It learns missing values by combining the representations output by the two units, aiming to minimize the difference between the imputed values and their actual values. The PSDMSA units effectively capture temporal dependencies and attribute correlations between time steps, improving imputation quality. The weight fusion unit automatically updates the weights of the output representations from the two units to obtain a more accurate final representation. The experimental results indicate that, despite varying missing rates in the two missing patterns, our model consistently outperforms other methods in imputation performance and exhibits a low frequency of deviations in estimates for specific missing entries. Compared with the state-of-the-art autoregressive deep learning imputation model Bidirectional Recurrent Imputation for Time Series (BRITS), our proposed model reduces mean absolute error (MAE) by 31% to 50%. Additionally, the model trains 4 to 8 times faster than both BRITS and a standard Transformer model on the same dataset. Finally, the ablation experiments demonstrate that the PSDMSA, the weight fusion unit, the cascade network design, and the imputation loss all enhance imputation performance and confirm the efficacy of our design.
Funding: Supported by the National Natural Science Foundation of China (51909228), the Postdoctoral Science Foundation of China (2020M671623), and the "Blue Project" of Yangzhou University.
Abstract: Accurate information about phenological stages is essential for canola field management practices such as irrigation, fertilization, and harvesting. Previous studies of canola phenology monitoring focused mainly on the flowering stage, using its distinctive structural features and colors. Other phenological stages have been largely overlooked. The objective of this study was to improve a shape-model method (SMM) for extracting winter canola phenological stages from time-series top-of-canopy reflectance images collected by an unmanned aerial vehicle (UAV). The transformation equation of the SMM was refined to account for the multi-peak features of the temporal dynamics of three vegetation indices (VIs) (NDVI, EVI, and CI). An experiment with various seeding scenarios was conducted, including four different seeding dates and three seeding densities. Three mathematical functions, the asymmetric Gaussian function (AGF), the Fourier function, and the double logistic function, were employed to fit the time-series vegetation indices and extract information about phenological stages. The refined SMM effectively estimated the phenological stages of canola, with a minimum root mean square error (RMSE) of 3.7 days for all phenological stages. The AGF provided the best fitting performance, as it captured multiple peaks in the growth dynamics for all seeding date scenarios using four scaling parameters. Of the three selected VIs, CIred-edge achieved the greatest accuracy in estimating the phenological stage dates. This study demonstrates the high potential of the refined SMM for estimating winter canola phenology.
Funding: This research was funded by the European Commission "Horizon 2020 Program" ERA-PLANET/GEOEssential project, grant number 689443.
Abstract: Avoiding, reducing, and reversing land degradation and restoring degraded land is an urgent priority to protect the biodiversity and ecosystem services that are vital to life on Earth. To halt and reverse the current trends in land degradation, there is an immediate need to enhance national capacities to undertake quantitative assessments and mapping of degraded lands, as required by the Sustainable Development Goals (SDGs), in particular SDG indicator 15.3.1 ("proportion of land that is degraded over total land area"). Earth Observations (EO) can play an important role both in generating this indicator and in complementing or enhancing national official data sources. Implementations such as Trends.Earth, which monitor land degradation in accordance with SDG 15.3.1, rely on default datasets of coarse spatial resolution provided by MODIS or AVHRR. Consequently, there is a need to develop methodologies that benefit from medium- to high-resolution satellite EO data (e.g., Landsat or Sentinels). In response to this issue, this paper presents an initial overview of an innovative approach to monitoring land degradation at the national scale, in compliance with the SDG 15.3.1 indicator, using Landsat observations in a data cube. Further work is required to improve the calculation of the three sub-indicators.
Abstract: Earth observation data are typically compressed using general-purpose single-threaded compression algorithms that operate at a fraction of the bandwidth of modern storage and processing systems. We present evidence that recently developed multi-threaded compression codecs offer substantial benefits over widely used single-threaded codecs in terms of compression efficiency when applied to a selection of Moderate Resolution Imaging Spectroradiometer (MODIS) datasets stored in the HDF5 format. Compression codecs from the LZ77 and Rice families are shown to vary in efficacy when applied to different MODIS data products, highlighting the need for compression strategies to be tailored to different classes of data. We also introduce LPC-Rice, a new multi-threaded codec that performs particularly well when applied to time-series data.
Funding: Supported in part by NIH grants R01NS39600, U01MH114829, and RF1MH128693 (to GAA).
Abstract: Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters, based on the fundamental principle that cells must differ more between clusters than within them. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
Funding: Supported by two CGIAR Research Programs: Dryland Cereals, Grain Legumes and WLE. The research was also supported by the global food security support analysis data at 30 m project (GFSAD30; http://geography.wr.usgs.gov/science/croplands/; https://croplands.org/), funded by NASA MEaSUREs (Making Earth System Data Records for Use in Research Environments) [grant number: NNH13AV82I] with funding obtained through the NASA ROSES solicitation, as well as by the Land Change Science (LCS), Land Remote Sensing (LRS), and Climate Land Use Change Mission Area Programs of the U.S. Geological Survey (USGS).
Abstract: The goal of this study was to map rainfed and irrigated rice-fallow cropland areas across South Asia using MODIS 250 m time-series data, and to identify where the farming system may be intensified by including a short-season crop during the fallow period. Rice-fallow cropland areas are those where rice is grown during the kharif growing season (June–October), followed by a fallow during the rabi season (November–February). These cropland areas are not suitable for growing rabi-season rice because of its high water needs, but are suitable for short-season (≤3 months), low water-consuming grain legumes such as chickpea (Cicer arietinum L.), black gram, green gram, and lentils. Intensification (double-cropping) in this manner can improve smallholder farmers' incomes and soil health via nitrogen-fixing legume crops, as well as address the food security challenges of ballooning populations without having to expand croplands. Several grain legumes, primarily chickpea, are increasingly grown across Asia as a source of income for smallholder farmers, while also providing a rich and cheap source of protein that can improve the nutritional quality of diets in the region. The suitability of rainfed and irrigated rice-fallow croplands for grain legume cultivation across South Asia was defined by these identifiers: (a) a rice crop is grown during the primary (kharif) crop growing season or during the north-west monsoon season (June–October); (b) the same croplands are left fallow during the second (rabi) season or during the south-east monsoon season (November–February); and (c) the croplands can support low water-consuming, short-growing-season (≤3 months) grain legumes (chickpea, black gram, green gram, and lentils) during the rabi season. Existing irrigated or rainfed crops such as rice or wheat that were grown during kharif were not considered suitable for growing during the rabi season, because the moisture and water demand of these crops is too high. The study established cropland classes based on the 16-day, 250 m normalized difference vegetation index (NDVI) time series for one year (June 2010–May 2011) of Moderate Resolution Imaging Spectroradiometer (MODIS) data, using spectral matching techniques (SMTs) and extensive field knowledge. Map accuracy was evaluated against independent ground survey data and compared with available sub-national level statistics. The producer's and user's accuracies of the cropland fallow classes were between 75% and 82%. The overall accuracy and the kappa coefficient estimated for rice classes were 82% and 0.79, respectively. The analysis estimated approximately 22.3 Mha of suitable rice-fallow areas in South Asia, with 88.3% in India, 0.5% in Pakistan, 1.1% in Sri Lanka, 8.7% in Bangladesh, 1.4% in Nepal, and 0.02% in Bhutan. Decision-makers can target these areas for sustainable intensification with short-duration grain legumes.
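The accuracy figures quoted in this abstract (producer's accuracy, user's accuracy, overall accuracy, and the kappa coefficient) are all derived from a confusion matrix. The sketch below shows the standard definitions. The function name, the matrix orientation (rows are reference classes, columns are mapped classes), and the counts are assumptions for illustration and are not data from the study.

```python
def accuracy_report(matrix):
    """Compute map accuracy metrics from a square confusion matrix.

    Assumption: matrix[i][j] counts samples whose reference class is i
    and whose mapped class is j. Every class must appear at least once
    in both the reference data and the map.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)]
    diagonal = [matrix[k][k] for k in range(n_classes)]

    overall = sum(diagonal) / total
    # Producer's accuracy: share of each reference class that was mapped correctly.
    producers = [diagonal[k] / row_sums[k] for k in range(n_classes)]
    # User's accuracy: share of each mapped class that matches the reference.
    users = [diagonal[k] / col_sums[k] for k in range(n_classes)]
    # Kappa corrects overall accuracy for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total**2
    kappa = (overall - expected) / (1 - expected)
    return {"overall": overall, "producers": producers, "users": users, "kappa": kappa}


# Hypothetical two-class validation sample (rice-fallow vs. other cropland).
report = accuracy_report([[40, 10], [5, 45]])
```

Kappa is usually lower than overall accuracy because it discounts the matches that would occur by chance given the class proportions.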
Abstract: There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need to create a framework for guidance on how to better utilize synthetic data as a practical application in this research.
Abstract: Objective: To investigate the association between temperature and daily mortality in Shanghai from June 1, 2000 to December 31, 2001. Methods: A time-series approach was used to estimate the effect of temperature on daily total and cause-specific mortality. We fitted generalized additive Poisson regression using non-parametric smooth functions to control for long-term time trend, season, and other variables. We also controlled for day of the week. Results: A gently sloping V-like relationship between total mortality and temperature was found, with an optimum temperature (i.e., the temperature with the lowest mortality risk) of 26.7°C in Shanghai. For temperatures above the optimum value, total mortality increased by 0.73% for each degree Celsius increase, while for temperatures below the optimum value, total mortality decreased by 1.21% for each degree Celsius increase. Conclusions: Our findings indicate that temperature has an effect on daily mortality in Shanghai, and that the time-series approach is a useful tool for studying the temperature-mortality association.
Funding: This work was funded by the National Key R&D Program of China (2019YFC1509205), the National Natural Science Foundation of China (Nos. 42174023 and 41804015), the Postgraduate Scientific Research Innovation Project of Hunan Province (150110074), and the Postgraduate Scientific Research Innovation Project of Central South University (212191010).
Abstract: In the past 30 years, the small baseline subset (SBAS) InSAR time-series technique has emerged as an essential tool for measuring slow surface displacement and estimating geophysical parameters. Because of its ability to monitor large-scale deformation with millimeter accuracy, the SBAS method has been widely used in various geodetic fields, such as ground subsidence, landslides, and seismic activity. The resulting long-term time series of cumulative deformation is vital for studying the deformation mechanism. This article reviews the algorithms, applications, and challenges of the SBAS method. First, we recall the fundamental principle and analyze the shortcomings of the traditional SBAS algorithm, which provides a basic framework for the improved time-series methods that followed. Second, we classify the current improved SBAS techniques from different perspectives: solving the ill-posed equation, increasing the density of high-coherence points, improving the accuracy of deformation monitoring, and measuring multi-dimensional deformation. Third, we summarize the application of the SBAS method in monitoring ground subsidence, permafrost degradation, glacier movement, volcanic activity, landslides, and seismic activity. Finally, we discuss the difficulties faced by the SBAS method and explore its future development direction.