Abstract: With the formation of Wireless Sensor Networks (WSNs), the industrial and academic communities have seen remarkable development in recent decades. One of the most common techniques for getting the best out of wireless sensor networks is to upgrade the operating group. The most important problem in the clustering method is arranging the optimal number of sensor nodes into clusters. In this method, new client nodes and dynamic methods are used to determine the optimal number of clusters and cluster heads, so that they are better organized and classified in each round. Parameters for effective energy use and the ability to decide the best method of attachment are included. The coverage problem and the changeability of network routes give rise to traffic and delays that affect performance. A newer version of the Gravity Analysis Algorithm (GAA) is used to solve this problem. The proposed GAA approach is introduced to improve network lifetime, system energy efficiency, and end delay performance. Simulation results show that the modified GAA performs better than other networks and than the more advanced Life Time Delay Clustering Algorithm (LTDCA) protocols. The proposed method provides a set of data collection and increased throughput in wireless sensor networks.
Funding: Supported by the Open Project of Xiangjiang Laboratory (No. 22XJ02003) and the National Natural Science Foundation of China (No. 62122093).
Abstract: Time series clustering is a challenging problem due to the large-volume, high-dimensional, and warping characteristics of time series data. Traditional clustering methods often use a single criterion or distance measure, which may not capture all the features of the data. This paper proposes a novel method for time series clustering based on evolutionary multi-tasking optimization, termed i-MFEA, which uses an improved multifactorial evolutionary algorithm to optimize multiple clustering tasks simultaneously, each with a different validity index or distance measure. As a result, i-MFEA can produce diverse and robust clustering solutions that satisfy various preferences of decision-makers. Experiments on two artificial datasets show that i-MFEA outperforms single-objective evolutionary algorithms and traditional clustering methods in terms of convergence speed and clustering quality. The paper also discusses how i-MFEA can address two long-standing issues in time series clustering: the choice of an appropriate similarity measure and the number of clusters.
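To illustrate why the choice of distance measure matters for warped series, the sketch below compares a plain Euclidean distance with dynamic time warping (DTW). DTW is chosen here only as a familiar example of a warping-aware measure; the abstract does not say which measures i-MFEA uses, and the function names are hypothetical.

```python
from math import inf


def euclidean_distance(a, b):
    """Point-by-point distance; compares only the overlapping prefix of a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the best of: repeat a point of b, repeat a point of a, or advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical toy series: the second is the first delayed by one step.
series_a = [0, 1, 2, 1, 0]
series_b = [0, 0, 1, 2, 1, 0]
warped = dtw_distance(series_a, series_b)
rigid = euclidean_distance(series_a, series_b)
```

On this toy pair the warping path absorbs the one-step delay, while the rigid comparison penalizes it, which is one reason a single fixed measure may not suit every dataset.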
Funding: Funding from the Italian Ministry of Environment, Land and Sea Protection (MATTM) for the Sim PRO project (2020–2021) is acknowledged by (in alphabetical order) S. Grimaldi, G. Papacharalampous and E. Volpi; funding from the Italian Ministry of Education, University and Research (MIUR), in the frame of the Departments of Excellence Initiative 2018–2022, attributed to the Department of Engineering of Roma Tre University; funding from the EU Horizon 2020 project CLINT (Climate Intelligence: Extreme events detection, attribution and adaptation design using machine learning) under Grant Agreement 101003876.
Abstract: Statistical analyses and descriptive characterizations are sometimes assumed to offer information on time series forecastability. Despite the scientific interest suggested by such assumptions, the relationships between descriptive time series features (e.g., temporal dependence, entropy, seasonality, trend and linearity features) and actual time series forecastability (quantified by issuing and assessing forecasts for the past) are scarcely studied and quantified in the literature. In this work, we aim to fill this gap by investigating such relationships and the way that they can be exploited for understanding hydroclimatic forecastability and its patterns. To this end, we follow a systematic framework bringing together a variety of concepts and methods (mostly new for hydrology), including 57 descriptive features and nine seasonal time series forecasting methods (i.e., one simple, five exponential smoothing, two state space and one automated autoregressive fractionally integrated moving average methods). We apply this framework to three global datasets originating from the larger Global Historical Climatology Network (GHCN) and Global Streamflow Indices and Metadata (GSIM) archives. As these datasets comprise over 13,000 monthly temperature, precipitation and river flow time series from several continents and hydroclimatic regimes, they allow us to provide trustworthy characterizations and interpretations of 12-month-ahead hydroclimatic forecastability at the global scale. We first find that the exponential smoothing and state space methods for time series forecasting are roughly equally efficient in identifying an upper limit of this forecastability in terms of Nash-Sutcliffe efficiency, while the simple method is shown to be mostly useful in identifying its lower limit. We then demonstrate that the assessed forecastability is strongly related to several descriptive features, including seasonality, entropy, (partial) autocorrelation, stability, (non)linearity, spikiness and heterogeneity features, among others. We further (i) show that, if such descriptive information is available for a monthly hydroclimatic time series, we can even foretell the quality of its future forecasts with a considerable degree of confidence, and (ii) rank the features according to their efficiency in explaining and foretelling forecastability. We believe that the obtained rankings are of key importance for understanding forecastability. Spatial forecastability patterns are also revealed through our experiments, with East Asia (Europe) being characterized by larger (smaller) monthly temperature time series forecastability and the Indian subcontinent (Australia) being characterized by larger (smaller) monthly precipitation time series forecastability, compared to other continental-scale regions, and less notable differences characterizing monthly river flow from continent to continent. A comprehensive interpretation of such patterns through massive feature extraction and feature-based time series clustering is shown to be possible. Indeed, continental-scale regions characterized by different degrees of forecastability are also attributed to different clusters or mixtures of clusters (because of their essential differences in terms of descriptive features).
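The forecastability in this abstract is quantified with the Nash-Sutcliffe efficiency (NSE). A minimal sketch of the standard NSE formula follows; the function name and the toy values are hypothetical and are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, forecast):
    """NSE = 1 - sum((obs - fc)^2) / sum((obs - mean(obs))^2).

    1.0 means a perfect forecast; 0.0 means the forecast is no better
    than always predicting the mean of the observations.
    """
    if not observed or len(observed) != len(forecast):
        raise ValueError("observed and forecast must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - squared_error / variance_sum


# Hypothetical 12-month-ahead check on made-up monthly values.
observed_values = [3.0, 4.5, 6.0, 8.0, 7.5, 5.0, 4.0, 3.5, 3.0, 2.5, 2.0, 2.5]
forecast_values = [3.2, 4.0, 6.5, 7.5, 7.0, 5.5, 4.2, 3.0, 3.1, 2.8, 2.2, 2.4]
score = nash_sutcliffe_efficiency(observed_values, forecast_values)
```

Because NSE is bounded above by 1 and can be arbitrarily negative, comparing methods by NSE naturally yields the upper and lower forecastability limits the abstract refers to.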
Abstract: Anomaly detection using KPIs (Key Performance Indicators) is critical for Internet-based services to maintain high service availability. However, given the velocity, volume, and diversified nature of monitoring data, it is difficult to obtain enough labelled data to build an accurate anomaly detection model using supervised machine learning methods. In this paper, we propose an automatic and generic transfer learning strategy: detecting anomalies on a new KPI by using a model pretrained on existing selected labelled KPIs. Our approach, called KADT (KPI Anomaly Detection based on Transfer Learning), integrates KPI clustering and model pretraining techniques. KPI clustering is used to obtain the similarity of the distributions of different KPI data, and knowledge is transferred from the source dataset to the target dataset through model pretraining. In our evaluation using real-world KPIs from large Internet-based services, the clustering algorithm used to detect various KPI curve patterns achieves the best classification effect and accuracy. More importantly, further evaluation on 30 KPIs shows that KADT can significantly reduce the time overhead of model training with little loss of accuracy.
Funding: The National Natural Science Foundation of China (30370432).
Abstract: The original temporal clustering analysis (OTCA) is an effective technique for obtaining brain activation maps when the timing and location of the activation are completely unknown, but its lack of sensitivity is exposed when processing relatively weak brain activation signals. A time slice analysis method based on OTCA is proposed, considering the weakness of the functional magnetic resonance imaging (fMRI) signal of the rat model. By dividing the stimulation period into several time slices and analyzing each slice separately to detect the activated pixels after background removal, the sensitivity is significantly improved. The inhibitory response in the hypothalamus after glucose loading is detected successfully with this method in an experiment on rats. Combined with the OTCA method, the time slice analysis method based on OTCA is effective in detecting when, where and which type of response will happen after stimulation, even if the fMRI signal is weak.
Abstract: Regional economic vitality reflects the scale and development potential of a region's economy. It largely determines the development of the city, and is also affected by many factors such as population competitiveness, corporate competitiveness, market vitality, innovation vitality, and environmental vitality. A pilot model was constructed with Hebei Province as the inspection area. Regional economic vitality was measured quantitatively by identifying 21 indicators that directly or indirectly affect the economic vitality of Hebei Province. By analyzing the data of the 21 indicators for nearly 10 years, time series clustering is used to reduce the dimensionality of the indicators. After the dimension reduction, the indicators are divided into four categories: overall scale, development potential, market vitality, and innovation vitality. An economic vitality structure model of Hebei Province is constructed, and the contributions of the four categories to economic vitality are determined and compared. On this basis, the indicators that affect economic vitality can be grasped more accurately and reasonable and effective action plans can be worked out. From the perspective of human resources and corporate vitality, we analyze how the action plan affects the economic vitality of Hebei Province [1]. The 11 cities in Hebei Province are the target of the regional economic vitality assessment. The constructed economic vitality structure model uses the required contribution value to select priority indicators. Finally, six indicators (GDP, GDP growth rate, fiscal revenue, fiscal revenue growth rate, number of industrial enterprises above designated size, and total profit of industrial enterprises above designated size) were established for the eleven cities in Hebei Province to construct a TOPSIS scoring model, and the rankings were calculated in MATLAB. The results show that the top three cities were Shijiazhuang, Tangshan and Cangzhou.
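The TOPSIS scoring step mentioned above can be sketched as follows. This is a minimal textbook version (vector normalization, weighted Euclidean distances to the ideal and anti-ideal points), written in Python rather than the MATLAB used in the study; the weights, the toy matrix, and the function name are hypothetical and do not reproduce the Hebei data.

```python
from math import sqrt


def topsis_scores(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per row (higher is better).

    matrix  : rows are alternatives (e.g. cities), columns are indicators
    weights : one weight per indicator
    benefit : True if larger is better for that indicator, else False
    Assumes every column has at least one nonzero entry.
    """
    n_cols = len(weights)
    # Vector-normalize each column, then apply the indicator weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]
    columns = [[row[j] for row in weighted] for j in range(n_cols)]
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti_ideal = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_minus = sqrt(sum((v - w) ** 2 for v, w in zip(row, anti_ideal)))
        total = d_plus + d_minus
        # If all alternatives are identical, no ranking is possible; use a neutral 0.5.
        scores.append(d_minus / total if total > 0 else 0.5)
    return scores


# Hypothetical data: three alternatives, two benefit-type indicators, equal weights.
toy_matrix = [[10.0, 5.0], [5.0, 10.0], [1.0, 1.0]]
toy_scores = topsis_scores(toy_matrix, [0.5, 0.5], [True, True])
ranking = sorted(range(len(toy_scores)), key=lambda i: toy_scores[i], reverse=True)
```

Sorting the alternatives by descending score gives the ranking; in the study the alternatives would be the eleven cities and the columns the six indicators.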
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11171263, 11201350 and 11371299); the Doctoral Fund of the Ministry of Education of China (Grant Nos. 20110141110004 and 20110141120004); the Fundamental Research Funds for the Central Universities.
Abstract: An additive hazards model with random effects is proposed for modelling correlated failure time data when the focus is on comparing the failure times within clusters and on estimating the correlation between failure times from the same cluster, as well as the marginal regression parameters. A feature of our model is that, when marginalized over the random effect variable, it still enjoys the structure of the additive hazards model. We develop the estimating equations for inferring the regression parameters. The proposed estimators are shown to be consistent and asymptotically normal under appropriate regularity conditions. Furthermore, an estimator of the baseline hazard function is proposed and its asymptotic properties are also established. We propose a class of diagnostic methods to assess the overall fitting adequacy of the additive hazards model with random effects. We conduct simulation studies to evaluate the finite sample behaviors of the proposed estimators in various scenarios. An analysis of the Diabetic Retinopathy Study is provided as an illustration of the proposed method.
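The marginalization property can be seen in a simple additive-frailty setting. The form below is an illustrative assumption (time-invariant covariates $Z_{ij}$ for subject $j$ in cluster $i$, and a nonnegative cluster-level random effect $\xi_i$ with Laplace transform $\varphi$); it is not necessarily the specification used by the authors.

```latex
\begin{align*}
% Conditional hazard given the random effect (assumed illustrative form)
\lambda_{ij}(t \mid Z_{ij}, \xi_i) &= \lambda_0(t) + \beta^{\top} Z_{ij} + \xi_i, \\
% Conditional survival, with \Lambda_0(t) = \int_0^t \lambda_0(s)\,ds
S_{ij}(t \mid Z_{ij}, \xi_i) &= \exp\{-\Lambda_0(t) - \beta^{\top} Z_{ij}\, t - \xi_i t\}, \\
% Marginal survival, with \varphi(t) = E[e^{-\xi_i t}]
S_{ij}(t \mid Z_{ij}) &= \exp\{-\Lambda_0(t) - \beta^{\top} Z_{ij}\, t\}\,\varphi(t), \\
% Marginal hazard = -\frac{d}{dt}\log S_{ij}(t \mid Z_{ij})
\lambda_{ij}(t \mid Z_{ij}) &= \underbrace{\lambda_0(t) - \frac{\varphi'(t)}{\varphi(t)}}_{\text{modified baseline}} + \beta^{\top} Z_{ij}.
\end{align*}
```

Under these assumptions the marginal hazard is again additive in the covariates with the same coefficient $\beta$; only the baseline changes.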
Funding: Supported by the National Key R&D Program of China (grant number 2019YFC0507401) and the National Natural Science Foundation of China (grant number 42371325).
Abstract: Human activities significantly impact the environment. Understanding the patterns and distribution of these activities is crucial for ecological protection. With the advancement of location-based technology, big data such as location and trajectory data can be used to analyze human activities on finer temporal and spatial scales than traditional remote sensing data. In this study, Qilian Mountain National Park (QMNP) was chosen as the research area, and Tencent location data were used to construct time series data. Time series clustering and decomposition were performed, and the spatio-temporal distribution characteristics of human activities in the study area were analyzed in conjunction with GPS trajectory data and land use data. The study found two distinct human activity patterns, Pattern A and Pattern B, in QMNP. Compared to Pattern B, Pattern A had a higher volume of location data and clear nighttime peaks. By incorporating land use and trajectory data, we conclude that Pattern A and Pattern B represent the activity patterns of the resident and tourist populations, respectively. Moreover, the study identified seasonal variations in human activities, with human activity in summer lasting approximately two hours longer than in winter. We also conducted an analysis of human activities in different counties within the study area.
Funding: The authors gratefully acknowledge financial support from the National Natural Science Foundation of China under Grant No. 91534111.
Abstract: A novel reactor with two reaction zones is proposed for pyridine synthesis. The flow hydrodynamics were investigated in experiments. Pressure taps and a PV-6D optical fiber were used to measure the pressure fluctuation and particle concentration, respectively. Hilbert–Huang analysis was adopted to distinguish the flow patterns through pressure fluctuation. Results show that the flow patterns are bubbling fluidization and fast fluidization in the regions below (reaction zone I) and above (reaction zone II) the nozzles, respectively. The radial distribution of the cluster time fraction was obtained from the signal waves of the particle concentration. Analysis of the cluster time fraction revealed two radial distributions. In the region around the nozzles, the cluster time fraction ranged from 0 to 0.2 and concentrated at radial positions r/R = 0–1, which resulted from unsymmetrical catalyst feeding. In reaction zone II, the cluster time fraction ranged from 0 to 0.2, and the radial distribution indicated a core–annulus structure.
Funding: Supported by the International Cooperation Projects (2010DFA31790) of the Chinese Ministry of Science and Technology; the fund of Central China Normal University for Ph.D. students (No. 2009023); the National Natural Science Foundation of China (Grant Nos. 10731010, 10971015 and 11021161); the National Basic Research Program of China (973 Program) (No. 2007CB814902); the Key Laboratory of Random Complex Structures and Data Science, Academy of Mathematics & Systems Science, Chinese Academy of Sciences (No. 2008DP173182).
Abstract: Recurrent event data often arise in biomedical studies, and individuals within a cluster might not be independent. We propose a semiparametric additive rates model for clustered recurrent event data, wherein the covariates are assumed to add to the unspecified baseline rate. For inference on the model parameters, estimating equation approaches are developed, and both large and finite sample properties of the proposed estimators are established.
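For orientation, an additive rates model of the kind described is commonly written as below. The notation is an assumption made for illustration ($N_{ij}(t)$ counts the events of subject $j$ in cluster $i$ up to time $t$, $Z_{ij}(t)$ is the covariate vector, and $\mu_0$ is the unspecified baseline mean function); the abstract itself does not give the formula.

```latex
\begin{equation*}
E\{\mathrm{d}N_{ij}(t) \mid Z_{ij}(t)\} = \mathrm{d}\mu_0(t) + \beta_0^{\top} Z_{ij}(t)\,\mathrm{d}t .
\end{equation*}
```

In this form the covariate effect $\beta_0^{\top} Z_{ij}(t)$ is added to the baseline rate rather than multiplying it, and no dependence structure within a cluster needs to be specified for the marginal rate.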