Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60675013 and 4022500) and the National Basic Research Program of China (973 Program) (Grant No. 2007CB311002).
Abstract: Regression is one of the important problems in statistical learning theory. This paper proves the global convergence of the piecewise regression algorithm based on deterministic annealing, as well as the continuity of the global minimum of the free energy with respect to temperature, and derives a new simplified formula for computing the initial critical temperature. An enhanced piecewise regression algorithm that uses "migration of prototypes" is proposed to eliminate "empty cells" in the annealing process. Numerical experiments on several benchmark datasets show that the new algorithm can remove redundancy and improve the generalization of the piecewise regression model.
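As a rough illustration of the temperature dependence mentioned above, the sketch below uses the standard Gibbs form from the deterministic annealing literature: each data point is softly assigned to prototypes with probability proportional to exp(-d_j / T), and its free-energy contribution is -T log sum_j exp(-d_j / T). The function names and the sample distortions are hypothetical, and the paper's own formulation (including its critical-temperature formula) may differ.

```python
import math


def soft_assignments(distortions, temperature):
    """Gibbs association probabilities p_j proportional to exp(-d_j / T).

    The minimum distortion is subtracted first for numerical stability;
    this does not change the normalized probabilities.
    """
    d_min = min(distortions)
    weights = [math.exp(-(d - d_min) / temperature) for d in distortions]
    total = sum(weights)
    return [w / total for w in weights]


def free_energy(distortions, temperature):
    """Free-energy contribution of one data point: -T * log(sum_j exp(-d_j / T))."""
    d_min = min(distortions)
    log_sum = math.log(sum(math.exp(-(d - d_min) / temperature) for d in distortions))
    return d_min - temperature * log_sum


# Hypothetical distortions of one data point to three prototypes.
d = [1.0, 2.0, 4.0]
hot = soft_assignments(d, 100.0)   # high temperature: nearly uniform assignment
cold = soft_assignments(d, 0.01)   # low temperature: nearly hard assignment
```

At high temperature the assignments are almost uniform, and as the temperature drops they approach a hard nearest-prototype assignment while the free energy approaches the minimum distortion.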
Abstract: Piecewise linear regression models are very flexible models for data. When a piecewise linear regression model is fitted to data, its parameters are generally unknown. This paper studies the problem of parameter estimation for piecewise linear regression models. The parameters are estimated with a Bayesian method, but the Bayes estimator cannot be found analytically. To overcome this problem, a reversible jump MCMC (Markov chain Monte Carlo) algorithm is proposed. The reversible jump MCMC algorithm generates a Markov chain that converges to a limit distribution equal to the posterior distribution of the parameters of the piecewise linear regression model. The resulting Markov chain is used to calculate the Bayes estimator of those parameters.
Abstract: Recent developments in database technology have seen a wide variety of data being stored in huge collections. This variety makes the analysis of a generic database a strenuous task in knowledge discovery. One approach is to summarize large datasets in such a way that the resulting summary dataset is of manageable size. Histograms have received significant attention as summarization/representative objects for large databases, but they suffer from computational and space complexity. In this paper, we propose transforming the histogram object into a Piecewise Linear Regression (PLR) line object and suggest that PLR objects can be less computationally and storage intensive than histograms. In addition, to carry out cluster analysis, we propose a distance measure for computing the distance between PLR lines. A case study based on real data from the online education system LMS is presented. It demonstrates that PLR is a powerful knowledge representation for very large databases.
Funding: This research is funded by the Princess Nourah Bint Abdulrahman University Researchers Supporting Project, number PNURSP2022R194, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Mobile clouds are the most common medium for aggregating, storing, and analyzing data from the medical Internet of Things (MIoT). They are employed to monitor a patient's essential health signs for earlier disease diagnosis and prediction. Among the various diseases, skin cancer is a widespread type of cancer, and earlier detection improves the survival rate. In recent years, many skin cancer classification systems using machine and deep learning models have been developed for classifying skin tumors, including malignant melanoma (MM) and other skin cancers. However, accurate cancer detection has not been achieved with minimum time consumption. To address these problems, a novel Multidimensional Bregman Divergencive Feature Scaling Based Cophenetic Piecewise Regression Recurrent Deep Learning Classification (MBDFS-CPRRDLC) technique is introduced for detecting cancer at an earlier stage. MBDFS-CPRRDLC performs skin cancer detection using input, hidden, and output layers for feature selection and classification. The patient information is collected through the IoT and stored on a mobile cloud server for predictive analytics. The collected data are sent to the recurrent deep learning classifier. In the first hidden layer, feature selection is carried out using the Multidimensional Bregman Divergencive Feature Scaling technique to find the significant features for disease identification, which decreases time consumption. Disease classification is then carried out in the second hidden layer using cophenetic correlative piecewise regression to analyze the testing and training data. This process is repeated until the error is minimized. In this way, disease classification is performed with higher accuracy. The experimental evaluation covers accuracy, precision, recall, F-measure, and cancer detection time with respect to the amount of patient data. The observed results confirm that the proposed MBDFS-CPRRDLC technique increases accuracy and reduces cancer detection time compared to conventional approaches.
Funding: Supported by the Ministry of Science and Higher Education in Poland (grant No. 612464).
Abstract: A non-invasive method to estimate the number of Trypodendron lineatum holes on dead standing pines (Pinus sylvestris L.) was developed using linear and nonlinear estimations. A classical linear regression model was first used to analyze the relationship between the number of holes caused by T. lineatum on selected stem units and the total number of holes on an entire dead stem of P. sylvestris. Then, to obtain a better fit of the regression function to the data for the stem unit selected in the first step, piecewise linear regression (PLR) was used. Last, in an area used to evaluate wood decomposition (method validation), the total and mean numbers of T. lineatum holes were estimated for single dead trees and for a sample (n = 8 dead trees). Data were collected in 2009 (data set D1), in 2010-2014 (data set D2) and in 2020 (data set D3) in forests containing P. sylvestris located within Suchedniow-Oblegorek Landscape Park, Poland. A model was constructed with three linear equations. An evaluation of model accuracy showed that it was highly effective regardless of the density of T. lineatum holes and sample size. The method enables the evaluation of the biological role of this species in the decomposition of dead standing wood of P. sylvestris in strictly protected areas.
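To make the idea of piecewise linear regression concrete, here is a minimal sketch that fits two separate least-squares lines and chooses the split point by exhaustive search over the sorted data. It is an illustration under simplifying assumptions (two segments rather than the three equations mentioned above, no continuity constraint at the breakpoint), not the estimation procedure used in the study. All function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fit_two_segments(xs, ys, min_points=2):
    """Try every split of the sorted data and keep the one with the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for k in range(min_points, len(sx) - min_points + 1):
        left = fit_line(sx[:k], sy[:k])
        right = fit_line(sx[k:], sy[k:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            best = {
                "breakpoint": (sx[k - 1] + sx[k]) / 2,  # midpoint between the two segments
                "left": left[:2],                        # (slope, intercept)
                "right": right[:2],
                "sse": total,
            }
    return best


# Hypothetical data: y = 2x for x < 5, then y = 20 - x.
xs = list(range(10))
ys = [0, 2, 4, 6, 8, 15, 14, 13, 12, 11]
model = fit_two_segments(xs, ys)
```

The same search extends to three segments by looping over pairs of split points, at quadratic cost in the number of observations.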
Abstract: This work addresses the multiscale optimization of the purification processes of antibody fragments. Chromatography decisions in the manufacturing processes are optimized, including the number of chromatography columns and their sizes, the number of cycles per batch, and the operational flow velocities. Data-driven models of chromatography throughput are developed considering loaded mass, flow velocity, and column bed height as the inputs, using manufacturing-scale simulated datasets based on microscale experimental data. The piecewise linear regression modeling method is adapted due to its simplicity and better prediction accuracy in comparison with other methods. Two alternative mixed-integer nonlinear programming (MINLP) models are proposed to minimize the total cost of goods per gram of the antibody purification process, incorporating the data-driven models. These MINLP models are then reformulated as mixed-integer linear programming (MILP) models using linearization techniques and multiparametric disaggregation. Two industrially relevant cases with different chromatography column size alternatives are investigated to demonstrate the applicability of the proposed models.
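The abstract does not say which linearization techniques are used. As one common example of how a nonlinear term can be made linear, the sketch below checks the standard exact linearization of a product z = x * y, where y is binary and 0 <= x <= U: the constraints z <= U*y, z <= x, z >= x - U*(1 - y), z >= 0 leave z = x * y as the only feasible value. The function name and the numbers are hypothetical.

```python
def product_bounds(x, y, upper):
    """Feasible interval for z under the four linear constraints.

    Constraints: z <= upper * y, z <= x, z >= x - upper * (1 - y), z >= 0,
    with y in {0, 1} and 0 <= x <= upper.
    """
    lower_bound = max(0.0, x - upper * (1 - y))
    upper_bound = min(upper * y, x)
    return lower_bound, upper_bound


# For a binary y the interval collapses to the single point x * y.
U = 10.0
checks = []
for x in (0.0, 3.5, 10.0):
    for y in (0, 1):
        lo, hi = product_bounds(x, y, U)
        checks.append((lo, hi, x * y))
```

Because the interval collapses to one point for every binary y, replacing the bilinear term with z and these four constraints does not change the feasible set of the model.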
Funding: The authors thank Arteris S.A. (Autopista Fernao Dias and Centro de Desenvolvimento Tecnologico), ANTT (Agencia Nacional de Transportes Terrestres), and CNPq (Conselho Nacional de Desenvolvimento Cientifico e Tecnologico) for supporting this research.
Abstract: The traffic of overloaded trucks is a critical problem on highways. It affects pavement performance life, reduces the service life of bridges, and has a negative impact on road safety, average speed and level of service. There are several practices to prevent truck overloading, such as enforcement activities to verify a truck's compliance with the legal weight limits. This paper investigates the development of a method that uses available weigh-in-motion (WIM) data to identify overloaded truck weight and travel patterns. The proposed approach is based on the regression tree method, a simple and easily understandable analytic tool used to build prediction models from a large set of data. An overall analysis of the overloaded truck regression tree model shows that the most important variable for classifying and predicting overloading is the truck type. For axle overloading, the most significant variable is the time of day (most overloaded trucks travel late at night or early in the morning). The regression tree results can be used to optimize the efficiency of administration activities by planning truck enforcement operations around the more critical scenarios. The results also improve knowledge of the load characteristics of trucks, which can lead to more effective pavement management systems and more assertive pavement structure designs.
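The core step of a regression tree is choosing the split that most reduces the squared error of the target within the resulting groups. The sketch below shows that step for a single numeric feature. It is a generic CART-style illustration, not the model built in the paper, and the feature (hour of day), the target values, and the function names are all hypothetical.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(feature, target):
    """Return the threshold on one numeric feature that minimizes total SSE."""
    pairs = sorted(zip(feature, target))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical feature values
        left = [t for _, t in pairs[:k]]
        right = [t for _, t in pairs[k:]]
        cost = sum_squared_error(left) + sum_squared_error(right)
        if best is None or cost < best["sse"]:
            best = {
                "threshold": (pairs[k - 1][0] + pairs[k][0]) / 2,
                "sse": cost,
                "left_mean": sum(left) / len(left),     # prediction for feature <= threshold
                "right_mean": sum(right) / len(right),  # prediction for feature > threshold
            }
    return best


# Hypothetical data: hour of day versus excess axle load (tonnes).
hours = [1, 2, 3, 4, 13, 14, 15, 16]
excess = [5.0, 6.0, 5.5, 6.5, 1.0, 0.5, 1.5, 1.0]
split = best_split(hours, excess)
```

A full tree applies this search recursively to each resulting group and over all candidate variables, which is how a variable such as truck type or time of day can emerge as the most important one.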
Funding: The Natural Science Foundation of Guizhou Province, China (ZK[2023]ZHONGDIAN 027); the Science and Technology Innovation Base Construction Project of Guizhou Province (QKHZYD[2023]005).
Abstract: Karst regions are typical areas of interaction between human society and natural ecosystems. Understanding the historical mechanisms of the evolution of social-ecological systems (SES) is crucial for the future sustainable management of karst regions. This study selected Guangxi, a typical karst mountainous region in Southwest China, as the study area, and used population, cropland area, and forest coverage as the SES elements. Based on the framework of SES research in the karst region, it adopted segmented linear regression to identify the stages of the interactions among these elements and to reveal the evolutionary stages of social development from a long-term perspective. In addition, driving factor indicators were constructed from the aspects of natural environment, social development, government policy, and climate change, and the feedback changes brought about by the evolution were then investigated. The results show that the evolution of SES in Guangxi from 1363 to 2020 can be divided into seven stages. In the first and second stages and the early part of the third stage, the government of Guangxi focused mainly on agricultural activities; the only way to meet the growing demand for food was to expand the cropland area, which, together with the timber trade's pursuit of economic development, resulted in an increase in rocky desertification. In the fourth stage, the ecological environment improved under the implementation of measures such as the control of rocky desertification and the compensation of forest ecological benefits. After the fifth stage, the effect of rocky desertification control has been remarkable. Although the implementation of relevant policies has alleviated the environmental problems to some extent, the continual changes in the structure and function of SES can challenge further progress towards sustainability in karst regions. This study aims to provide a reference for long-term national spatial planning and the development of environmental policies in karst regions.
Funding: Major Project of High-resolution Earth Observation System.
Abstract: In this study, we analyzed the spatiotemporal variation of cold surges in Inner Mongolia between 1960 and 2012 and their possible driving factors using daily minimum temperature data from 121 meteorological stations in Inner Mongolia and the surrounding areas. These data were analyzed using a piecewise regression model, a Sen + Mann-Kendall model, and a correlation analysis. Results demonstrated that (1) the frequency of single-station cold surges decreased in Inner Mongolia during the study period, with a linear tendency of -0.5 times/10a (-2.4 to 1.2 times/10a). Prior to 1991, a significant decreasing trend of -1.1 times/10a (-3.3 to 2.5 times/10a) was detected, while an increasing trend of 0.45 times/10a (-4.4 to 4.2 times/10a) was found after 1991. On a seasonal scale, the trend in spring cold surges was consistent with annual values, and the most obvious change in cold surges occurred during spring. Monthly cold surge frequency displayed a bimodal structure, and November witnessed the highest incidence of cold surges. (2) Spatially, the high incidence of cold surges is mainly observed in the northern and central parts of Inner Mongolia, with a higher occurrence in the northern than in the central part. Inter-decadal characteristics also revealed that high-frequency and low-frequency regions presented decreasing and increasing trends, respectively, between 1960 and 1990. High-frequency regions expanded after the 1990s, and regions exhibiting high cold surge frequency were mainly distributed in Tulihe, Xiao'ergou, and Xi Ujimqin Banner. (3) On an annual scale, cold surges were dominated by AO, NAO, CA, APVII, and CQ. However, seasonal differences in the driving forces of cold surges were detected. Winter cold surges were significantly correlated with AO, NAO, SHI, CA, TPI, APVII, CW, and IZ, indicating they were caused by multiple factors. Autumn cold surges were mainly affected by CA and IM, while spring cold surges were significantly correlated with CA and APVII.
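For readers unfamiliar with the Sen + Mann-Kendall combination named above, the following is a minimal sketch of the two statistics on an evenly spaced annual series: Sen's slope is the median of all pairwise slopes, and the Mann-Kendall statistic S counts concordant minus discordant pairs, with a normal approximation Z. The tie correction to the variance is omitted for brevity, and the function names and sample series are hypothetical rather than taken from the study.

```python
import math
from itertools import combinations
from statistics import median


def sen_slope(values):
    """Median of all pairwise slopes (values[j] - values[i]) / (j - i) for i < j."""
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i, j in combinations(range(len(values)), 2)
    ]
    return median(slopes)


def mann_kendall(values):
    """Return (S, Z) for the Mann-Kendall trend test, without tie correction."""
    n = len(values)
    s = 0
    for i, j in combinations(range(n), 2):
        diff = values[j] - values[i]
        s += (diff > 0) - (diff < 0)  # sign of the later-minus-earlier difference
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Hypothetical annual cold surge counts, one value per year.
counts = [8, 7, 6, 5, 4, 3, 2, 1]
slope = sen_slope(counts)          # change per year
s_stat, z_stat = mann_kendall(counts)
```

A |Z| above roughly 1.96 indicates a trend that is significant at the 5% level under the normal approximation, and multiplying the slope by 10 gives a per-decade rate comparable to the "times/10a" unit used in the abstract.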