Abstract: Recent developments in database technology have led to a wide variety of data being stored in huge collections. This variety makes analyzing a generic database a strenuous task in knowledge discovery. One approach is to summarize large datasets so that the resulting summary dataset is of manageable size. The histogram has received significant attention as a summarization (representative) object for large databases, but it suffers from high computational and space complexity. In this paper, we propose transforming the histogram object into a Piecewise Linear Regression (PLR) line object and suggest that PLR objects can be less computation- and storage-intensive than histograms. In addition, to carry out cluster analysis, we propose a distance measure for computing the distance between PLR lines. A case study is presented based on real data from the online education system LMS. It demonstrates that PLR is a powerful knowledge representative for very large databases.
Funding: Supported by the Ministry of Science and Higher Education in Poland (grant No. 612464).
Abstract: A non-invasive method to estimate the number of Trypodendron lineatum holes on dead standing pines (Pinus sylvestris L.) was developed using linear and nonlinear estimations. A classical linear regression model was first used to analyze the relationship between the number of holes caused by T. lineatum on selected stem units and the total number of holes on an entire dead stem of P. sylvestris. Then, to obtain a better fit of the regression function to the data for the stem unit selected in the first step, piecewise linear regression (PLR) was used. Last, in an area used to evaluate wood decomposition (method validation), the total and mean numbers of T. lineatum holes were estimated for single dead trees and for a sample (n = 8 dead trees). Data were collected in 2009 (data set D1), in 2010-2014 (data set D2), and in 2020 (data set D3) in forests containing P. sylvestris located within Suchedniow-Oblegorek Landscape Park, Poland. A model was constructed with three linear equations. An evaluation of model accuracy showed that it was highly effective regardless of the density of T. lineatum holes and sample size. The method enables the evaluation of the biological role of this species in the decomposition of dead standing wood of P. sylvestris in strictly protected areas.
Abstract: This work addresses the multiscale optimization of the purification processes of antibody fragments. Chromatography decisions in the manufacturing processes are optimized, including the number of chromatography columns and their sizes, the number of cycles per batch, and the operational flow velocities. Data-driven models of chromatography throughput are developed considering loaded mass, flow velocity, and column bed height as the inputs, using manufacturing-scale simulated datasets based on microscale experimental data. The piecewise linear regression modeling method is adapted due to its simplicity and better prediction accuracy in comparison with other methods. Two alternative mixed-integer nonlinear programming (MINLP) models are proposed to minimize the total cost of goods per gram of the antibody purification process, incorporating the data-driven models. These MINLP models are then reformulated as mixed-integer linear programming (MILP) models using linearization techniques and multiparametric disaggregation. Two industrially relevant cases with different chromatography column size alternatives are investigated to demonstrate the applicability of the proposed models.
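
All three abstracts rely on piecewise linear regression (PLR), but none of them spells out a fitting procedure. The following is only a minimal sketch of one common variant, assuming a single breakpoint chosen by grid search and a continuous fit built from a hinge term. The function name `fit_two_segment_plr`, the candidate grid, and the synthetic data are illustrative assumptions and are not taken from any of the papers.

```python
import numpy as np


def fit_two_segment_plr(x, y, candidates):
    """Fit y ~ b0 + b1*x + b2*max(0, x - c) for each candidate breakpoint c.

    Returns (sse, c, coef) for the candidate with the smallest sum of
    squared errors. This is an illustrative sketch, not any paper's method.
    """
    best = None
    for c in candidates:
        # Design matrix: intercept, base slope, and a hinge term that
        # changes the slope for x > c while keeping the line continuous.
        A = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - c)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((A @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(c), coef)
    return best


# Synthetic data (hypothetical): slope 1 up to x = 5, slope 3 afterwards.
x = np.linspace(0.0, 10.0, 101)
y = np.where(x < 5.0, x, 5.0 + 3.0 * (x - 5.0))

# Skip a few points at each end so both segments contain data.
sse, breakpoint, coef = fit_two_segment_plr(x, y, candidates=x[5:-5])
print(f"breakpoint ~ {breakpoint:.2f}, coefficients = {np.round(coef, 3)}")
```

In this parameterization the slope of the first segment is `b1` and the slope of the second is `b1 + b2`. A model with three linear pieces, as mentioned in the second abstract, would need a second hinge term and a search over pairs of breakpoints.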