Funding: Project (40074001) supported by the National Natural Science Foundation of China; Project (SD2003-10) supported by the Open Research Fund Program of the Key Laboratory of Geomatics and Digital Technology, Shandong Province
Abstract: The Gauss-Markov model is frequently used in data analysis, and the analysis and estimation of its parameters is a topic of ongoing interest. Based on information theory and the principle of minimum description length (MDL), this paper discusses the case where the coefficient matrix is multi-collinear: principal component estimation is used to estimate and select the original parameters, so as to reduce the multi-collinearity and improve the credibility of the estimates. From the viewpoint of minimum description length, the paper discusses how to select the principal components and applies this approach to a practical problem.
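The selection idea in this abstract can be illustrated with a minimal sketch. It assumes a generic two-part MDL score, (n/2) log(RSS/n) + (k/2) log n, which has the same form as BIC; the abstract does not state the paper's own criterion, so this is not necessarily it. The function name `mdl_select_components` and the variables `A` (coefficient matrix) and `y` (observations) are hypothetical.

```python
import numpy as np

def mdl_select_components(A, y):
    """Choose how many principal components to keep via a two-part MDL score.

    A: (n, p) coefficient matrix, y: (n,) observation vector.
    Returns (k_best, x_hat, scores). The score is an assumed generic form,
    not the criterion of the paper summarized above.
    """
    n = A.shape[0]
    # Thin SVD: rows of Vt are the principal directions, ordered by singular value.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    z = U.T @ y  # coordinates of y along the left singular vectors
    scores = []
    for k in range(1, len(s) + 1):
        # Residual when only the k leading components are retained.
        resid = y - U[:, :k] @ z[:k]
        rss = max(float(resid @ resid), 1e-12)  # guard against log(0)
        # Code length in nats: cost of the residuals plus cost of k parameters.
        scores.append(0.5 * n * np.log(rss / n) + 0.5 * k * np.log(n))
    k_best = int(np.argmin(scores)) + 1
    # Principal component estimate of the original parameters.
    x_hat = Vt[:k_best].T @ (z[:k_best] / s[:k_best])
    return k_best, x_hat, scores
```

Components tied to very small singular values barely reduce the residual but still pay the (1/2) log n parameter cost, so under this score they tend to be dropped, which is what removes the near-collinear directions from the estimate.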
Abstract: The computations involved in the Bayesian approach to practical model selection problems are usually very difficult. Computational simplifications are sometimes possible but are not generally applicable. There is a large literature on an information-theoretic methodology called Minimum Description Length (MDL). This paper describes how many of these techniques are either directly Bayesian in nature or very good objective approximations to Bayesian solutions. First, connections between the Bayesian approach and MDL are explored theoretically; thereafter, a few illustrations show how MDL can give useful computational simplifications.
Funding: The National Natural Science Foundation of China (No. 61105048, 60972165); the Doctoral Fund of Ministry of Education of China (No. 20110092120034); the Natural Science Foundation of Jiangsu Province (No. BK2010240); the Technology Foundation for Selected Overseas Chinese Scholar, Ministry of Human Resources and Social Security of China (No. 6722000008); the Open Fund of Jiangsu Province Key Laboratory for Remote Measuring and Control (No. YCCK201005)
Abstract: An improved Gaussian mixture model (GMM)-based clustering method is proposed for the difficult case where the true distribution of the data does not match the assumed GMM. First, an improved model selection criterion, the completed likelihood minimum message length criterion, is derived. It measures both the goodness-of-fit of the candidate GMM to the data and the goodness-of-partition of the data. Second, using the proposed criterion as the clustering objective function, an improved expectation-maximization (EM) algorithm is developed for estimating the model parameters; compared with the standard EM algorithm, it can avoid poor local optima. The experimental results demonstrate that the proposed method can rectify the over-fitting tendency of representative GMM-based clustering approaches and can robustly provide more accurate clustering results.
Abstract: When the stagnation temperature of a perfect gas increases, the specific heat ratio no longer remains constant and starts to vary with this temperature. The gas remains perfect and its state equation remains valid, but it is then referred to as a calorically imperfect gas, or a gas at high temperature. The goal of this work is to trace the profiles of the supersonic Minimum Length Nozzle with centered expansion when the stagnation temperature is taken into account, below the dissociation threshold of the molecules, and to obtain several nozzle shapes for each exit Mach number by varying this temperature. The method of characteristics is used with a new form of the Prandtl-Meyer function at high temperature. The resulting equations are solved with a second-order finite difference method using a predictor-corrector algorithm. A study of the error of the perfect gas model relative to the present model is given. The comparison with the calorically perfect gas model is made in order to determine the limit of application of that model. The application is for air.
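For reference, the calorically perfect baseline that this abstract compares against can be sketched as follows. The code uses the classical constant-gamma Prandtl-Meyer function and the textbook relation for a planar minimum length nozzle with centered expansion, where the wall angle just downstream of the throat is half the Prandtl-Meyer angle at the exit Mach number. It does not reproduce the high-temperature form introduced in the paper, and the function names and the default `gamma=1.4` (air at moderate temperature) are assumptions made for illustration.

```python
import math

def prandtl_meyer(mach, gamma=1.4):
    """Prandtl-Meyer angle in radians for a calorically perfect gas (mach >= 1)."""
    if mach < 1.0:
        raise ValueError("the Prandtl-Meyer function is defined for Mach >= 1")
    g = (gamma + 1.0) / (gamma - 1.0)
    m2 = mach * mach - 1.0
    return math.sqrt(g) * math.atan(math.sqrt(m2 / g)) - math.atan(math.sqrt(m2))

def mln_initial_wall_angle(exit_mach, gamma=1.4):
    """Maximum wall angle (radians) at the throat of a planar minimum length nozzle."""
    return 0.5 * prandtl_meyer(exit_mach, gamma)

if __name__ == "__main__":
    for m_exit in (1.5, 2.0, 3.0):
        nu = math.degrees(prandtl_meyer(m_exit))
        theta = math.degrees(mln_initial_wall_angle(m_exit))
        print(f"M_e = {m_exit:.1f}: nu = {nu:.2f} deg, wall angle = {theta:.2f} deg")
```

With a temperature-dependent specific heat ratio, the closed-form expression above no longer applies, which is why the abstract refers to a new form of the function evaluated numerically.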
Abstract: [Objective] This paper aimed to provide a new method for genetic data clustering by analyzing the clustering performance of a genetic data clustering algorithm based on the minimum coding length. [Method] Genetic data clustering was treated as high-dimensional mixed data clustering. After preprocessing, the dimensionality of the genetic data was reduced by principal component analysis, at which point the data presented a Gaussian-like distribution. Data with this distribution could be clustered effectively through lossy data compression, which clustered the genes using a simple clustering algorithm. The algorithm achieved its best clustering result when the length of the code needed to encode the clustered genes reached its minimum. This algorithm and traditional clustering algorithms were applied to genetic data of yeast and Arabidopsis, and the effectiveness of the algorithm was verified through internal evaluation and functional evaluation of the gene clusters. [Result] The clustering performance of the new algorithm was superior to that of traditional clustering algorithms, and it also avoided the problems of subjective determination of clustering data and sensitivity to the initial cluster centers. [Conclusion] This study provides a new clustering method for genetic data.
Funding: Supported by the Center of Excellency Program of the Korea Science and Engineering Foundation (KOSEF) and Ministry of Science and Technology (MOST) (No. R11-2000-086-0000-0) through the Center for Advanced Plasma Surface Technology (CAPST) at Sungkyunkwan University
Abstract: In this study, numerical analysis is performed to examine the effect of the equivalence ratio on high velocity oxygen fuel (HVOF) thermal spray coating systems equipped with a minimum length nozzle. The analysis investigates the axisymmetric, steady-state, turbulent, and chemically combusting flow both within the torch and in the free jet region between the torch and the substrate to be coated. The combustion is modeled using a single-step, eddy-dissipation model, which assumes that the reaction rate is limited by the turbulent mixing rate of the fuel and oxidant. As the diameter of the nozzle throat is increased, the location of the Mach shock disc moves backward from the nozzle exit. When the throat diameter and the divergent portion are 6 mm and 8 mm, respectively, the pressure in the HVOF system is lowest at the chamber, and the expanding gas is steadily maintained at both high velocity and high temperature for different equivalence ratios. Thus, relatively minor changes to the equivalence ratio and the geometry of the HVOF system can lead to improved control over coating characteristics.