We study the steady-state properties of a genotype selection model in the presence of correlated Gaussian white noise and discuss the effect of the noise on the model. It is found that correlated noise can break the balance of gene selection and induce a phase transition, which makes it possible to select one type of gene haploid from a gene group.
This paper numerically investigates a genotype selection model subjected to both a multiplicative coloured noise and an additive coloured noise with different correlation times τ1 and τ2. By directly simulating the Langevin equation, the following results are obtained. (1) The multiplicative coloured noise dominates; however, the effect of the additive coloured noise cannot be neglected in the practical gene selection process. The selection rate μ decides whether the selection favours the gene A haploid or the gene B haploid. (2) The additive coloured noise intensity α and the correlation time τ2 play opposite roles. Neither α nor τ2 can split the single peak, but α can make the peak disappear while τ2 can make the peak sharper. (3) The multiplicative coloured noise intensity D and the correlation time τ1 can induce a phase transition; they also play opposite roles, and a reentrance phenomenon appears. In this case, it is easy to select one type of haploid from the group by increasing D and decreasing τ1.
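To make "directly simulating the Langevin equation" concrete, here is a minimal sketch, not the authors' code. The abstract does not state the equation, so the form dx/dt = 1/2 − x + μx(1−x) + x(1−x)ξ(t) + η(t), the Ornstein-Uhlenbeck representation of the coloured noises, the Euler step, the clipping of x to [0, 1], and the function name `simulate_genotype` are all assumptions made for illustration.

```python
import math
import random


def simulate_genotype(mu, D, tau1, alpha, tau2,
                      dt=1e-3, n_steps=20000, x0=0.5, seed=0):
    """Euler integration of an ASSUMED genotype selection Langevin equation.

    xi  : multiplicative coloured noise (intensity D, correlation time tau1)
    eta : additive coloured noise (intensity alpha, correlation time tau2)
    Both are modelled as Ornstein-Uhlenbeck processes (an assumption).
    """
    rng = random.Random(seed)
    x, xi, eta = x0, 0.0, 0.0
    path = [x]
    for _ in range(n_steps):
        # Ornstein-Uhlenbeck updates: relaxation plus white-noise kick.
        xi += -xi / tau1 * dt + math.sqrt(2.0 * D * dt) / tau1 * rng.gauss(0.0, 1.0)
        eta += -eta / tau2 * dt + math.sqrt(2.0 * alpha * dt) / tau2 * rng.gauss(0.0, 1.0)
        # Assumed deterministic drift of the gene frequency x.
        drift = 0.5 - x + mu * x * (1.0 - x)
        x += (drift + x * (1.0 - x) * xi + eta) * dt
        # Keep the gene frequency inside its physical range [0, 1].
        x = min(1.0, max(0.0, x))
        path.append(x)
    return path
```

A histogram of many such paths after a transient would approximate the stationary distribution whose peak structure the abstract discusses.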
In 1994, Grove, Kocic, Ladas, and Levin conjectured that the local stability and global stability conditions of the fixed point ȳ = 1/2 in the genotype selection model should be equivalent. In this article, we give an affirmative answer to this conjecture and prove that local stability implies global stability. Some illustrative examples are included to demonstrate the validity and applicability of the results.
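The fixed point can be explored numerically with a short sketch. The abstract does not write out the model, so the delay difference equation used here, y(n+1) = y(n)·e^{β(1−2y(n−k))} / (1 − y(n) + y(n)·e^{β(1−2y(n−k))}), is an assumed form of the genotype selection model, and the names `step` and `simulate` are hypothetical.

```python
import math


def step(y_now, y_delayed, beta):
    """One iteration of the ASSUMED delay genotype selection map."""
    g = math.exp(beta * (1.0 - 2.0 * y_delayed))
    return y_now * g / (1.0 - y_now + y_now * g)


def simulate(beta, k, y0, n_steps):
    """Iterate the map with delay k, starting from a constant history y0."""
    history = [y0] * (k + 1)
    for _ in range(n_steps):
        # history[-1] is y(n); history[-1 - k] is y(n - k).
        history.append(step(history[-1], history[-1 - k], beta))
    return history
```

Under this assumed form, y = 1/2 maps to itself for any β, and trajectories started elsewhere in (0, 1) can be inspected to see whether they approach 1/2 for a given β and k.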
Selecting the optimal scheme from a set of similar schemes is a paramount task in equipment design. Considering the similarity of the schemes and the repetition of characteristic indices, the theory of set pair analysis (SPA) is applied and an optimal selection model is established. To improve accuracy and flexibility, the model is modified by the contribution degree. Finally, the model is validated by an example, and the result demonstrates that the method is feasible and valuable for practical use.
This article attempts to construct a multi-factor quantitative stock selection model, to analyze in detail the financial indicators and transaction data of listed companies via big data statistical tests, and to find the alpha (excess return relative to the market) when short stock index futures are used as a hedge in the Chinese market.
A group of agents cooperate closely to set the assessment indices, establish the weight of each index in the overall evaluation result, and collect the experts' scores given to each available resource; the manufacturing resource with the highest overall assessment value is taken as the optimal choice. The architecture of the proposed system is outlined, and an example is offered to show the assessment process.
Traditional methods for selecting models in experimental data analysis are susceptible to researcher bias, hindering exploration of alternative explanations and potentially leading to overfitting. The Finite Information Quantity (FIQ) approach offers a novel solution by acknowledging the inherent limitations in information processing capacity of physical systems. This framework facilitates the development of objective criteria for model selection (comparative uncertainty) and paves the way for a more comprehensive understanding of phenomena through exploring diverse explanations. This work presents a detailed comparison of the FIQ approach with ten established model selection methods, highlighting the advantages and limitations of each. We demonstrate the potential of FIQ to enhance the objectivity and robustness of scientific inquiry through three practical examples: selecting appropriate models for measuring fundamental constants, sound velocity, and underwater electrical discharges. Further research is warranted to explore the full applicability of FIQ across various scientific disciplines.
Soybean frogeye leaf spot (FLS) is a global disease affecting soybean yield, especially in the soybean growing area of Heilongjiang Province. To enable genomic selection breeding for FLS resistance in soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established for 40 002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. The model established in this study can be used directly to evaluate soybean FLS resistance and to select excellent offspring. The method may also provide ideas for disease-resistance breeding in other plants.
The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design in a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, comprising a model selection accuracy index based on simulation data, goodness-of-fit indices based on the optimally selected model, and an evaluation index based on how well the selected model supports third-party applications. The three-level evaluation system describes the selection performance of the radar clutter model more comprehensively and accurately from different perspectives, and can be extended to the evaluation of other, similar characterization model selection problems.
In a competitive digital age where data volumes keep increasing, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms because of its high computational cost and storage requirements. The publicly available 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), used as the data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among those available in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. Its higher score of HMM (m, r) = 0.47 indicates the model that satisfies almost all of the user's requirements overall, unlike the classical metrics, which use a criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model parameters. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take domain experts into account, as the understanding of the business domain has a significant impact.
Distributed and customized 3D printing can be realized by 3D printing services in a cloud manufacturing environment. As a growing number of 3D printers become accessible on various 3D printing service platforms, concern arises over how novices, as well as users with 3D printing experience, can validate virtual product designs and their manufacturing procedures before physical products are produced through the cloud platform. This paper presents a 3D model that helps users validate their designs and requirements not only in terms of traditional digital 3D model properties such as shape and size, but also in terms of the physical material properties and manufacturing properties of the produced parts, such as surface roughness, print accuracy, and part cost. These properties are closely related to the 3D printing process and materials. To establish the 3D model, the paper analyzes the model of 3D printing process selection in the cloud platform. Triangular intuitionistic fuzzy numbers are applied to generate a set of 3D printers with the same process and material. Based on the 3D printing process selection model, users can establish the 3D model and validate their designs and requirements on physical material properties and manufacturing properties before printing physical products.
This paper summarizes the configuration and performance of large gas turbines, and of the combined cycle power plants built from them, designed and produced by four large, renowned gas turbine manufacturers in the world, providing a reference for the relevant sectors and enterprises in importing advanced gas turbines and technologies.
The performance of six statistical approaches that can be used to select the best model for describing the growth of individual fish was analyzed using simulated and real length-at-age data. The six approaches are the coefficient of determination (R²), the adjusted coefficient of determination (adj.-R²), root mean squared error (RMSE), Akaike's information criterion (AIC), the bias-corrected AIC (AICc), and the Bayesian information criterion (BIC). The simulation data were generated by five growth models with different numbers of parameters. Four sets of real data were taken from the literature. The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of an additive error structure for the data. The model best supported by the data was identified using each of the six approaches. The results show that R² and RMSE have the same properties and perform worst. The sample size affects the performance of adj.-R², AIC, AICc, and BIC. Adj.-R² does better in small samples than in large samples. AIC is not suitable for small samples and tends to select a more complex model when the sample size becomes large. AICc and BIC perform best in the small- and large-sample cases, respectively. Use of AICc or BIC, chosen according to the size of the length-at-age data set, is recommended for selecting a fish growth model.
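As an illustration of the three information criteria named above, the following sketch computes them from a least-squares fit. It assumes independent Gaussian additive errors, so that the maximized log-likelihood depends only on the residual sum of squares; the function name `information_criteria` and the convention that k counts every estimated parameter, including the error variance, are choices made here, not taken from the paper.

```python
import math


def information_criteria(rss, n, k):
    """Return (AIC, AICc, BIC) for a fit with Gaussian additive errors.

    rss : residual sum of squares of the fitted growth model
    n   : number of length-at-age observations (requires n > k + 1)
    k   : number of estimated parameters, including the error variance
    """
    # Maximized Gaussian log-likelihood with variance estimate rss / n.
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)
    aic = 2.0 * k - 2.0 * log_lik
    # Small-sample bias correction of AIC.
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) - 2.0 * log_lik
    return aic, aicc, bic
```

For each criterion, the candidate growth model with the smallest value is preferred. The AICc correction term shrinks as n grows, while the BIC penalty per parameter grows with ln(n), which is the mechanism behind the sample-size behaviour the abstract reports.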
The traditional model selection criteria try to strike a balance between fitting error and model complexity. They require assumptions on the distribution of the response or the noise, which may be misspecified. In this article, we give a new model selection criterion that needs fewer assumptions: based on the assumption that the noise term in the model is independent of the explanatory variables, it minimizes the association strength between regression residuals and the response. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third controlling random variable. In addition, a definition of the general partial relationship is given.
Covariance functions have been proposed as an alternative for modeling longitudinal data in animal breeding because of their various merits in comparison with classical analytical methods. In practical estimation, the models and polynomial orders fitted can influence the estimates of covariance functions and thus of genetic parameters. The objective of this study was to select a model for estimating covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. The model and orders of fit were selected by the likelihood ratio test and 4 types of information criteria. The results showed that a model with 6 orders of polynomial fit for the direct additive genetic and animal permanent environmental effects, and 4 and 5 orders for the maternal genetic and maternal permanent environmental effects, respectively, was preferable for estimating covariance functions. Whether maternal effects were included in the model greatly influenced the estimates of covariance functions. The maternal permanent environmental effect does not explain the variation of all permanent environments well, suggesting that different sources of permanent environmental effects also have a large influence on covariance function estimates.
This study addresses the challenges of big data visualization by applying data reduction based on feature selection, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. An embedded method, "Select from model" (SFM) using the random forest importance (RFI) algorithm, is compared with a filter method, "Select percentile" (SP) based on the chi-square (Chi2) statistic, for selecting the most important features. The selected features are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is compared with that of KNN in Python on eight data sets to see which method performs best when feature selection is applied. The study concludes that feature selection has a significant impact on the analysis and visualization of the data once repetitive data and data that do not affect the goal are removed. After several comparisons, the study suggests SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for classification. The proposal proved its efficacy in comparison with results from recent literature.
It is quite common in statistical modeling to select a model and make inference as if the model had been known in advance, i.e. ignoring model selection uncertainty. The resulting estimator is called a post-model selection estimator (PMSE), and its properties are hard to derive. Conditioning on the data at hand (as is usually the case), Bayesian model selection is free of this phenomenon. This paper is concerned with the properties of the Bayesian estimator obtained after model selection when the frequentist (long-run) performance of the resulting Bayesian estimator is of interest. The proposed method, which uses Bayesian decision theory, is based on the well-known machinery of Bayesian model averaging (BMA) and outperforms PMSE and BMA. It is shown that if the unconditional model selection probability is equal to the model prior, then the proposed approach reduces to BMA. The method is illustrated using Bernoulli trials.
This paper proposes a new search strategy using the mutative scale chaos optimization algorithm (MSCO) for model selection of support vector machines (SVM). It searches the parameter space of the SVM very efficiently and finds the optimum parameter setting for a practical classification problem at very low time cost. To demonstrate the performance of the proposed method, it is applied to model selection of SVM in ultrasonic flaw classification and compared with grid search. Experimental results show that MSCO is a very powerful tool for model selection of SVM and outperforms grid search in search speed and precision in ultrasonic flaw classification.
To solve the medium- and long-term power load forecasting problem, the combination forecasting method is further expanded and a weighted combination forecasting model for power load is put forward. The model is divided into two stages: forecasting model selection and weighted combination forecasting. The forecasting model selection is implemented based on Markov chain conversion and the cloud model, and several outstanding models are selected for the combination forecasting. For the weighted combination forecasting, a fuzzy scale joint evaluation method is proposed to determine the weight of each selected forecasting model. The percentage error and mean absolute percentage error of the weighted combination forecasting result for the power consumption in a certain area of China are 0.7439% and 0.3198%, respectively, while the maximum values of these two indexes for the single forecasting models are 5.2278% and 1.9497%. This shows that the forecasting indexes of the proposed model are improved significantly compared with the single forecasting models.
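The second stage and its error index can be sketched as follows. This is only an illustration of a weighted combination and of the mean absolute percentage error; the weights are supplied by hand and are hypothetical, since the paper's fuzzy scale joint evaluation method for deriving them is not reproduced here, and the function names `mape` and `combine` are assumptions.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def combine(forecasts, weights):
    """Weighted combination of several forecast series of equal length.

    forecasts : list of forecast series, one per selected model
    weights   : one non-negative weight per model (normalized internally)
    """
    total = sum(weights)
    horizon = len(forecasts[0])
    return [
        sum(w * series[t] for w, series in zip(weights, forecasts)) / total
        for t in range(horizon)
    ]
```

For example, `mape(actual, combine([f1, f2], [0.6, 0.4]))` can be compared with `mape(actual, f1)` and `mape(actual, f2)` to check whether the combination improves on each single model, which is the kind of comparison the abstract reports.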
Regional climate change impact assessments are becoming increasingly important for developing adaptation strategies in an uncertain future with respect to hydro-climatic extremes. A number of Global Climate Models (GCMs) and emission scenarios provide predictions of future changes in climate. As a result, there is a level of uncertainty associated with the decision of which climate models to use for the assessment of climate change impacts. The IPCC has recommended using as many global climate model scenarios as possible; however, this approach may be impractical for regional assessments that are computationally demanding. Methods have been developed to select climate model scenarios, generally consisting of selecting a model with the highest skill (validation), creating an ensemble, or selecting one or more extremes. Validation methods limit analyses to models with higher skill in simulating historical climate, ensemble methods typically take multi-model means, medians, or percentiles, and extremes methods tend to use scenarios that bound the projected changes in precipitation and temperature. In this paper, a quantile-regression-based validation method is developed and applied to generate a reduced set of GCM scenarios for analyzing daily maximum streamflow uncertainty in the Upper Thames River Basin, Canada, while extremes and percentile ensemble approaches are also used for comparison. Results indicate that the validation method was able to effectively rank and reduce the set of scenarios, while the extremes and percentile ensemble methods were found not to necessarily correlate well with the range of extreme flows for all calendar months and return periods.
Funding: Project supported by the Natural Science Foundation of Yunnan Province of China (Grant No 2006A0002M) and the Science Foundation of Baoji University of Science and Arts of China (Grant No Zk0697).
Funding: The Deanship of Scientific Research at King Saud University and the Centre of Research in the Faculty of Science are acknowledged for their encouragement and support.
Funding: Supported by the National Natural Science Foundation of China (11961005) and the Guangdong Province General University Characteristic Innovation Project (2018KTSCX253).
Funding: Supported by the Foundation of the Key Lab of Digital Manufacturing of Hubei Province (SZ0608).
Funding: Supported by the National Key Research and Development Program of China (2021YFD1201103-01-05).
Funding: Supported by the National Natural Science Foundation of China (61871384, 61921001).
Funding: Supported by the National High-Tech Research and Development Plan of China under Grant No. 2015AA042101 and the Fund of the State Key Laboratory of Intelligent Manufacturing System Technology in China.
Abstract: Distributed and customized 3D printing can be realized through 3D printing services in a cloud manufacturing environment. As a growing number of 3D printers become accessible on various 3D printing service platforms, concern is rising over how novices, as well as users with 3D printing experience, can validate virtual product designs and their manufacturing procedures before physical products are produced through the cloud platform. This paper presents a 3D model that helps users validate their designs and requirements not only in terms of traditional digital 3D model properties such as shape and size, but also in terms of the physical material properties and manufacturing properties that arise when producing physical products, such as surface roughness, print accuracy, and part cost. These properties are closely related to the 3D printing process and materials. To establish the 3D model, the paper analyzes the model of 3D printing process selection in the cloud platform. Triangular intuitionistic fuzzy numbers are applied to generate a set of 3D printers with the same process and material. Based on the 3D printing process selection model, users can establish the 3D model and validate their designs and requirements on physical material properties and manufacturing properties before printing physical products.
Abstract: This paper briefly describes the configuration and performance of large gas turbines, and of the combined cycle power plants built around them, designed and produced by four large, renowned gas turbine manufacturers worldwide. It provides a reference for the relevant sectors and enterprises importing advanced gas turbines and technologies.
Funding: Supported by the High Technology Research and Development Program of China (863 Program, No. 2006AA100301).
Abstract: The performance of six statistical approaches that can be used to select the best model for describing the growth of individual fish was analyzed using simulated and real length-at-age data. The six approaches are the coefficient of determination (R²), the adjusted coefficient of determination (adj.-R²), root mean squared error (RMSE), Akaike's information criterion (AIC), the bias-corrected AIC (AICc), and the Bayesian information criterion (BIC). The simulation data were generated by five growth models with different numbers of parameters. Four sets of real data were taken from the literature. The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of an additive error structure for the data. The model best supported by the data was identified using each of the six approaches. The results show that R² and RMSE have the same properties and perform worst. The sample size affects the performance of adj.-R², AIC, AICc, and BIC. Adj.-R² does better in small samples than in large samples. AIC is not suitable for small samples and tends to select a more complex model as the sample size becomes large. AICc and BIC perform best in small and large samples, respectively. Use of AICc or BIC is recommended for selecting a fish growth model, depending on the size of the length-at-age data set.
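The three likelihood-based criteria named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes an additive Gaussian error, so that the maximized log-likelihood can be written in terms of the residual sum of squares, and it counts the error variance as one extra estimated parameter. The function name `information_criteria` and the example numbers are hypothetical.

```python
import math


def information_criteria(rss, n, n_params):
    """Return AIC, AICc and BIC for a least-squares fit.

    Assumes additive Gaussian errors, so the maximized log-likelihood is
    -n/2 * (ln(2*pi) + ln(rss/n) + 1). `n_params` counts the growth-model
    parameters; the error variance is added as one more parameter.
    """
    k = n_params + 1  # growth parameters plus the error variance
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    aic = -2.0 * log_lik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)  # small-sample correction
    bic = -2.0 * log_lik + k * math.log(n)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical comparison: a 3-parameter and a 4-parameter growth model
# fitted to the same 20 length-at-age observations.
simple_fit = information_criteria(rss=10.0, n=20, n_params=3)
complex_fit = information_criteria(rss=9.5, n=20, n_params=4)
for name in ("AIC", "AICc", "BIC"):
    better = "simple" if simple_fit[name] < complex_fit[name] else "complex"
    print(name, "prefers the", better, "model")
```

The AICc correction term grows as n approaches k + 1 and vanishes for large n, which is consistent with the abstract's recommendation to use AICc for small samples.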
Funding: Partly supported by the National Basic Research Program of China (973 Program, 2011CB707802, 2013CB910200) and the National Science Foundation of China (11201466).
Abstract: Traditional model selection criteria try to strike a balance between fitting error and model complexity. They require assumptions on the distribution of the response or the noise, which may be misspecified, to be made before they are used. In this article, we give a new model selection criterion, based on the assumption that the noise term in the model is independent of the explanatory variables, that minimizes the association strength between regression residuals and the response and requires fewer assumptions. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third, controlling random variable. In addition, the definition of a general partial relationship is given.
Funding: Funded by the Young Academic Leaders Supporting Project in Institutions of Higher Education of Shanxi Province, China.
Abstract: Covariance functions have been proposed as an alternative for modeling longitudinal data in animal breeding because of their various merits compared with the classical analytical methods. In practical estimation, the models and polynomial orders fitted can influence the estimates of covariance functions and thus of genetic parameters. The objective of this study was to select a model for estimating covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects involved were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. Selection of the model and orders of fit was carried out by likelihood ratio test and 4 types of information criteria. The results showed that a model with polynomial fits of order 6 for direct additive genetic and animal permanent environmental effects, and of orders 4 and 5 for maternal genetic and maternal permanent environmental effects, respectively, was preferable for estimating covariance functions. Whether or not maternal effects were included in the model greatly influenced the estimates of covariance functions. The maternal permanent environmental effect does not explain the variation of all permanent environments well, suggesting that different sources of permanent environmental effects also have a large influence on covariance function estimates.
Abstract: This study addresses the challenges of big data visualization by using data reduction based on feature selection methods, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We apply the embedded "Select from model (SFM)" method with the "Random forest Importance algorithm (RFI)" and compare it with the filter "Select percentile (SP)" method based on the chi-square ("Chi2") tool for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of the KNN approach in Python on eight data sets to see which method produces the best rating when feature selection methods are applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR, that is, SFM based on the RFI algorithm for feature selection combined with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
Abstract: It is quite common in statistical modeling to select a model and then make inference as if the model had been known in advance, i.e. ignoring model selection uncertainty. The resulting estimator is called the post-model selection estimator (PMSE), and its properties are hard to derive. Because it conditions on the data at hand (as is usually the case), Bayesian model selection is free of this phenomenon. This paper is concerned with the properties of the Bayesian estimator obtained after model selection when the frequentist (long-run) performance of the resulting Bayesian estimator is of interest. The proposed method, which uses Bayesian decision theory, is based on the well-known machinery of Bayesian model averaging (BMA), and it outperforms PMSE and BMA. It is shown that if the unconditional model selection probability is equal to the model prior, then the proposed approach reduces to BMA. The method is illustrated using Bernoulli trials.
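To make the contrast between PMSE and BMA in this abstract concrete, here is a minimal sketch for Bernoulli trials. It shows standard BMA and a simple post-model selection estimator only; it does not implement the paper's proposed method. The two candidate models (a fixed success probability `p0` versus a uniform prior on the success probability), the function name `bma_bernoulli`, and the example data are all assumptions chosen for illustration.

```python
import math


def bma_bernoulli(successes, trials, prior_m0=0.5, p0=0.5):
    """Compare two models for Bernoulli data and average over them.

    M0: success probability fixed at p0 (hypothetical point model).
    M1: success probability has a uniform Beta(1, 1) prior.
    Returns posterior model weights, the BMA estimate of the success
    probability, and the post-model selection estimate (PMSE).
    """
    failures = trials - successes
    # Marginal likelihood of the observed sequence under each model.
    ml0 = p0 ** successes * (1.0 - p0) ** failures
    # Under M1 the integral is the Beta function B(s + 1, f + 1).
    ml1 = math.exp(
        math.lgamma(successes + 1)
        + math.lgamma(failures + 1)
        - math.lgamma(trials + 2)
    )
    evidence = prior_m0 * ml0 + (1.0 - prior_m0) * ml1
    w0 = prior_m0 * ml0 / evidence
    w1 = 1.0 - w0
    est0 = p0                                # estimate under M0
    est1 = (successes + 1) / (trials + 2)    # posterior mean under M1
    bma = w0 * est0 + w1 * est1              # average over both models
    pmse = est0 if w0 >= w1 else est1        # keep only the selected model
    return {"w0": w0, "w1": w1, "bma": bma, "pmse": pmse}


# Hypothetical data: 8 successes in 10 trials.
result = bma_bernoulli(successes=8, trials=10)
print(result)
```

The PMSE discards the losing model entirely, whereas BMA keeps both in proportion to their posterior weights; that difference is the model selection uncertainty the abstract refers to.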
Funding: Project supported by the National High-Technology Research and Development Program of China (Grant No. 863-2001AA602021).
Abstract: This paper proposes a new search strategy that uses a mutative scale chaos optimization algorithm (MSCO) for model selection of support vector machines (SVM). It searches the parameter space of the SVM with very high efficiency and finds the optimum parameter setting for a practical classification problem at very low time cost. To demonstrate the performance of the proposed method, it is applied to model selection of SVM in ultrasonic flaw classification and compared with grid search. Experimental results show that MSCO is a very powerful tool for model selection of SVM and outperforms grid search in search speed and precision in ultrasonic flaw classification.
Abstract: To solve the medium- and long-term power load forecasting problem, the combination forecasting method is further expanded and a weighted combination forecasting model for power load is put forward. The model is divided into two stages: forecasting model selection and weighted combination forecasting. Forecasting model selection is implemented based on Markov chain conversion and the cloud model, and several outstanding models are selected for the combination forecast. For the weighted combination forecast, a fuzzy scale joint evaluation method is proposed to determine the weight of each selected forecasting model. The percentage error and mean absolute percentage error of the weighted combination forecasting result for the power consumption in a certain area of China are 0.7439% and 0.3198%, respectively, while the maximum values of these two indexes for the single forecasting models are 5.2278% and 1.9497%. This shows that the forecasting indexes of the proposed model are significantly improved compared with the single forecasting models.
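The second stage described in this abstract, a weighted combination of selected forecasts scored by percentage error and mean absolute percentage error (MAPE), can be sketched as below. The weights here are supplied by hand; the paper's fuzzy scale joint evaluation method for deriving them is not reproduced, and all function names and numbers are hypothetical.

```python
def combine_forecasts(forecasts, weights):
    """Return the weighted combination of several forecast series.

    `forecasts` is a list of equally long series, one per selected model.
    `weights` are non-negative and are normalized to sum to one.
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    horizon = len(forecasts[0])
    return [
        sum(w * series[t] for w, series in zip(normalized, forecasts))
        for t in range(horizon)
    ]


def percentage_errors(actual, predicted):
    """Absolute percentage error for each period."""
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error over all periods."""
    errors = percentage_errors(actual, predicted)
    return sum(errors) / len(errors)


# Hypothetical yearly consumption and two single-model forecasts.
actual = [100.0, 200.0]
model_a = [110.0, 190.0]
model_b = [90.0, 210.0]
combined = combine_forecasts([model_a, model_b], weights=[1.0, 1.0])
print("MAPE of model A:", mape(actual, model_a))
print("MAPE of combination:", mape(actual, combined))
```

In this toy case the two single-model errors have opposite signs, so the combination cancels them; real forecasts would rarely cancel this cleanly, which is why the choice of weights matters.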
Abstract: Regional climate change impact assessments are becoming increasingly important for developing adaptation strategies in an uncertain future with respect to hydro-climatic extremes. There are a number of Global Climate Models (GCMs) and emission scenarios providing predictions of future changes in climate. As a result, there is a level of uncertainty associated with the decision of which climate models to use for the assessment of climate change impacts. The IPCC has recommended using as many global climate model scenarios as possible; however, this approach may be impractical for regional assessments that are computationally demanding. Methods have been developed to select climate model scenarios, generally consisting of selecting a model with the highest skill (validation), creating an ensemble, or selecting one or more extremes. Validation methods limit analyses to models with higher skill in simulating historical climate, ensemble methods typically take multi-model means, medians, or percentiles, and extremes methods tend to use scenarios that bound the projected changes in precipitation and temperature. In this paper, a quantile-regression-based validation method is developed and applied to generate a reduced set of GCM scenarios for analyzing daily maximum streamflow uncertainty in the Upper Thames River Basin, Canada, while extremes and percentile ensemble approaches are also used for comparison. Results indicate that the validation method was able to effectively rank and reduce the set of scenarios, while the extremes and percentile ensemble methods were found not to necessarily correlate well with the range of extreme flows for all calendar months and return periods.