Widely used deep neural networks currently face limitations in achieving optimal performance for purchase intention prediction due to constraints on data volume and hyperparameter selection. To address this issue, this paper builds on the deep forest algorithm, further integrates evolutionary ensemble learning methods, and proposes a novel Deep Adaptive Evolutionary Ensemble (DAEE) model. This model introduces model diversity into the cascade layer, allowing it to adaptively adjust its structure to accommodate complex and evolving purchasing behavior patterns. Moreover, this paper optimizes the methods of obtaining feature vectors, enhancement vectors, and prediction results within the deep forest algorithm to enhance the model's predictive accuracy. Results demonstrate that the improved deep forest model not only possesses higher robustness but also shows an increase of 5.02% in AUC value compared to the baseline model. Furthermore, it trains 6 times faster than deep models, and its accuracy is 0.9% higher than that of other improved models.
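As a rough illustration of the cascade idea behind deep-forest-style models, the sketch below (scikit-learn; the forest types, sizes, layer count and dataset are illustrative, not the DAEE configuration) appends each layer's class-probability vectors to the input features of the next layer. The real algorithm generates the augmented vectors with cross-validation to avoid overfitting; this sketch skips that for brevity.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

def cascade_predict(X_train, y_train, X_test, n_layers=2):
    """Deep-forest-style cascade: each layer's forests emit class-probability
    vectors, which are concatenated to the original features for the next layer."""
    feats_train, feats_test = X_train, X_test
    proba_test = None
    for _ in range(n_layers):
        layer_train, layer_test = [], []
        for Forest in (RandomForestClassifier, ExtraTreesClassifier):
            clf = Forest(n_estimators=50, random_state=0).fit(feats_train, y_train)
            layer_train.append(clf.predict_proba(feats_train))
            layer_test.append(clf.predict_proba(feats_test))
        proba_test = np.mean(layer_test, axis=0)   # average the layer's forests
        feats_train = np.hstack([X_train] + layer_train)
        feats_test = np.hstack([X_test] + layer_test)
    return proba_test.argmax(axis=1)

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
acc = (cascade_predict(X_tr, y_tr, X_te) == y_te).mean()
```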
Soybean frogeye leaf spot (FLS) disease is a global disease affecting soybean yield, especially in the soybean growing area of Heilongjiang Province. In order to realize genomic selection breeding for FLS resistance of soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established for 40 002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. In this study, a model was established that can be used directly to evaluate the resistance of soybean to FLS and to select excellent offspring. This research method could also provide ideas and methods for disease-resistance breeding in other plants.
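A minimal sketch of the LASSO stage on simulated genotype data (the sizes, effect markers, effect sizes and penalty below are invented for illustration; the study applies this to 40 002 real SNPs and then refines the candidates with stepwise regression):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_lines, n_snps = 200, 500                 # hypothetical panel, far smaller than the paper's
X = rng.choice([0.0, 1.0, 2.0], size=(n_lines, n_snps))  # SNP genotypes coded 0/1/2
# Simulated trait: relative lesion area driven by three causal markers plus noise.
y = 1.5 * X[:, 3] - 1.2 * X[:, 40] + 0.8 * X[:, 7] + rng.normal(0.0, 0.3, n_lines)

lasso = Lasso(alpha=0.15).fit(X, y)
selected = np.flatnonzero(lasso.coef_)     # candidate markers for stepwise refinement
```

The L1 penalty drives most coefficients exactly to zero, so the surviving markers form a short candidate list for the subsequent stepwise pass.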
Government credibility is an important asset of contemporary national governance, an important criterion for evaluating government legitimacy, and a key factor in measuring the effectiveness of government governance. In recent years, research on government credibility has mostly focused on exploring theories and mechanisms, with little empirical work on this topic. This article applies variable selection models from the field of social statistics to the issue of government credibility, in order to study it empirically and explore its core influencing factors from a statistical perspective. Specifically, this article uses four regression-analysis-based methods and three random-forest-based methods to study the influencing factors of government credibility in various provinces in China, and compares the performance of these seven variable selection methods along different dimensions. The results show that the methods differ in simplicity, accuracy, and variable importance ranking, and assign different importance to the factors in the study of government credibility. This study provides a methodological reference for variable selection models in social science research, and also offers a multidimensional comparative perspective for analyzing the influencing factors of government credibility.
Traditional methods for selecting models in experimental data analysis are susceptible to researcher bias, hindering exploration of alternative explanations and potentially leading to overfitting. The Finite Information Quantity (FIQ) approach offers a novel solution by acknowledging the inherent limits on the information processing capacity of physical systems. This framework facilitates the development of objective criteria for model selection (comparative uncertainty) and paves the way for a more comprehensive understanding of phenomena through exploring diverse explanations. This work presents a detailed comparison of the FIQ approach with ten established model selection methods, highlighting the advantages and limitations of each. We demonstrate the potential of FIQ to enhance the objectivity and robustness of scientific inquiry through three practical examples: selecting appropriate models for measuring fundamental constants, sound velocity, and underwater electrical discharges. Further research is warranted to explore the full applicability of FIQ across various scientific disciplines.
The performance of six statistical approaches, which can be used for selection of the best model to describe the growth of individual fish, was analyzed using simulated and real length-at-age data. The six approaches include the coefficient of determination (R2), adjusted coefficient of determination (adj.-R2), root mean squared error (RMSE), Akaike's information criterion (AIC), bias-corrected AIC (AICc) and Bayesian information criterion (BIC). The simulated data were generated by five growth models with different numbers of parameters. Four sets of real data were taken from the literature. The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of an additive error structure for the data. The model best supported by the data was identified using each of the six approaches. The results show that R2 and RMSE have the same properties and perform worst. Sample size affects the performance of adj.-R2, AIC, AICc and BIC. Adj.-R2 does better in small samples than in large samples. AIC is not suitable for small samples and tends to select more complex models as the sample size grows. AICc and BIC perform best in the small- and large-sample cases, respectively. Use of AICc or BIC is therefore recommended for selecting a fish growth model, according to the size of the length-at-age data.
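The three information criteria are simple to compute from a model's maximized log-likelihood. A sketch with invented log-likelihood values (k should count all estimated parameters, including the error variance; the model names are the usual growth-curve candidates, not the paper's fitted results):

```python
from math import log

def aic(loglik, k):
    """Akaike's information criterion."""
    return 2 * k - 2 * loglik

def aicc(loglik, k, n):
    """Small-sample bias correction of AIC; requires n > k + 1."""
    return aic(loglik, k) + 2 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    """Bayesian information criterion: penalty grows with sample size."""
    return k * log(n) - 2 * loglik

# Two candidate growth models fitted to n = 20 length-at-age pairs
# (log-likelihoods and parameter counts are illustrative numbers).
n = 20
candidates = {"von Bertalanffy": (-31.2, 4), "Gompertz": (-30.8, 5)}
best = min(candidates, key=lambda m: aicc(*candidates[m], n))
```

With n this small, AICc's extra penalty outweighs the Gompertz model's slightly better fit, so the simpler curve is chosen.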
Peak ground acceleration (PGA) estimation is an important task in earthquake engineering practice. One of the most well-known models is the Boore-Joyner-Fumal formula, which estimates the PGA using the moment magnitude, the site-to-fault distance and the site foundation properties. In the present study, the complexity of this formula and the homogeneity assumption for the prediction-error variance are investigated, and an efficiency-robustness balanced formula is proposed. For this purpose, a reduced-order Monte Carlo simulation algorithm for Bayesian model class selection is presented to obtain the most suitable predictive formula and prediction-error model for the seismic attenuation relationship. In this approach, each model class (a predictive formula with a prediction-error model) is evaluated according to its plausibility given the data. The one with the highest plausibility is robust, since it possesses the optimal balance between data-fitting capability and sensitivity to noise. A database of strong ground motion records in the Tangshan region of China was obtained from the China Earthquake Data Center for the analysis, and the optimal predictive formula is proposed based on this database. It is shown that the proposed formula with heterogeneous prediction-error variance is much simpler than the attenuation model suggested by Boore, Joyner and Fumal (1993).
Rodents have been widely used in the production of cerebral ischemia models. However, therapies that proved successful in experimental rodent stroke models have often failed to be effective when tested clinically. Therefore, nonhuman primates have been recommended as the ideal alternatives, owing to their similarities with the human cerebrovascular system, brain metabolism, grey-to-white matter ratio and even their rich behavioral repertoire. The present review is a thorough summary of ten methods for establishing nonhuman primate models of focal cerebral ischemia: electrocoagulation, endothelin-1-induced occlusion, microvascular clip occlusion, autologous blood clot embolization, balloon inflation, microcatheter embolization, coil embolization, surgical suture embolization, suture, and photochemical induction. This review addresses the advantages, disadvantages and precautions of each method, and compares nonhuman primates with rodents, different species of nonhuman primates, and different modeling methods. Finally, it discusses the various factors that need to be considered when modeling and the methods of evaluation after modeling. These points are critical for understanding the respective strengths and weaknesses of the models and underlie the selection of the optimum one.
Evaluation of numerical earthquake forecasting models needs to consider two issues of equal importance: the application scenario of the simulation and the complexity of the model. The criteria underlying evaluation-based model selection raise several interesting problems in need of discussion.
Time-series-based forecasting is essential to determine how past events affect future events. This paper compares the prediction accuracy of different time-series models for oil prices. Three types of univariate models are discussed: the exponential smoothing (ES), Holt-Winters (HW) and autoregressive integrated moving average (ARIMA) models. To determine the best model, six different strategies were applied as selection criteria to quantify these models' prediction accuracies. This comparison should help policy makers and industry marketing strategists select the best forecasting method in the oil market. The three models were compared by applying them to the time series of regular oil prices for West Texas Intermediate (WTI) crude. The comparison indicated that the HW model performed better than the ES model for prediction at a 95% confidence interval. However, the ARIMA(2, 1, 2) model yielded the best results, leading us to conclude that this sophisticated and robust model outperformed the other simple yet flexible models in the oil market.
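The ES component can be sketched in a few lines of plain Python (the prices and the α grid below are illustrative, not the WTI series from the paper; HW adds trend and seasonal terms, and ARIMA adds autoregressive and moving-average terms, on top of this recursion):

```python
def ses_forecast(series, alpha):
    """Simple exponential smoothing: level_t = alpha*y_t + (1-alpha)*level_{t-1}.
    The one-step-ahead forecast is the final smoothed level."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

def rmse(series, alpha):
    """One-step-ahead in-sample RMSE, one of several possible selection criteria."""
    level, errs = series[0], []
    for y in series[1:]:
        errs.append(y - level)
        level = alpha * y + (1 - alpha) * level
    return (sum(e * e for e in errs) / len(errs)) ** 0.5

prices = [70.1, 71.3, 69.8, 72.0, 73.5, 72.8, 74.1]   # illustrative oil-price series
best_alpha = min((round(a * 0.1, 1) for a in range(1, 10)),
                 key=lambda a: rmse(prices, a))
```

On a trending series like this one, a large α (heavy weight on the latest observation) yields the smaller one-step error, which is one reason trend-aware models such as HW and ARIMA tend to do better on oil prices.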
We have compiled a sample of two subsets of AGN selected from their optical and X-ray data. The first subset was selected for very broad and/or peculiar optical emission line profiles, the second for a high X-ray flux. Here we discuss the properties of these galaxies and show that both subsets are very similar in the multiwavelength view. Furthermore, we discuss the differences between the two subsets and their implications for a Unified Model of AGN.
This study addresses the challenges of big data visualization by using data reduction methods based on feature selection, in order to reduce the volume of big data and minimize model training time (Tt) while maintaining data quality. We contribute by applying the embedded "Select from model (SFM)" method, using the "Random forest importance (RFI)" algorithm, and comparing it with the filter "Select percentile (SP)" method based on the chi-square ("Chi2") tool, to select the most important features. These features are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared to that of the KNN approach, in Python on eight data sets, to determine which method produces the best ratings when the feature selection methods are applied. The study concludes that the feature selection methods have a significant impact on the analysis and visualization of the data, after removing repetitive data and data that do not affect the goal. After several comparisons, the study proposes (SFMLR), which uses SFM based on the RFI algorithm for feature selection together with the LR algorithm for data classification. The proposal proved its efficacy by comparing its results with recent literature.
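A compact sketch of the SFM-plus-LR pipeline in scikit-learn (the synthetic dataset, forest size and default mean-importance threshold are placeholders, not the paper's eight datasets or tuned settings):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Embedded selection: keep features whose random-forest importance exceeds
# the mean importance (SelectFromModel's default threshold), then classify
# the reduced feature set with logistic regression.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)).fit(X_tr, y_tr)
lr = LogisticRegression(max_iter=1000).fit(selector.transform(X_tr), y_tr)
acc = lr.score(selector.transform(X_te), y_te)
n_kept = selector.get_support().sum()      # features surviving selection
```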
The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design against a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, comprising a model selection accuracy index based on simulation data, goodness-of-fit indexes based on the optimally selected model, and an evaluation index based on how well the selected model supports third-party tasks. The three-level evaluation system describes the selection performance of a radar clutter model more comprehensively and accurately in different ways, and can be generalized to the evaluation of other, similar characterization-model selection problems.
In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), the publicly available data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best among the available DM and ML models. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 indicates an overall significant model that encompasses almost all of the user's requirements, unlike classical metrics that use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model's parameters. The efficiency of the predictive model could be further improved with other feature selection algorithms (especially hybrid metrics) developed in consultation with experts of the knowledge domain, as understanding of the business domain has a significant impact.
An improved Gaussian mixture model (GMM) based clustering method is proposed for the difficult case where the true distribution of the data departs from the assumed GMM. First, an improved model selection criterion, the completed likelihood minimum message length criterion, is derived. It can measure both the goodness-of-fit of a candidate GMM to the data and the goodness-of-partition of the data. Secondly, by utilizing the proposed criterion as the clustering objective function, an improved expectation-maximization (EM) algorithm is developed, which can avoid the poor local optima that the standard EM algorithm may reach when estimating the model parameters. The experimental results demonstrate that the proposed method can rectify the over-fitting tendency of representative GMM-based clustering approaches and can robustly provide more accurate clustering results.
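The EM recursion at the heart of GMM clustering can be sketched as follows: a plain two-component, one-dimensional version with extreme-value initialization (the paper's method replaces the plain likelihood objective with the completed likelihood minimum message length criterion, which is not reproduced here).

```python
import math
import random

def em_gmm_1d(data, iters=50):
    """EM for a two-component 1-D Gaussian mixture, initialized at the
    data extremes with unit variances and equal weights."""
    mu = [min(data), max(data)]
    sigma, w = [1.0, 1.0], [0.5, 0.5]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w[k] / (sigma[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                 for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = math.sqrt(max(var, 1e-6))   # floor avoids degenerate components
    return mu, sigma, w

random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(300)]
        + [random.gauss(6.0, 1.0) for _ in range(300)])
mu, sigma, w = em_gmm_1d(data)
```

With well-separated clusters this converges quickly; the over-fitting issues the paper targets arise when the components overlap or the true distribution is not Gaussian.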
Dynamic causal modeling of functional magnetic resonance imaging (fMRI) signals is employed to explore critical emotional neurocircuitry under sad stimuli. The intrinsic model of emotional loops is built on the basis of Papez's circuit and related prior knowledge, and three modulatory connection models are then established. In these models, the stimuli are placed at different points, representing different assumptions about where they affect the neural activity between brain regions and how that activity is modulated. The optimal model is then selected by Bayesian model comparison. Group analysis shows that patients' intrinsic and modulatory connections from the anterior cingulate cortex (ACC) to the right inferior frontal gyrus (rIFG) are significantly higher than those of the control group. The functional connection parameters of the model are then selected as classifier features. The classification accuracy from the support vector machine (SVM) classifier is 80.73%, which, to some extent, validates the effectiveness of the regional connectivity parameters for depression recognition and provides a new approach for the clinical diagnosis of depression.
This paper briefs the configuration and performance of the large gas turbines, and the combined cycle power plants composed of them, designed and produced by four large renowned gas turbine manufacturing firms in the world, providing a reference for the relevant sectors and enterprises when importing advanced gas turbines and technologies.
SVM (support vector machine) is a new artificial intelligence methodology derived from Vapnik's statistical learning theory, and it generalizes better than artificial neural networks. A C-support vector classifier based fault diagnostic model (CBFDM), which gives the three most probable fault causes, is constructed in this paper. Five-fold cross validation is chosen as the method of model selection for CBFDM. The simulated data are generated from the PW4000-94 engine influence coefficient matrix at cruise, and the results show that the diagnostic accuracy of CBFDM is over 93% even when the standard deviation of the noise is 3 times larger than normal. This model can also be used for other diagnostic problems.
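The five-fold cross validation used for model selection is itself straightforward; a plain-Python sketch with a toy majority-label classifier standing in for the C-SVM (the data, the striped split and the classifier are all illustrative):

```python
def k_fold_accuracy(X, y, fit, predict, k=5):
    """Plain k-fold cross validation: average held-out accuracy over k splits."""
    n = len(X)
    accs = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))          # striped split for simplicity
        X_tr = [x for i, x in enumerate(X) if i not in test_idx]
        y_tr = [t for i, t in enumerate(y) if i not in test_idx]
        model = fit(X_tr, y_tr)
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        accs.append(hits / len(test_idx))
    return sum(accs) / k

# Toy "classifier": always predict the majority label seen in training.
fit = lambda X, y: max(set(y), key=y.count)
predict = lambda model, x: model

X = list(range(10))
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
acc = k_fold_accuracy(X, y, fit, predict)
```

In model selection, this held-out accuracy is computed for each candidate configuration and the one scoring highest is retained.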
Traditional model selection criteria try to strike a balance between fitting error and model complexity, and assumptions on the distribution of the response or the noise, which may be misspecified, must be made before using them. In this article, we give a new model selection criterion with fewer assumptions, based on the premise that the noise term in the model is independent of the explanatory variables: minimize the strength of association between the regression residuals and the response. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third, controlling random variable. In addition, a definition of the general partial relationship is given.
One branch of structural health monitoring (SHM) utilizes dynamic response measurements to assess the structural integrity of civil infrastructure. In particular, modal frequency is a widely adopted indicator of structural damage, since its square is proportional to structural stiffness. However, it has been demonstrated in various SHM projects that this indicator is substantially affected by fluctuating environmental conditions. In order to provide reliable and consistent information on the health status of monitored structures, it is necessary to develop a method to filter out this interference. This study attempts to model and quantify the environmental influence on the modal frequencies of reinforced concrete buildings. Daily structural response measurements of a twenty-two-story reinforced concrete building were collected and analyzed over a one-year period. The Bayesian spectral density approach was utilized to identify the modal frequencies of this building, and it was clearly seen that temperature and humidity fluctuations induced notable variations. A mathematical model was developed to quantify the environmental effects, with model complexity taken into consideration. Based on a Timoshenko beam model, the full model class was constructed and reduced-order model class candidates were obtained. Then, the Bayesian model class selection approach was employed to select the one with the most suitable complexity. The proposed model successfully characterizes the environmental influence on the modal frequencies. Furthermore, the estimated uncertainty of the model parameters allows for assessment of the reliability of the prediction. This study not only improves understanding of the monitored structure, but also establishes a systematic approach for reliable health assessment of reinforced concrete buildings.
Comparative fishing experiments were carried out in 2010 using tube traps with five hole diameters (8, 15, 18, 20 and 22 mm) to establish the size selectivity of escape holes for white-spotted conger. Selectivity and split parameters of the SELECT model were calculated using the estimated-split and equal-split models. From likelihood ratio tests and AIC (Akaike's Information Criterion) values, the estimated-split model was selected as the best-fit model. The size selectivity of escape holes in the tube traps was expressed as a logistic curve, similar to mesh selectivity. The 50% selection length of white-spotted conger in the estimated-split model was 28.26, 33.35, 39.31 and 47.30 cm for escape-hole diameters of 15, 18, 20 and 22 mm, respectively. The optimum escape-hole size is discussed with respect to management of the white-spotted conger fishery. The results indicate that tube traps with escape holes 18 mm in diameter would benefit this fishery.
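The logistic selection curve and its 50% selection length take a standard form; the parameters below are invented for illustration, not the fitted values from the experiments (the identity a + b·L = 0 at L50 holds regardless of the parameter values):

```python
import math

def logistic_selectivity(length, a, b):
    """Probability that a fish of the given length is retained by the trap
    (i.e. does not escape through the hole), as a logistic function of length."""
    return 1.0 / (1.0 + math.exp(-(a + b * length)))

def l50(a, b):
    """Length at 50% retention: the root of a + b*L = 0."""
    return -a / b

# Hypothetical parameters for one escape-hole diameter.
a, b = -10.0, 0.3
length_50 = l50(a, b)   # about 33.3 cm with these illustrative values
```

Fitting a and b separately for each hole diameter and reading off L50 reproduces the kind of per-diameter selection lengths reported above.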
Funding (purchase-intention prediction study): Ningxia Key R&D Program (Key) Project (2023BDE02001); Ningxia Key R&D Program (Talent Introduction Special) Project (2022YCZX0013); North Minzu University 2022 school-level research platform "Digital Agriculture Empowering Ningxia Rural Revitalization Innovation Team" (2022PT_S10); Yinchuan City school-enterprise joint innovation project (2022XQZD009); and the "Innovation Team for Imaging and Intelligent Information Processing" of the National Ethnic Affairs Commission.
Funding (soybean FLS study): National Key Research and Development Program of China (2021YFD1201103-01-05).
Funding (fish growth model study): High Technology Research and Development Program of China (863 Program, No. 2006AA100301).
Funding: The Research Committee of the University of Macao under Research Grant No. MYRG081(Y1-L2)-FST13-YKV, and the Science and Technology Development Fund of the Macao SAR government under Grant No. 012/2013/A1
Abstract: Peak ground acceleration (PGA) estimation is an important task in earthquake engineering practice. One of the most well-known models is the Boore-Joyner-Fumal formula, which estimates the PGA using the moment magnitude, the site-to-fault distance and the site foundation properties. In the present study, the complexity of this formula and the homogeneity assumption for the prediction-error variance are investigated, and an efficiency-robustness balanced formula is proposed. For this purpose, a reduced-order Monte Carlo simulation algorithm for Bayesian model class selection is presented to obtain the most suitable predictive formula and prediction-error model for the seismic attenuation relationship. In this approach, each model class (a predictive formula with a prediction-error model) is evaluated according to its plausibility given the data. The one with the highest plausibility is robust since it possesses the optimal balance between data-fitting capability and sensitivity to noise. A database of strong ground motion records in the Tangshan region of China was obtained from the China Earthquake Data Center for the analysis. The optimal predictive formula is proposed based on this database. It is shown that the proposed formula with heterogeneous prediction-error variance is much simpler than the attenuation model suggested by Boore, Joyner and Fumal (1993).
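The plausibility ranking at the heart of Bayesian model class selection can be illustrated with a BIC-style asymptotic approximation to the log evidence — a simplified stand-in for the reduced-order Monte Carlo algorithm the paper actually uses; all names here are hypothetical:

```python
import math

def log_evidence_bic(max_log_lik, k, n):
    """BIC-style approximation to the log evidence of a model class:
    best data fit minus an Ockham penalty on the k free parameters."""
    return max_log_lik - 0.5 * k * math.log(n)

def posterior_probs(log_evidences):
    """Normalize log evidences into posterior model class probabilities,
    assuming equal prior plausibilities for all classes."""
    m = max(log_evidences)               # subtract max for numerical stability
    unnorm = [math.exp(le - m) for le in log_evidences]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

With equal fits, the class with fewer parameters receives the higher posterior probability, capturing the balance between data-fitting capability and sensitivity to noise described above.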
Funding: Supported by the National Natural Science Foundation of China (Nos. 81000852 and 81301677); the AHA Award (No. 17POST32530004); the Supporting Project of Science & Technology of Sichuan Province of China (No. 2012SZ0140); and the Research Foundation of Zhejiang Province of China (No. 201022896)
Abstract: Rodents have been widely used to produce models of cerebral ischemia. However, therapies proven successful in experimental rodent stroke models have often failed to be effective when tested clinically. Therefore, nonhuman primates have been recommended as the ideal alternative, owing to their similarities with the human cerebrovascular system, brain metabolism, grey-to-white matter ratio and even their rich behavioral repertoire. The present review is a thorough summary of ten methods for establishing nonhuman primate models of focal cerebral ischemia: electrocoagulation, endothelin-1-induced occlusion, microvascular clip occlusion, autologous blood clot embolization, balloon inflation, microcatheter embolization, coil embolization, surgical suture embolization, suture, and photochemical induction. This review addresses the advantages, disadvantages and precautions of each method, and compares nonhuman primates with rodents, different species of nonhuman primates, and different modeling methods. Finally, it discusses the factors that need to be considered when modelling and the methods of evaluation after modelling. These are critical for understanding the respective strengths and weaknesses of the models and underlie the selection of the optimum one.
Funding: Supported by the National Natural Science Foundation of China (NSFC, Grant No. U2039207)
Abstract: Evaluation of numerical earthquake forecasting models needs to consider two issues of equal importance: the application scenario of the simulation and the complexity of the model. Criteria for evaluation-based model selection face some interesting problems in need of discussion.
Abstract: Time-series-based forecasting is essential to determine how past events affect future events. This paper compares the prediction accuracy of different time-series models for oil prices. Three types of univariate models are discussed: the exponential smoothing (ES), Holt-Winters (HW) and autoregressive integrated moving average (ARIMA) models. To determine the best model, six different strategies were applied as selection criteria to quantify these models' prediction accuracies. This comparison should help policy makers and industry marketing strategists select the best forecasting method in the oil market. The three models were compared by applying them to the time series of regular oil prices for West Texas Intermediate (WTI) crude. The comparison indicated that the HW model performed better than the ES model for a prediction with a confidence interval of 95%. However, the ARIMA(2, 1, 2) model yielded the best results, leading us to conclude that this sophisticated and robust model outperformed other simple yet flexible models in the oil market.
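The two smoothing families compared above are easy to sketch in pure Python. Here are one-step-ahead forecasters for simple exponential smoothing and Holt's linear (trend) method — an illustration of the model families, not the authors' fitted configurations (smoothing constants are arbitrary):

```python
import math

def ses_forecasts(series, alpha=0.5):
    """One-step-ahead forecasts from simple exponential smoothing."""
    level = series[0]
    preds = []
    for y in series[1:]:
        preds.append(level)                      # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level  # update the level
    return preds

def holt_forecasts(series, alpha=0.5, beta=0.3):
    """One-step-ahead forecasts from Holt's linear (trend) method."""
    level, trend = series[0], series[1] - series[0]
    preds = []
    for y in series[1:]:
        preds.append(level + trend)
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return preds

def rmse(preds, actual):
    """Root mean squared error of the forecasts."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(preds, actual)) / len(preds))
```

On a trending series, Holt's method tracks the slope while simple exponential smoothing lags behind it, which is one reason trend-aware models tend to win on drifting series such as oil prices.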
Abstract: We have compiled a sample of two subsets of AGN selected from their optical and X-ray data. The first subset was selected for very broad and/or peculiar optical emission line profiles, the second for a high X-ray flux. Here we discuss properties of these galaxies and show that both subsets are very similar in the multi-wavelength view. Furthermore, we discuss differences between the two subsets and their implications for a Unified Model of AGN.
Abstract: This study focuses on meeting the challenges of big data visualization by using data reduction based on feature selection methods, in order to reduce the volume of big data and minimize model training time (Tt) while maintaining data quality. We addressed these challenges using the embedded "Select From Model" (SFM) method with the Random Forest Importance (RFI) algorithm, and compared it with the filter method "Select Percentile" (SP) based on the chi-square ("Chi2") test, to select the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is thus also compared to that of the KNN approach in Python on eight data sets, to see which method produces the best results when feature selection is applied. The study concluded that feature selection methods have a significant impact on the analysis and visualization of the data once repetitive data and data that do not affect the goal are removed. After several comparisons, the study proposes SFMLR, which uses SFM based on the RFI algorithm for feature selection together with the LR algorithm for classification. The proposal proved its efficacy by comparing its results with recent literature.
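The filter method above scores each feature with a chi-square statistic before selecting the top percentile. A minimal pure-Python sketch in the spirit of scikit-learn's `chi2` scorer for nonnegative features (illustrative only, not the study's code):

```python
def chi2_scores(X, y):
    """Per-feature chi-square scores for nonnegative features: observed
    class-wise feature sums compared against the sums expected if the
    feature were distributed according to class frequencies alone."""
    classes = sorted(set(y))
    n = len(y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        total = sum(col)
        score = 0.0
        for c in classes:
            observed = sum(v for v, label in zip(col, y) if label == c)
            expected = total * sum(1 for label in y if label == c) / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores
```

A feature whose values concentrate in one class scores high; a feature spread evenly across classes scores near zero and is a candidate for removal.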
Funding: The National Natural Science Foundation of China (Nos. 61871384 and 61921001).
Abstract: The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design against a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, comprising a model selection accuracy index based on simulation data, goodness-of-fit indexes based on the optimally selected model, and an evaluation index based on the selected model's supporting performance for third-party tasks. The three-level evaluation system describes the selection performance of the radar clutter model more comprehensively and accurately in different ways, and can be generalized and applied to the evaluation of other similar characterization model selection problems.
Abstract: In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), which is publicly available and used as the data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among those available in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The high score of HMM(m, r) = 0.47 indicates an overall significant model that encompasses almost all the user's requirements, unlike the classical metrics, which use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model's parameters. The efficiency of the predictive model could be further improved with other feature selection algorithms (especially hybrid metrics) developed with experts of the knowledge domain, as understanding of the business domain has a significant impact.
Funding: The National Natural Science Foundation of China (Nos. 61105048 and 60972165); the Doctoral Fund of the Ministry of Education of China (No. 20110092120034); the Natural Science Foundation of Jiangsu Province (No. BK2010240); the Technology Foundation for Selected Overseas Chinese Scholars, Ministry of Human Resources and Social Security of China (No. 6722000008); and the Open Fund of the Jiangsu Province Key Laboratory for Remote Measuring and Control (No. YCCK201005)
Abstract: An improved Gaussian mixture model (GMM)-based clustering method is proposed for the difficult case where the true distribution of the data contradicts the assumed GMM. First, an improved model selection criterion, the completed likelihood minimum message length criterion, is derived. It can measure both the goodness-of-fit of the candidate GMM to the data and the goodness-of-partition of the data. Secondly, by using the proposed criterion as the clustering objective function, an improved expectation-maximization (EM) algorithm is developed which, compared to the standard EM algorithm, avoids poor local optima when estimating the model parameters. The experimental results demonstrate that the proposed method can rectify the over-fitting tendency of representative GMM-based clustering approaches and can robustly provide more accurate clustering results.
Funding: The National Natural Science Foundation of China (Nos. 30900356 and 81071135)
Abstract: Dynamic causal modeling of functional magnetic resonance imaging (fMRI) signals is employed to explore critical emotional neurocircuitry under sad stimuli. The intrinsic model of emotional loops is built on the basis of Papez's circuit and related prior knowledge, and then three modulatory connection models are established. In these models, stimuli are placed at different points, representing that they affect the neural activities between brain regions, and these activities are modulated in different ways. The optimal model is then selected by Bayesian model comparison. Group analysis shows that patients' intrinsic and modulatory connections from the anterior cingulate cortex (ACC) to the right inferior frontal gyrus (rIFG) are significantly higher than those of the control group. The functional connection parameters of the model are then selected as classifier features. The classification accuracy from the support vector machine (SVM) classifier is 80.73%, which, to some extent, validates the effectiveness of the regional connectivity parameters for depression recognition and provides a new approach for the clinical diagnosis of depression.
Abstract: This paper briefly describes the configuration and performance of the large gas turbines, and the combined cycle power plants composed of them, designed and produced by four large renowned gas turbine manufacturing firms in the world, providing a reference for the relevant sectors and enterprises when importing advanced gas turbines and technologies.
Abstract: The SVM (support vector machine) is a new artificial intelligence methodology derived from Vapnik's statistical learning theory, and it generalizes better than artificial neural networks. A C-support vector classifier based fault diagnostic model (CBFDM), which gives the three most probable fault causes, is constructed in this paper. Five-fold cross-validation is chosen as the model selection method for CBFDM. The simulated data are generated from the PW4000-94 engine influence coefficient matrix at cruise, and the results show that the diagnostic accuracy of CBFDM is over 93% even when the standard deviation of the noise is three times larger than normal. This model can also be used for other diagnostic problems.
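Five-fold cross-validation, the model selection method named above, simply partitions the sample into five folds and rotates the held-out fold. A minimal sketch of the index split (illustrative, not the paper's code):

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds;
    each fold serves once as the test set while the others train the model."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]
```

Averaging the held-out accuracy over the k rotations gives the selection score used to pick hyperparameters such as the SVM's C value.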
Funding: Partly supported by the National Basic Research Program of China (973 Program, Nos. 2011CB707802 and 2013CB910200) and the National Science Foundation of China (No. 11201466)
Abstract: Traditional model selection criteria try to strike a balance between fitting error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, must be made before the traditional criteria can be used. In this article, we give a new model selection criterion with fewer assumptions: based on the premise that the noise term in the model is independent of the explanatory variables, it minimizes the association strength between the regression residuals and the response. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing the effect of a third controlling random variable. In addition, a definition of a general partial relationship is given.
Funding: Research Committee, University of Macao, China, under Grant No. RG077/07-08S/09R/YKV/FST
Abstract: One branch of structural health monitoring (SHM) utilizes dynamic response measurements to assess the structural integrity of civil infrastructure. In particular, modal frequency is a widely adopted indicator for structural damage since its square is proportional to structural stiffness. However, it has been demonstrated in various SHM projects that this indicator is substantially affected by fluctuating environmental conditions. In order to provide reliable and consistent information on the health status of the monitored structures, it is necessary to develop a method to filter out this interference. This study attempts to model and quantify the environmental influence on the modal frequencies of reinforced concrete buildings. Daily structural response measurements of a twenty-two story reinforced concrete building were collected and analyzed over a one-year period. The Bayesian spectral density approach was utilized to identify the modal frequencies of this building, and it was clearly seen that temperature and humidity fluctuations induced notable variations. A mathematical model was developed to quantify the environmental effects, with model complexity taken into consideration. Based on a Timoshenko beam model, the full model class was constructed and other reduced-order model class candidates were obtained. Then, the Bayesian model class selection approach was employed to select the one with the most suitable complexity. The proposed model successfully characterizes the environmental influence on the modal frequencies. Furthermore, the estimated uncertainty of the model parameters allows the reliability of the prediction to be assessed. This study not only improves understanding of the monitored structure, but also establishes a systematic approach for reliable health assessment of reinforced concrete buildings.
Funding: Supported by the National Key Technology Research and Development Program of China (No. 2006BAD09A05)
Abstract: Comparative fishing experiments were carried out in 2010 using tube traps with five escape-hole diameters (8, 15, 18, 20 and 22 mm) to establish the size selectivity of escape holes for white-spotted conger. Selectivity and split parameters of the SELECT model were calculated using the estimated-split and equal-split models. From likelihood ratio tests and AIC (Akaike's Information Criterion) values, the estimated-split model was selected as the best-fit model. The size selectivity of escape holes in the tube traps was expressed as a logistic curve, similar to mesh selectivity. The 50% selection length of white-spotted conger in the estimated-split model was 28.26, 33.35, 39.31 and 47.30 cm for escape-hole diameters of 15, 18, 20 and 22 mm, respectively. The optimum escape-hole size is discussed with respect to management of the white-spotted conger fishery. The results indicate that tube traps with escape holes 18 mm in diameter would benefit this fishery.
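The logistic selectivity curve and its 50% selection length follow directly from two parameters. A minimal sketch — the parameter values used in the test below are hypothetical, not the fitted SELECT-model estimates:

```python
import math

def logistic_selectivity(length, a, b):
    """Probability that a fish of the given length is retained by the trap,
    modeled as a logistic curve in length (intercept a, slope b)."""
    return 1.0 / (1.0 + math.exp(-(a + b * length)))

def l50(a, b):
    """50% selection length: the length solving a + b * L = 0."""
    return -a / b
```

By construction the curve passes through 0.5 at `l50(a, b)`; a larger escape hole shifts the curve toward longer fish, consistent with the reported 28.26-47.30 cm range of 50% selection lengths.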