Widely used deep neural networks currently face limitations in achieving optimal performance for purchase intention prediction due to constraints on data volume and hyperparameter selection. To address this issue, this paper builds on the deep forest algorithm, integrates evolutionary ensemble learning methods, and proposes a novel Deep Adaptive Evolutionary Ensemble (DAEE) model. The model introduces model diversity into the cascade layer, allowing it to adaptively adjust its structure to accommodate complex and evolving purchasing behavior patterns. Moreover, the paper optimizes the methods of obtaining feature vectors, enhancement vectors, and prediction results within the deep forest algorithm to enhance the model’s predictive accuracy. Results demonstrate that the improved deep forest model is more robust and achieves an AUC 5.02% higher than that of the baseline model. Furthermore, it trains 6 times faster than deep models, and its accuracy is 0.9% higher than that of other improved models.
Soybean frogeye leaf spot (FLS) is a global disease affecting soybean yield, especially in the soybean-growing area of Heilongjiang Province. To enable genomic selection breeding for FLS resistance in soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established from 40 002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. The resulting model can be used directly to evaluate FLS resistance in soybean and to select superior offspring. The research method could also provide ideas and methods for disease-resistance breeding in other plants.
Traditional methods for selecting models in experimental data analysis are susceptible to researcher bias, hindering exploration of alternative explanations and potentially leading to overfitting. The Finite Information Quantity (FIQ) approach offers a novel solution by acknowledging the inherent limitations in information processing capacity of physical systems. This framework facilitates the development of objective criteria for model selection (comparative uncertainty) and paves the way for a more comprehensive understanding of phenomena through exploring diverse explanations. This work presents a detailed comparison of the FIQ approach with ten established model selection methods, highlighting the advantages and limitations of each. We demonstrate the potential of FIQ to enhance the objectivity and robustness of scientific inquiry through three practical examples: selecting appropriate models for measuring fundamental constants, sound velocity, and underwater electrical discharges. Further research is warranted to explore the full applicability of FIQ across various scientific disciplines.
Peak ground acceleration (PGA) estimation is an important task in earthquake engineering practice. One of the most well-known models is the Boore-Joyner-Fumal formula, which estimates the PGA using the moment magnitude, the site-to-fault distance and the site foundation properties. In the present study, the complexity of this formula and the homogeneity assumption for the prediction-error variance are investigated, and an efficiency-robustness balanced formula is proposed. For this purpose, a reduced-order Monte Carlo simulation algorithm for Bayesian model class selection is presented to obtain the most suitable predictive formula and prediction-error model for the seismic attenuation relationship. In this approach, each model class (a predictive formula with a prediction-error model) is evaluated according to its plausibility given the data. The one with the highest plausibility is robust since it possesses the optimal balance between data-fitting capability and sensitivity to noise. A database of strong ground motion records in the Tangshan region of China is obtained from the China Earthquake Data Center for the analysis, and the optimal predictive formula is proposed based on this database. It is shown that the proposed formula with heterogeneous prediction-error variance is much simpler than the attenuation model suggested by Boore, Joyner and Fumal (1993).
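The plausibility ranking described in this abstract rests on Bayes' theorem over model classes, P(M_j | D) ∝ p(D | M_j) P(M_j). The following is only a minimal sketch of that final normalization step, assuming the log-evidence of each model class has already been computed by some other means; the function name, the equal-prior default, and the numbers are illustrative assumptions and are not taken from the paper.

```python
import math

def model_class_plausibility(log_evidences, priors=None):
    """Return posterior plausibilities P(M_j | D) from log-evidences ln p(D | M_j)."""
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n  # assumption: equal prior plausibility for every class
    log_post = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_post)  # subtract the maximum before exponentiating for stability
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-evidences for three candidate predictive formulas (made-up values).
log_ev = [-120.4, -118.1, -119.0]
plaus = model_class_plausibility(log_ev)
best = max(range(len(plaus)), key=plaus.__getitem__)  # index of the most plausible class
```

Working in log space matters here because evidences of realistic data sets are far too small to represent directly as floating-point numbers.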
Rodents have been widely used in the production of cerebral ischemia models. However, therapies that proved successful in experimental rodent stroke models have often failed to be effective when tested clinically. Therefore, nonhuman primates were recommended as the ideal alternatives, owing to their similarities with the human cerebrovascular system, brain metabolism, grey to white matter ratio and even their rich behavioral repertoire. The present review is a thorough summary of ten methods that establish nonhuman primate models of focal cerebral ischemia: electrocoagulation, endothelin-1-induced occlusion, microvascular clip occlusion, autologous blood clot embolization, balloon inflation, microcatheter embolization, coil embolization, surgical suture embolization, suture, and photochemical induction methods. This review addresses the advantages and disadvantages of each method, as well as precautions for each model, and compares nonhuman primates with rodents, different species of nonhuman primates, and different modeling methods. Finally, it discusses various factors that need to be considered when modelling and the method of evaluation after modelling. These are critical for understanding the respective strengths and weaknesses of the methods and underlie the selection of the optimum model.
Time-series-based forecasting is essential to determine how past events affect future events. This paper compares the accuracy of different time-series models for oil prices. Three types of univariate models are discussed: the exponential smoothing (ES), Holt-Winters (HW) and autoregressive integrated moving average (ARIMA) models. To determine the best model, six different strategies were applied as selection criteria to quantify these models’ prediction accuracies. This comparison should help policy makers and industry marketing strategists select the best forecasting method in the oil market. The three models were compared by applying them to the time series of regular oil prices for West Texas Intermediate (WTI) crude. The comparison indicated that the HW model performed better than the ES model for a prediction with a confidence interval of 95%. However, the ARIMA (2, 1, 2) model yielded the best results, leading us to conclude that this sophisticated and robust model outperformed other simple yet flexible models in the oil market.
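To make the comparison of forecasting models by an accuracy criterion concrete, here is a minimal sketch of simple exponential smoothing scored by mean absolute error. The abstract does not list its six selection criteria, so the use of MAE, the initialisation rule, the candidate smoothing factors, and the price series are all assumptions for illustration only.

```python
def simple_exponential_smoothing(series, alpha):
    """Return one-step-ahead forecasts; forecasts[t] is the prediction for series[t]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    forecasts = [series[0]]  # assumption: initialise the level with the first observation
    for y in series[:-1]:
        # New level = weighted average of the latest observation and the previous level.
        forecasts.append(alpha * y + (1.0 - alpha) * forecasts[-1])
    return forecasts

def mean_absolute_error(actual, predicted):
    """Average absolute forecast error over the whole series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Made-up price series, not WTI data.
prices = [50.0, 52.0, 51.0, 53.0, 55.0]
candidates = [0.2, 0.5, 0.8, 1.0]
errors = {a: mean_absolute_error(prices, simple_exponential_smoothing(prices, a))
          for a in candidates}
best_alpha = min(errors, key=errors.get)  # smoothing factor with the lowest MAE
```

The same scoring loop could be applied to Holt-Winters or ARIMA forecasts produced by a statistics library; only the forecast-generating function would change.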
Evaluation of numerical earthquake forecasting models needs to consider two issues of equal importance: the application scenario of the simulation, and the complexity of the model. The criterion for evaluation-based model selection raises some interesting problems that need discussion.
This study focuses on meeting the challenges of big data visualization by using data reduction methods based on feature selection, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We apply the embedded “Select from model (SFM)” method using the “Random forest importance (RFI)” algorithm and compare it with the filter “Select percentile (SP)” method based on the chi-square (“Chi2”) tool for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of KNN in Python on eight data sets to see which method produces the best rating when feature selection methods are applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design in a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, comprising a model selection accuracy index based on simulation data, goodness-of-fit indices based on the optimally selected model, and an evaluation index based on its supporting performance for third-party tasks. The three-level evaluation system can describe the selection performance of the radar clutter model more comprehensively and accurately in different ways, and can be extended to the evaluation of other similar characterization model selection problems.
In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques and to make decisions based on the extracted knowledge is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), the publicly available data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among the available models in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 indicates the overall significant model that encompasses almost all the user’s requirements, unlike the classical metrics that use a criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model parameters. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take the input of domain experts into account, as understanding of the business domain has a significant impact.
Support vector machines (SVMs) are a new artificial intelligence methodology derived from Vapnik's statistical learning theory, with better generalization than artificial neural networks. A C-support vector classifiers Based Fault Diagnostic Model (CBFDM), which gives the 3 most probable fault causes, is constructed in this paper. Five-fold cross-validation is chosen as the method of model selection for CBFDM. The simulated data are generated from the PW4000-94 engine influence coefficient matrix at cruise, and the results show that the diagnostic accuracy of CBFDM is over 93% even when the standard deviation of noise is 3 times larger than normal. This model can also be used for other diagnostic problems.
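Five-fold cross-validation, the model selection method named in this abstract, can be sketched without any particular classifier. The helper names below are hypothetical, the data are assumed to be shuffled already, and `score_fn` stands in for "train a candidate model on the training indices and return its accuracy on the validation indices".

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds get one extra sample so every sample is used once.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def cross_validate(n_samples, score_fn, k=5):
    """Average score_fn(train_idx, val_idx) over the k folds."""
    scores = [score_fn(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

folds = list(k_fold_indices(12, k=5))
```

For model selection, `cross_validate` would be called once per candidate hyperparameter setting, and the setting with the best average score would be kept.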
One branch of structural health monitoring (SHM) utilizes dynamic response measurements to assess the structural integrity of civil infrastructures. In particular, modal frequency is a widely adopted indicator for structural damage since its square is proportional to structural stiffness. However, it has been demonstrated in various SHM projects that this indicator is substantially affected by fluctuating environmental conditions. In order to provide reliable and consistent information on the health status of the monitored structures, it is necessary to develop a method to filter this interference. This study attempts to model and quantify the environmental influence on the modal frequencies of reinforced concrete buildings. Daily structural response measurements of a twenty-two story reinforced concrete building were collected and analyzed over a one-year period. The Bayesian spectral density approach was utilized to identify the modal frequencies of this building, and it was clearly seen that the temperature and humidity fluctuations induced notable variations. A mathematical model was developed to quantify the environmental effects, and model complexity was taken into consideration. Based on a Timoshenko beam model, the full model class was constructed and other reduced-order model class candidates were obtained. Then, the Bayesian model class selection approach was employed to select the one with the most suitable complexity. The proposed model successfully characterizes the environmental influence on the modal frequencies. Furthermore, the estimated uncertainty of the model parameters allows for assessment of the reliability of the prediction. This study not only improves the understanding of the monitored structure, but also establishes a systematic approach for reliable health assessment of reinforced concrete buildings.
This study examines the impact of farmers’ participation in cooperatives and technology adoption on their economic welfare in China. A double selectivity model (DSM) is applied to correct for sample selection bias stemming from both observed and unobserved factors, and a propensity score matching (PSM) method is applied to calculate the agricultural income difference with counterfactual analysis, using survey data from 396 farmers in 15 provinces in China. The findings indicate that farmers who join farmer cooperatives and adopt agricultural technology can increase agricultural income by 2.77% and 2.35%, respectively, compared with non-participants and non-adopters. Interestingly, the effect on agricultural income is found to be more significant for low-income farmers than for high-income ones, with income increasing by 5.45% and 4.51% when participating in farmer cooperatives and adopting agricultural technology, respectively. Our findings highlight the positive role of farmer cooperatives and agricultural technology in promoting farmers’ economic welfare. Based on the findings, government policy implications are also discussed.
Traditional model selection criteria try to strike a balance between fitting error and model complexity. Before they can be used, assumptions must be made about the distribution of the response or the noise, and these assumptions may be misspecified. In this article, we give a new model selection criterion, based on the assumption that the noise term in the model is independent of the explanatory variables, which minimizes the association strength between regression residuals and the response and requires fewer assumptions. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third, controlling random variable. In addition, the definition of a general partial relationship is given.
In vehicular ad hoc networks (VANETs), the proliferation of wireless communication will give rise to a heterogeneous access environment where network selection becomes significant. Motivated by the self-adaptive paradigm of cellular attractors, this paper regards an individual communication as a cell, so that the revised attractor selection model can be applied to guide each connected vehicle. Aiming at improving the Quality of Service (QoS), we present a bio-inspired handover decision-making mechanism. In addition, we employ the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) for any vehicle to choose an access network. This paper proposes a novel framework where the bio-inspired mechanism is combined with TOPSIS. In a dynamic and random mobility environment, our method coordinates the performance of heterogeneous networks by guaranteeing the efficient utilization and fair distribution of network resources in a global sense. The experimental results confirm that the proposed method performs better than conventional schemes.
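TOPSIS, which this abstract uses for access network choice, ranks alternatives by their relative closeness to an ideal point. The sketch below shows the standard textbook procedure with vector normalisation; the criteria, weights, and network values are hypothetical and are not taken from the paper, which combines TOPSIS with a bio-inspired mechanism that is not reproduced here.

```python
import math

def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores in [0, 1]; higher means closer to the ideal.

    matrix  : rows are alternatives, columns are criteria (positive values assumed)
    weights : one weight per criterion
    benefit : True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal and anti-ideal points, criterion by criterion.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((x - i) ** 2 for x, i in zip(row, ideal)))
        d_neg = math.sqrt(sum((x - a) ** 2 for x, a in zip(row, anti)))
        scores.append(d_neg / (d_pos + d_neg))  # undefined if all alternatives are identical
    return scores

# Hypothetical candidates: [bandwidth (Mbit/s), delay (ms), monetary cost].
networks = {"WLAN": [54.0, 40.0, 1.0], "LTE": [20.0, 30.0, 3.0], "DSRC": [6.0, 10.0, 2.0]}
names = list(networks)
scores = topsis([networks[n] for n in names],
                weights=[0.5, 0.3, 0.2],
                benefit=[True, False, False])
best_network = names[max(range(len(names)), key=scores.__getitem__)]
```

The weights encode how much each QoS criterion matters to the vehicle; changing them can change the ranking, which is why they are usually set per application.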
Covariance functions have been proposed as an alternative for modelling longitudinal data in animal breeding because of their various merits in comparison to the classical analytical methods. In practical estimation, the models and polynomial orders fitted can influence the estimates of covariance functions and thus of genetic parameters. The objective of this study was to select a model for estimation of covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects involved were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. Selection of the model and orders of fit was carried out by likelihood ratio test and 4 types of information criteria. The results showed that a model with 6 orders of polynomial fit for direct additive genetic and animal permanent environmental effects, and 4 and 5 orders for maternal genetic and permanent environmental effects, respectively, was preferable for estimation of covariance functions. Models with and without maternal effects influenced the estimates of covariance functions greatly. The maternal permanent environmental effect does not explain the variation of all permanent environments well, suggesting that different sources of permanent environmental effects also have a large influence on covariance function estimates.
Data available in software engineering for many applications contain variability, and it is not possible to say which variables help in the prediction process. Most of the existing work in software defect prediction focuses on the selection of the best prediction techniques. For this purpose, deep learning and ensemble models have shown promising results. In contrast, very little research deals with cleaning the training data and selecting the best parameter values from the data. Sometimes the data available for training the models have high variability, and this variability may cause a decrease in model accuracy. To deal with this problem, we used the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to select the best variables to train the model. A simple ANN model with one input, one output and two hidden layers was used for the training instead of a very deep and complex model. At first, the variables were narrowed down to a smaller number using correlation values. Then subsets for all the possible variable combinations were formed. In the end, an artificial neural network (ANN) model was trained for each subset, AIC and BIC values were calculated, and the best model was selected on the basis of the smallest AIC and BIC values. It was found that the combination of only two variables, ns and entropy, is best for software defect prediction, as it gives the minimum AIC and BIC values, while nm and npt is the worst combination and gives the maximum AIC and BIC values.
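The subset selection by smallest AIC and BIC described here can be illustrated with the common least-squares forms AIC = n ln(RSS/n) + 2k and BIC = n ln(RSS/n) + k ln(n), which hold up to an additive constant under Gaussian errors. The abstract does not state which form the authors used, so this choice is an assumption, and the variable names, residual sums of squares, and parameter counts below are placeholders rather than results from the paper.

```python
import math

def aic(n, rss, k):
    """Akaike information criterion for a least-squares fit (up to a constant)."""
    return n * math.log(rss / n) + 2 * k

def bic(n, rss, k):
    """Bayesian information criterion for a least-squares fit (up to a constant)."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical fits: variable subset -> (residual sum of squares, number of parameters).
n_obs = 100
fits = {
    ("x1",): (80.0, 4),
    ("x1", "x2"): (40.0, 6),
    ("x1", "x2", "x3"): (39.5, 8),
}
aic_scores = {subset: aic(n_obs, rss, k) for subset, (rss, k) in fits.items()}
bic_scores = {subset: bic(n_obs, rss, k) for subset, (rss, k) in fits.items()}
best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
```

In this made-up example the third variable lowers the residual error only slightly, so both criteria prefer the two-variable subset; BIC penalizes the extra parameters more heavily than AIC whenever ln(n) exceeds 2.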
Surface wave methods have received much attention due to their efficient, flexible and convenient characteristics. However, there are still critical issues regarding a key step in surface wave inversion. In most existing methods, the number of layers is assumed to be known prior to the process of inversion. However, improper assignment of this parameter leads to erroneous inversion results. A Bayesian nonparametric method for Rayleigh wave inversion is proposed herein to address this problem. In this method, each model class represents a particular number of layers with unknown S-wave velocity and thickness of each layer. As a result, determination of the number of layers is equivalent to selection of the most applicable model class. Regarding each model class, the optimization search of S-wave velocity and thickness of each layer is implemented by using a genetic algorithm. Then, each model class is assessed in view of its efficiency under the Bayesian framework and the most efficient class is selected. Simulated and actual examples verify that the proposed Bayesian nonparametric approach is reliable and efficient for Rayleigh wave inversion, especially for its capability to determine the number of layers.
Affinity propagation (AP) is a classic clustering algorithm. To improve the classical AP algorithms, we propose a clustering algorithm named adaptive spectral affinity propagation (AdaSAP). In particular, we discuss why AP is not suitable for non-spherical clusters and present a unifying view of nine well-known arbitrary-shaped clustering algorithms. We propose a strategy for extending AP to non-spherical clustering by constructing category similarity of objects. Leveraging the monotonicity that the number of clusters increases with the self-similarity in AP, we propose a model selection procedure that can determine the number of clusters adaptively. For the parameters introduced by extending AP to non-spherical clustering, we provide a grid-evolving strategy to optimize them automatically. The effectiveness of AdaSAP is evaluated by experiments on both synthetic datasets and real-world clustering tasks. Experimental results validate the superiority of AdaSAP over benchmark algorithms such as the classical AP and spectral clustering algorithms.
This paper investigates, by means of a numerical technique, a genotype selection model subjected to both a multiplicative coloured noise and an additive coloured noise with different correlation times τ1 and τ2. By directly simulating the Langevin equation, the following results are obtained. (1) The multiplicative coloured noise dominates; however, the effect of the additive coloured noise cannot be neglected in the practical gene selection process. The selection rate μ determines whether the selection favours the gene A haploid or the gene B haploid. (2) The additive coloured noise intensity α and the correlation time τ2 play opposite roles. It is noted that α and τ2 cannot separate the single peak, while α can make the peak disappear and τ2 can make the peak sharp. (3) The multiplicative coloured noise intensity D and the correlation time τ1 can induce a phase transition; at the same time they play opposite roles, and the reentrance phenomenon appears. In this case, it is easy to select one type of haploid from the group by increasing D and decreasing τ1.
Funding (DAEE purchase intention prediction study): supported by the Ningxia Key R&D Program (Key) Project (2023BDE02001); the Ningxia Key R&D Program (Talent Introduction Special) Project (2022YCZX0013); the North Minzu University 2022 School-Level Research Platform “Digital Agriculture Empowering Ningxia Rural Revitalization Innovation Team” (Project Number: 2022PT_S10); the Yinchuan City School-Enterprise Joint Innovation Project (2022XQZD009); and the “Innovation Team for Imaging and Intelligent Information Processing” of the National Ethnic Affairs Commission.
Funding (soybean FLS genomic selection study): supported by the National Key Research and Development Program of China (2021YFD1201103-01-05).
Funding (PGA attenuation formula study): Research Committee of University of Macao under Research Grant No. MYRG081(Y1-L2)-FST13-YKV; the Science and Technology Development Fund of the Macao SAR government under Grant No. 012/2013/A1.
Funding: Supported by the National Natural Science Foundation of China, No. 81000852 and 81301677; the AHA Award, No. 17POST32530004; the Supporting Project of Science & Technology of Sichuan Province of China, No. 2012SZ0140; the Research Foundation of Zhejiang Province of China, No. 201022896
Abstract: Rodents have been widely used in the production of cerebral ischemia models. However, therapies that proved successful in experimental rodent stroke models have often failed to be effective when tested clinically. Therefore, nonhuman primates were recommended as the ideal alternatives, owing to their similarities with the human cerebrovascular system, brain metabolism, grey to white matter ratio and even their rich behavioral repertoire. The present review is a thorough summary of ten methods that establish nonhuman primate models of focal cerebral ischemia: electrocoagulation, endothelin-1-induced occlusion, microvascular clip occlusion, autologous blood clot embolization, balloon inflation, microcatheter embolization, coil embolization, surgical suture embolization, suture, and photochemical induction methods. This review addresses the advantages and disadvantages of each method, as well as precautions for each model, and compares nonhuman primates with rodents, different species of nonhuman primates, and different modeling methods. Finally, it discusses various factors that need to be considered when modelling and the method of evaluation after modelling. These are critical for understanding their respective strengths and weaknesses and underlie the selection of the optimum model.
Abstract: Time-series-based forecasting is essential to determine how past events affect future events. This paper compares the prediction accuracy of different time-series models for oil prices. Three types of univariate models are discussed: the exponential smoothing (ES), Holt-Winters (HW) and autoregressive integrated moving average (ARIMA) models. To determine the best model, six different strategies were applied as selection criteria to quantify these models' prediction accuracies. This comparison should help policy makers and industry marketing strategists select the best forecasting method in the oil market. The three models were compared by applying them to the time series of regular oil prices for West Texas Intermediate (WTI) crude. The comparison indicated that the HW model performed better than the ES model for a prediction with a confidence interval of 95%. However, the ARIMA (2, 1, 2) model yielded the best results, leading us to conclude that this sophisticated and robust model outperformed other simple yet flexible models in the oil market.
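As an illustration of how such a comparison can be set up, the following minimal sketch scores one-step-ahead forecasts from simple exponential smoothing under two smoothing factors. The abstract does not name its six selection criteria, so the three error measures here (MAE, RMSE, MAPE) are assumptions, and the price series is invented for illustration rather than taken from WTI data.

```python
import math


def ses_forecasts(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts for series[1:]."""
    level = series[0]  # initialise the level with the first observation
    forecasts = []
    for y in series[1:]:
        forecasts.append(level)                  # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level  # update the level with y
    return forecasts


def errors(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent) for paired observations."""
    diffs = [a - p for a, p in zip(actual, predicted)]
    n = len(diffs)
    return {
        "MAE": sum(abs(d) for d in diffs) / n,
        "RMSE": math.sqrt(sum(d * d for d in diffs) / n),
        "MAPE": 100 * sum(abs(d / a) for d, a in zip(diffs, actual)) / n,
    }


# Hypothetical price series (not real WTI data).
prices = [52.0, 53.1, 51.8, 54.2, 55.0, 54.1, 56.3, 57.0]

# Compare two smoothing factors on the same series.
for alpha in (0.2, 0.8):
    print(alpha, errors(prices[1:], ses_forecasts(prices, alpha)))
```

The same `errors` function could be applied to forecasts from Holt-Winters or ARIMA implementations, so that all candidate models are ranked on identical criteria.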
Funding: Supported by the National Natural Science Foundation of China (NSFC, Grant No. U2039207)
Abstract: Evaluation of numerical earthquake forecasting models needs to consider two issues of equal importance: the application scenario of the simulation, and the complexity of the model. The criteria for evaluation-based model selection raise some interesting problems that need discussion.
Abstract: This study focuses on meeting the challenges of big data visualization by using data reduction methods based on feature selection. The aim is to reduce the volume of big data and minimize model training time (Tt) while maintaining data quality. We apply an embedded method, "Select from model" (SFM), driven by the random forest importance (RFI) algorithm, and compare it with a filter method, "Select percentile" (SP), based on the chi-square (Chi2) test, for selecting the most important features. The selected features are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of KNN in Python on eight data sets to see which method produces the best rating when feature selection methods are applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
Funding: The National Natural Science Foundation of China (61871384, 61921001).
Abstract: The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design in a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, including a model selection accuracy index based on simulation data, goodness-of-fit indices based on the optimally selected model, and an evaluation index based on the selected model's supporting performance for third-party tasks. The three-level evaluation system can describe the selection performance of the radar clutter model more comprehensively and accurately in different ways, and can be extended to the evaluation of other similar characterization model selection problems.
Abstract: In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The publicly available 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), used as the data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among the available models in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 illustrates the overall significant model that encompasses almost all the user's requirements, unlike the classical metrics that use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and the interpretability of the parameters of the model would be hampered. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take into account the experts of the knowledge domain, as the understanding of the business domain has a significant impact.
Abstract: Support vector machines (SVMs) are a new artificial intelligence methodology derived from Vapnik's statistical learning theory, with better generalization than artificial neural networks. A C-support vector classifier based fault diagnostic model (CBFDM), which gives the 3 most probable fault causes, is constructed in this paper. Five-fold cross validation is chosen as the method of model selection for CBFDM. The simulated data are generated from the PW4000-94 engine influence coefficient matrix at cruise, and the results show that the diagnostic accuracy of CBFDM is over 93% even when the standard deviation of noise is 3 times larger than normal. This model can also be used for other diagnostic problems.
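The five-fold cross validation step can be sketched as follows. This is only an illustration under stated assumptions: a trivial nearest-class-mean classifier stands in for the C-support vector classifier of the paper, the data are invented, and all function names are hypothetical.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def cv_accuracy(fit, predict, X, y, k=5):
    """Mean held-out accuracy of a classifier over k folds."""
    accs = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accs.append(hits / len(test))
    return sum(accs) / k


# Stand-in classifier (NOT an SVM): assign the class with the nearest mean.
def nearest_mean_fit(X, y):
    return {c: sum(x for x, t in zip(X, y) if t == c) / y.count(c) for c in set(y)}


def nearest_mean_predict(model, x):
    return min(model, key=lambda c: abs(x - model[c]))


# Hypothetical one-dimensional data with two well-separated classes.
X = [float(v) for v in range(10)] + [float(v) for v in range(20, 30)]
y = [0] * 10 + [1] * 10
score = cv_accuracy(nearest_mean_fit, nearest_mean_predict, X, y, k=5)
print(score)
```

For model selection, `cv_accuracy` would be evaluated once per candidate setting (for an SVM, for example, per value of the penalty parameter C) and the setting with the highest mean accuracy retained.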
Funding: Research Committee, University of Macao, China, under Grant No. RG077/07-08S/09R/YKV/FST
Abstract: One branch of structural health monitoring (SHM) utilizes dynamic response measurements to assess the structural integrity of civil infrastructures. In particular, modal frequency is a widely adopted indicator for structural damage since its square is proportional to structural stiffness. However, it has been demonstrated in various SHM projects that this indicator is substantially affected by fluctuating environmental conditions. In order to provide reliable and consistent information on the health status of the monitored structures, it is necessary to develop a method to filter this interference. This study attempts to model and quantify the environmental influence on the modal frequencies of reinforced concrete buildings. Daily structural response measurements of a twenty-two story reinforced concrete building were collected and analyzed over a one-year period. The Bayesian spectral density approach was utilized to identify the modal frequencies of this building, and it was clearly seen that the temperature and humidity fluctuation induced notable variations. A mathematical model was developed to quantify the environmental effects, and model complexity was taken into consideration. Based on a Timoshenko beam model, the full model class was constructed and other reduced-order model class candidates were obtained. Then, the Bayesian model class selection approach was employed to select the one with the most suitable complexity. The proposed model successfully characterizes the environmental influence on the modal frequencies. Furthermore, the estimated uncertainty of the model parameters allows for assessment of the reliability of the prediction. This study not only improves the understanding of the monitored structure, but also establishes a systematic approach for reliable health assessment of reinforced concrete buildings.
Funding: The Special Project of Major Theoretical Research and Interpretation of Philosophy and Social Sciences of Chongqing Municipal Education Commission, China (19SKZDZX15); the Key Project of Humanities and Social Sciences Research of Chongqing Education Commission, China (18SKSJ003); the Funding for Cultivating Major Projects in Humanities and Social Sciences of Southwest University, China (SWU1809009).
Abstract: This study examines the impact of farmers' participation in cooperatives and technology adoption on their economic welfare in China. A double selectivity model (DSM) is applied to correct for sample selection bias stemming from both observed and unobserved factors, and a propensity score matching (PSM) method is applied to calculate the agricultural income difference with counterfactual analysis using survey data from 396 farmers in 15 provinces in China. The findings indicate that farmers who join farmer cooperatives and adopt agricultural technology can increase agricultural income by 2.77% and 2.35%, respectively, compared with non-participants and non-adopters. Interestingly, the effect on agricultural income is found to be more significant for low-income farmers than for high-income ones, with income increasing by 5.45% and 4.51% when participating in farmer cooperatives and adopting agricultural technology, respectively. Our findings highlight the positive role of farmer cooperatives and agricultural technology in promoting farmers' economic welfare. Based on the findings, government policy implications are also discussed.
Funding: Partly supported by the National Basic Research Program of China (973 Program, 2011CB707802, 2013CB910200); the National Science Foundation of China (11201466)
Abstract: Traditional model selection criteria try to strike a balance between fitting error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, must be made before using the traditional criteria. In this article, we give a new model selection criterion that requires fewer assumptions: based on the assumption that the noise term in the model is independent of the explanatory variables, it minimizes the association strength between the regression residuals and the response. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third controlling random variable. In addition, the definition of a general partial relationship is given.
Funding: This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 61672082 and 61822101; Beijing Municipal Natural Science Foundation No. 4181002; Beihang University Innovation & Practice Fund for Graduate (YCSJ-02-2018-05).
Abstract: In vehicular ad hoc networks (VANETs), the proliferation of wireless communication will give rise to a heterogeneous access environment where network selection becomes significant. Motivated by the self-adaptive paradigm of cellular attractors, this paper regards an individual communication as a cell, so that the revised attractor selection model can be applied to induce each connected vehicle. Aiming at improving the Quality of Service (QoS), we present a bio-inspired handover decision-making mechanism. In addition, we employ the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) for any vehicle to choose an access network. This paper proposes a novel framework where the bio-inspired mechanism is combined with TOPSIS. In a dynamic and random mobility environment, our method achieves the coordination of performance of heterogeneous networks by guaranteeing the efficient utilization and fair distribution of network resources in a global sense. The experimental results confirm that the proposed method performs better than conventional schemes.
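The TOPSIS ranking step mentioned above can be sketched in a few lines. This shows only the standard TOPSIS procedure, not the paper's combined bio-inspired framework; the candidate networks, criteria, weights and values are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative; higher is better.

    matrix[i][j] is the value of criterion j for alternative i,
    benefit[j] is True if larger values of criterion j are preferred.
    Assumes no criterion column is all zeros and that the alternatives
    are not all identical (otherwise a division by zero occurs).
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical candidates: [bandwidth (Mbit/s), delay (ms), monetary cost].
networks = {
    "WLAN": [54.0, 30.0, 1.0],
    "LTE": [20.0, 50.0, 3.0],
    "DSRC": [27.0, 10.0, 2.0],
}
weights = [0.5, 0.3, 0.2]        # assumed QoS preference weights
benefit = [True, False, False]   # bandwidth: higher is better; others: lower
scores = topsis(list(networks.values()), weights, benefit)
ranking = sorted(zip(networks, scores), key=lambda p: p[1], reverse=True)
print(ranking)
```

A vehicle would select the first entry of `ranking`; changing `weights` shifts the preference between throughput, latency and cost.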
Funding: Funded by the Young Academic Leaders Supporting Project in Institutions of Higher Education of Shanxi Province, China
Abstract: Covariance functions have been proposed as an alternative for modeling longitudinal data in animal breeding because of their various merits in comparison with the classical analytical methods. In practical estimation, the different models and polynomial orders fitted can influence the estimates of covariance functions and thus of genetic parameters. The objective of this study was to select a model for estimation of covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects involved were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. Selection of the model and orders of fit was carried out by the likelihood ratio test and 4 types of information criteria. The results showed that a model with 6 orders of polynomial fit for direct additive genetic and animal permanent environmental effects, and 4 and 5 orders for maternal genetic and permanent environmental effects, respectively, was preferable for estimation of covariance functions. Models with and without maternal effects influenced the estimates of covariance functions greatly. The maternal permanent environmental effect does not explain the variation of all permanent environments well, suggesting that different sources of permanent environmental effects also have a large influence on covariance function estimates.
Abstract: Data available in software engineering for many applications contain variability, and it is not possible to say which variable helps in the process of prediction. Most of the work on software defect prediction is focused on the selection of the best prediction techniques. For this purpose, deep learning and ensemble models have shown promising results. In contrast, very few studies deal with cleaning the training data and selecting the best parameter values from the data. Sometimes the data available for training the models have high variability, and this variability may cause a decrease in model accuracy. To deal with this problem, we used the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) for selection of the best variables to train the model. A simple ANN model with one input, one output and two hidden layers was used for the training instead of a very deep and complex model. AIC and BIC values are calculated, and the combination giving the minimum AIC and BIC values is selected as the best model. At first, variables were narrowed down to a smaller number using correlation values. Then subsets for all the possible variable combinations were formed. In the end, an artificial neural network (ANN) model was trained for each subset and the best model was selected on the basis of the smallest AIC and BIC values. It was found that the combination of only two variables, ns and entropy, is best for software defect prediction as it gives the minimum AIC and BIC values. In contrast, the combination of nm and npt is the worst and gives the maximum AIC and BIC values.
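The selection rule described above (fit one model per variable subset, keep the subset with the smallest AIC and BIC) can be sketched as follows. The abstract does not state which form of AIC and BIC was used, so the Gaussian least-squares forms below are an assumption, and the subset names, residual sums of squares and parameter counts are invented for illustration.

```python
import math


def aic_bic(rss, n, k):
    """AIC and BIC for a least-squares fit with Gaussian errors.

    rss: residual sum of squares, n: number of samples,
    k: number of fitted parameters. Additive constants are dropped,
    which does not affect the ranking of models on the same data.
    """
    base = n * math.log(rss / n)
    return base + 2 * k, base + k * math.log(n)


# Hypothetical candidates: subset name -> (RSS, number of parameters).
candidates = {
    "subset_a": (12.0, 3),
    "subset_b": (11.5, 6),
    "subset_c": (20.0, 2),
}
n = 100  # assumed number of training samples

scores = {name: aic_bic(rss, n, k) for name, (rss, k) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(best_by_aic, best_by_bic)
```

In this toy setting `subset_b` fits slightly better than `subset_a` but pays a larger complexity penalty, which is the trade-off both criteria are designed to capture; BIC penalises extra parameters more heavily than AIC whenever n exceeds about 7.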
Funding: Science and Technology Development Fund of the Macao SAR under research grant SKL-IOTSC-2018-2020; the Research Committee of University of Macao under Research Grant MYRG2016-00029-FST.
Abstract: Surface wave methods have received much attention due to their efficient, flexible and convenient characteristics. However, there are still critical issues regarding a key step in surface wave inversion. In most existing methods, the number of layers is assumed to be known prior to the inversion process, yet improper assignment of this parameter leads to erroneous inversion results. A Bayesian nonparametric method for Rayleigh wave inversion is proposed herein to address this problem. In this method, each model class represents a particular number of layers with unknown S-wave velocity and thickness of each layer. As a result, determination of the number of layers is equivalent to selection of the most applicable model class. For each model class, the optimization search for the S-wave velocity and thickness of each layer is implemented using a genetic algorithm. Then, each model class is assessed in view of its efficiency under the Bayesian framework and the most efficient class is selected. Simulated and actual examples verify that the proposed Bayesian nonparametric approach is reliable and efficient for Rayleigh wave inversion, especially in its capability to determine the number of layers.
Funding: This work was supported by the National Natural Science Foundation of China (71771034, 71901011, 71971039); the Scientific and Technological Innovation Foundation of Dalian (2018J11CY009).
Abstract: Affinity propagation (AP) is a classic clustering algorithm. To improve the classical AP algorithm, we propose a clustering algorithm named adaptive spectral affinity propagation (AdaSAP). In particular, we discuss why AP is not suitable for non-spherical clusters and present a unifying view of nine famous arbitrary-shaped clustering algorithms. We propose a strategy for extending AP to non-spherical clustering by constructing category similarity of objects. Leveraging the monotonicity by which the number of clusters increases with the self-similarity in AP, we propose a model selection procedure that can determine the number of clusters adaptively. For the parameters introduced by extending AP to non-spherical clustering, we provide a grid-evolving strategy to optimize them automatically. The effectiveness of AdaSAP is evaluated by experiments on both synthetic datasets and real-world clustering tasks. Experimental results validate the superiority of AdaSAP over benchmark algorithms such as the classical AP and spectral clustering algorithms.
Funding: Project supported by the Natural Science Foundation of Yunnan Province of China (Grant No. 2006A0002M); the Science Foundation of Baoji University of Science and Arts of China (Grant No. Zk0697)
Abstract: This paper investigates a genotype selection model subjected to both a multiplicative coloured noise and an additive coloured noise with different correlation times τ1 and τ2 by means of a numerical technique. By directly simulating the Langevin equation, the following results are obtained. (1) The multiplicative coloured noise dominates; however, the effect of the additive coloured noise cannot be neglected in the practical gene selection process. The selection rate μ determines whether the selection favours the gene A haploid or the gene B haploid. (2) The additive coloured noise intensity α and the correlation time τ2 play opposite roles. It is noted that α and τ2 cannot separate the single peak, while α can make the peak disappear and τ2 can make the peak sharp. (3) The multiplicative coloured noise intensity D and the correlation time τ1 can induce a phase transition; at the same time they play opposite roles, and the reentrance phenomenon appears. In this case, it is easy to select one type of haploid from the group with increasing D and decreasing τ1.