Traditional methods for selecting models in experimental data analysis are susceptible to researcher bias, hindering exploration of alternative explanations and potentially leading to overfitting. The Finite Information Quantity (FIQ) approach offers a novel solution by acknowledging the inherent limitations in information processing capacity of physical systems. This framework facilitates the development of objective criteria for model selection (comparative uncertainty) and paves the way for a more comprehensive understanding of phenomena through exploring diverse explanations. This work presents a detailed comparison of the FIQ approach with ten established model selection methods, highlighting the advantages and limitations of each. We demonstrate the potential of FIQ to enhance the objectivity and robustness of scientific inquiry through three practical examples: selecting appropriate models for measuring fundamental constants, sound velocity, and underwater electrical discharges. Further research is warranted to explore the full applicability of FIQ across various scientific disciplines.
In recent years, deep learning-based signal recognition technology has gained attention and emerged as an important approach for safeguarding the electromagnetic environment. However, training deep learning-based classifiers on large signal datasets with redundant samples requires significant memory and high costs. This paper proposes a support data-based core-set selection method (SD) for signal recognition, aiming to screen a representative subset that approximates the large signal dataset. Specifically, this subset can be identified by employing the labeled information during the early stages of model training, as some training samples are frequently labeled as support data. This support data is crucial for model training and can be found using a border sample selector. Simulation results demonstrate that the SD method minimizes the impact on model recognition performance while reducing the dataset size, and outperforms five other state-of-the-art core-set selection methods when the fraction of training samples kept is less than or equal to 0.3 on the RML2016.04C dataset or 0.5 on the RML22 dataset. The SD method is particularly helpful for signal recognition tasks with limited memory and computing resources.
Medical Internet of Things (IoT) devices are becoming more and more common in healthcare. This has created a huge need for advanced predictive health modeling strategies that can make good use of the growing amount of multimodal data to find potential health risks early and help individuals in a personalized way. Existing methods, while useful, have limitations in predictive accuracy, delay, personalization, and user interpretability, requiring a more comprehensive and efficient approach to harness modern medical IoT devices. MAIPFE is a multimodal approach integrating pre-emptive analysis, personalized feature selection, and explainable AI for real-time health monitoring and disease detection. By using AI for early disease detection, personalized health recommendations, and transparency, healthcare will be transformed. The Multimodal Approach Integrating Pre-emptive Analysis, Personalized Feature Selection, and Explainable AI (MAIPFE) framework, which combines Firefly Optimizer, Recurrent Neural Network (RNN), Fuzzy C Means (FCM), and Explainable AI, improves disease detection precision over existing methods. Comprehensive metrics show the model's superiority in real-time health analysis. The proposed framework outperformed existing models by 8.3% in disease detection classification precision, 8.5% in accuracy, 5.5% in recall, 2.9% in specificity, 4.5% in AUC (Area Under the Curve), and 4.9% in delay reduction. Disease prediction precision increased by 4.5%, accuracy by 3.9%, recall by 2.5%, specificity by 3.5%, AUC by 1.9%, and delay levels decreased by 9.4%. MAIPFE can revolutionize healthcare with preemptive analysis, personalized health insights, and actionable recommendations. The research shows that this innovative approach improves patient outcomes and healthcare efficiency in the real world.
This research delves into the hurdles and strategies aimed at augmenting the market involvement of smallholder carrot farmers in Nakuru County, Kenya. Employing a Multinomial Logit (MNL) model, it scrutinizes the factors influencing the selection of marketing outlets among carrot farmers. The findings unveil that a significant majority (81%) of surveyed farmers actively participate in diverse market outlets, encompassing the farm gate, cleaning point, local market, external market, and export market. Notably, pivotal buyers include aggregators, brokers, wholesalers, retailers, and consumers, with transactions predominantly occurring at the farm level. Additionally, the analysis discerns substantial influences of socio-economic characteristics, experiential factors, and geographical proximity on farmers' choices of market outlets. Specifically, gender, age, land size, farming experience, and distance to markets emerge as critical determinants. Moreover, the study examines market margins along the carrot value chain, shedding light on the potential profitability of carrot farming in the region. Remarkably, higher average gross margins are identified in export and external markets, signaling lucrative prospects for farmers targeting these segments. However, disparities in profit distribution between farmers and traders underscore the necessity for interventions to ensure equitable value distribution throughout the value chain. These findings underscore the imperative for tailored interventions to tackle challenges and foster inclusive agricultural development. Strategies such as farmer organizations, contracting, and vertical integration are advocated to enhance market access and profitability for smallholder carrot farmers. Thus, this study enriches our comprehension of the dynamics within carrot value chains and provides valuable insights for policymakers and development practitioners aiming to uplift rural livelihoods and bolster food security.
Soybean frogeye leaf spot (FLS) is a global disease affecting soybean yield, especially in the soybean growing area of Heilongjiang Province. To enable genomic selection breeding for FLS resistance in soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established for 40 002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. The model established in this study could be used directly to evaluate soybean resistance to FLS and to select excellent offspring. This research method could also provide ideas and methods for disease-resistance breeding in other plants.
The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design in a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, including a model selection accuracy index based on simulation data, goodness-of-fit indices based on the optimally selected model, and an evaluation index based on the model's supporting performance for third-party applications. The three-level evaluation system can describe the selection performance of the radar clutter model more comprehensively and accurately from different perspectives, and can be extended to the evaluation of other similar characterization model selection problems.
We investigate the Turing instability and pattern formation mechanism of a plant-wrack model with both self-diffusion and cross-diffusion terms. We first study the effect of self-diffusion on the stability of equilibrium. We then derive the conditions for the occurrence of the Turing patterns induced by cross-diffusion based on self-diffusion stability. Next, we analyze the pattern selection by using the amplitude equation and obtain the exact parameter ranges of different types of patterns, including stripe patterns, hexagonal patterns and mixed states. Finally, numerical simulations confirm the theoretical results.
In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), the publicly available data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among the available models in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 illustrates the overall significant model that encompasses almost all the user's requirements, unlike the classical metrics that use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model parameters. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take the knowledge of domain experts into account, as understanding of the business domain has a significant impact.
This paper briefly describes the configuration and performance of large gas turbines, and of the combined cycle power plants built around them, designed and produced by four large, renowned gas turbine manufacturers worldwide, providing a reference for the relevant sectors and enterprises in importing advanced gas turbines and technologies.
This study focuses on meeting the challenges of big data visualization by using data reduction based on feature selection methods, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We apply the embedded "Select from model" (SFM) method using the random forest importance (RFI) algorithm and compare it with the filter method "Select percentile" (SP) based on the chi-square (Chi2) test for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of KNN in Python on eight data sets to see which method performs best when feature selection methods are applied. The study concludes that the feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR: SFM based on the RFI algorithm for feature selection, with the LR algorithm for data classification. The proposal's efficacy was demonstrated by comparing its results with recent literature.
In this paper, based on the theory of parameter estimation, we give a selection method that we consider reasonable in the sense that the resulting parameter estimates have good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
An improved social force model based on exit selection is proposed to simulate pedestrians' microscopic behaviors in subway stations. The modification lies in considering three factors: spatial distance, occupant density and exit width. In addition, the problem of pedestrians changing exits too frequently is solved as follows: not switching to other exits within the affected area of one exit, using the probability of remaining with the preceding exit, and invoking the exit selection function only after several simulation steps. Pedestrians in subway stations have some special characteristics, such as explicit destinations and different degrees of familiarity with the station. Finally, Beijing Zoo Subway Station is taken as an example, and the feasibility of the model results is verified by comparing the actual data with the simulation data. The simulation results show that the improved model can depict the microscopic behaviors of pedestrians in subway stations.
The traditional model selection criteria try to strike a balance between fitting error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, must be made before using the traditional criteria. In this article, we give a new model selection criterion with fewer assumptions: based on the assumption that the noise term in the model is independent of the explanatory variables, it minimizes the association strength between regression residuals and the response. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third controlling random variable. In addition, the definition of a general partial relationship is given.
We present Turing pattern selection in a reaction-diffusion epidemic model under zero-flux boundary conditions. The value of this study is twofold. First, it establishes the amplitude equations for the excited modes, which determine the stability of amplitudes towards uniform and inhomogeneous perturbations. Second, it illustrates all five categories of Turing patterns close to the onset of Turing bifurcation via numerical simulations, which indicate that the model dynamics exhibit complex pattern replication: on increasing the control parameter v, the sequence "H0 hexagons → H0-hexagon-stripe mixtures → stripes → Hπ-hexagon-stripe mixtures → Hπ hexagons" is observed. This may enrich the pattern dynamics in a diffusive epidemic model.
The performance of six statistical approaches that can be used to select the best model for describing the growth of individual fish was analyzed using simulated and real length-at-age data. The six approaches are the coefficient of determination (R²), adjusted coefficient of determination (adj.-R²), root mean squared error (RMSE), Akaike's information criterion (AIC), bias-corrected AIC (AICc) and Bayesian information criterion (BIC). The simulation data were generated by five growth models with different numbers of parameters. Four sets of real data were taken from the literature. The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of an additive error structure for the data. The model best supported by the data was identified using each of the six approaches. The results show that R² and RMSE have the same properties and perform worst. The sample size has an effect on the performance of adj.-R², AIC, AICc and BIC. Adj.-R² does better in small samples than in large samples. AIC is not suitable for small samples and tends to select a more complex model when the sample size becomes large. AICc and BIC perform best in small and large samples, respectively. Use of AICc or BIC is recommended for selecting a fish growth model, depending on the size of the length-at-age data set.
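As a rough illustration of three of the criteria named in the abstract above, the following minimal sketch computes AIC, AICc and BIC from a residual sum of squares. It assumes Gaussian additive errors, so that the maximized log-likelihood reduces to n·ln(RSS/n) plus a constant that is dropped because it is the same for every candidate model. The function name `information_criteria` and the example numbers are hypothetical and not taken from the paper, and whether `k` counts the error variance is a convention the abstract does not state.

```python
import math


def information_criteria(rss, n, k):
    """Return AIC, AICc and BIC for a least-squares fit.

    rss: residual sum of squares of the fitted model
    n:   number of observations
    k:   number of estimated parameters (counting convention is an assumption)
    """
    if rss <= 0 or n - k - 1 <= 0:
        raise ValueError("need rss > 0 and n > k + 1")
    # -2 * log-likelihood under Gaussian additive errors, constant dropped
    neg2_loglik = n * math.log(rss / n)
    aic = neg2_loglik + 2 * k
    # small-sample correction; vanishes as n grows relative to k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = neg2_loglik + k * math.log(n)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical comparison of a 3-parameter and a 4-parameter growth model
# fitted to the same 20 length-at-age observations.
simple = information_criteria(rss=52.0, n=20, k=3)
richer = information_criteria(rss=50.5, n=20, k=4)
for name in ("AIC", "AICc", "BIC"):
    better = "simple" if simple[name] < richer[name] else "richer"
    print(name, round(simple[name], 3), round(richer[name], 3), "->", better)
```

For each criterion the model with the lower value is preferred. The AICc correction term grows sharply as k approaches n, which is consistent with the abstract's recommendation to use AICc for small samples.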
Covariance functions have been proposed as an alternative for modeling longitudinal data in animal breeding because of their various merits in comparison with the classical analytical methods. In practical estimation, the models and polynomial orders fitted can influence the estimates of covariance functions and thus of genetic parameters. The objective of this study was to select a model for estimating covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects involved were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. Selection of the model and orders of fit was carried out using the likelihood ratio test and 4 types of information criteria. The results showed that a model with polynomial order 6 for the direct additive genetic and animal permanent environmental effects, and orders 4 and 5 for the maternal genetic and maternal permanent environmental effects, respectively, was preferable for estimating covariance functions. Whether or not maternal effects were included in the model greatly influenced the estimates of covariance functions. The maternal permanent environmental effect does not explain all of the permanent environmental variation well, suggesting that different sources of permanent environmental effects also have a large influence on covariance function estimates.
Test selection and optimization (TSO) can improve the abilities of fault diagnosis, prognosis and health-state evaluation for prognostics and health management (PHM) systems. Traditionally, TSO mainly focuses on fault detection and isolation, but it cannot provide effective guidance for design for testability (DFT) to improve the PHM performance level. To solve this problem, a model of TSO for PHM systems is proposed. Firstly, by integrating the characteristics of fault severity and propagation time, and analyzing the test timing and sensitivity, a testability model based on the failure evolution mechanism model (FEMM) for PHM systems is built. This model describes the fault evolution-test dependency using the fault-symptom parameter matrix and the symptom parameter-test matrix. Secondly, a novel method of inherent testability analysis for PHM systems is developed based on the above information. From the results of this analysis, a TSO model whose objective is to maximize fault trackability and minimize the test cost is proposed, and an adaptive simulated annealing genetic algorithm (ASAGA) is introduced to solve the TSO problem. Finally, a case study of a centrifugal pump system is used to verify the feasibility and effectiveness of the proposed models and methods. The results show that the proposed technology is important for PHM systems to select and optimize the test set in order to improve their performance level.
Genomic selection (GS) can be used to accelerate genetic improvement by shortening the selection interval. The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value (GEBV). This study is a first attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits. The performance of GS models in L. vannamei was evaluated in a population consisting of 205 individuals, which were genotyped for 6 359 single nucleotide polymorphism (SNP) markers by specific length amplified fragment sequencing (SLAF-seq) and phenotyped for body length and body weight. Three GS models (RR-BLUP, Bayes A, and Bayesian LASSO) were used to obtain the GEBV, and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes. The mean reliability of the GEBVs for body length and body weight predicted by the different models was 0.296 and 0.411, respectively. For each trait, the performances of the three models were very similar to each other with respect to predictability. The regression coefficients estimated by the three models were close to one, suggesting near-zero bias for the predictions. Therefore, when GS was applied in an L. vannamei population for the studied scenarios, all three models appeared practicable. Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.
Modern medicine is reliant on various medical imaging technologies for non-invasively observing patients' anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase, as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
In this paper we reparameterize covariance structures in longitudinal data analysis through their modified Cholesky decomposition. Based on this modified Cholesky decomposition, the within-subject covariance matrix is decomposed into a unit lower triangular matrix involving moving average coefficients and a diagonal matrix involving innovation variances, which are modeled as linear functions of covariates. Then, we propose a penalized maximum likelihood method for variable selection in joint mean and covariance models based on this decomposition. Under certain regularity conditions, we establish the consistency and asymptotic normality of the penalized maximum likelihood estimators of parameters in the models. Simulation studies are undertaken to assess the finite sample performance of the proposed variable selection procedure.
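The decomposition described in the abstract above, a covariance matrix written as a unit lower triangular factor times a diagonal matrix times the transpose of that factor, can be sketched numerically as follows. This is only an illustration of the factorization Σ = L D Lᵀ for a given positive definite matrix. It does not implement the paper's covariate modeling or its penalized likelihood, and the function name `ldl_decompose` and the example matrix are hypothetical.

```python
def ldl_decompose(sigma):
    """Factor a symmetric positive definite matrix as L * diag(d) * L^T.

    Returns (L, d), where L is unit lower triangular (ones on the diagonal)
    and d holds the diagonal entries, which play the role of the
    innovation variances in the abstract's terminology.
    """
    n = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # diagonal entry: what is left of sigma[j][j] after earlier columns
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] <= 0:
            raise ValueError("matrix is not positive definite")
        for i in range(j + 1, n):
            # below-diagonal entry of column j
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    return L, d


# Hypothetical 3x3 within-subject covariance matrix.
sigma = [
    [4.0, 2.0, 1.0],
    [2.0, 3.0, 1.5],
    [1.0, 1.5, 2.0],
]
L, d = ldl_decompose(sigma)
print("L =", L)
print("d =", d)
```

Because L has ones on its diagonal and the entries of d only need to be positive, the below-diagonal entries of L and the logarithms of d are unconstrained, which is the usual reason this kind of parameterization is convenient for regression-style modeling of covariance structures.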
Funding: supported by the National Natural Science Foundation of China (62371098), the Natural Science Foundation of Sichuan Province (2023NSFSC1422), the National Key Research and Development Program of China (2021YFB2900404), and the Central Universities of Southwest Minzu University (ZYN2022032).
Abstract: This research delves into the hurdles and strategies aimed at augmenting the market involvement of smallholder carrot farmers in Nakuru County, Kenya. Employing a Multinomial Logit (MNL) model, it scrutinizes the factors influencing the selection of marketing outlets among carrot farmers. The findings unveil that a significant majority (81%) of surveyed farmers actively participate in diverse market outlets, encompassing the farm gate, cleaning point, local market, external market, and export market. Notably, pivotal buyers include aggregators, brokers, wholesalers, retailers, and consumers, with transactions predominantly occurring at the farm level. Additionally, the analysis discerns substantial influences of socio-economic characteristics, experiential factors, and geographical proximity on farmers’ choices of market outlets. Specifically, gender, age, land size, farming experience, and distance to markets emerge as critical determinants. Moreover, the study examines market margins along the carrot value chain, shedding light on the potential profitability of carrot farming in the region. Remarkably, higher average gross margins are identified in export and external markets, signaling lucrative prospects for farmers targeting these segments. However, disparities in profit distribution between farmers and traders underscore the necessity for interventions to ensure equitable value distribution throughout the value chain. These findings underscore the imperative for tailored interventions to tackle challenges and foster inclusive agricultural development. Strategies such as farmer organizations, contracting, and vertical integration are advocated to enhance market access and profitability for smallholder carrot farmers. Thus, this study enriches our comprehension of the dynamics within carrot value chains and provides valuable insights for policymakers and development practitioners aiming to uplift rural livelihoods and bolster food security.
Funding: Supported by the National Key Research and Development Program of China (2021YFD1201103-01-05).
Abstract: Soybean frogeye leaf spot (FLS) is a global disease affecting soybean yield, especially in the soybean growing area of Heilongjiang Province. In order to realize genomic selection breeding for FLS resistance in soybean, least absolute shrinkage and selection operator (LASSO) regression and stepwise regression were combined, and a genomic selection model was established for 40 002 SNP markers covering the soybean genome and the relative lesion area of soybean FLS. As a result, 68 molecular markers controlling soybean FLS were detected accurately, and the phenotypic contribution rate of these markers reached 82.45%. The model established in this study could be used directly to evaluate soybean resistance to FLS and to select excellent offspring. This research method could also provide ideas and methods for disease-resistance breeding in other plants.
Funding: National Natural Science Foundation of China (61871384, 61921001).
Abstract: The optimal selection of a radar clutter model is the premise of target detection, tracking, recognition, and cognitive waveform design in a clutter background. Clutter characterization models are usually derived by mathematical simplification or empirical data fitting. However, the lack of standard model labels is a challenge in the optimal selection process. To solve this problem, a general three-level evaluation system for model selection performance is proposed, including a model selection accuracy index based on simulation data, goodness-of-fit indices based on the optimally selected model, and an evaluation index based on how well the selected model supports third-party applications. The three-level evaluation system can describe the selection performance of the radar clutter model more comprehensively and accurately from different perspectives, and can be extended to the evaluation of other similar characterization model selection problems.
Funding: National Natural Science Foundation of China (Grant Nos. 10971009, 11771033, and 12201046); Fundamental Research Funds for the Central Universities (Grant No. BLX201925); China Postdoctoral Science Foundation (Grant No. 2020M670175).
Abstract: We investigate the Turing instability and pattern formation mechanism of a plant-wrack model with both self-diffusion and cross-diffusion terms. We first study the effect of self-diffusion on the stability of equilibrium. We then derive the conditions for the occurrence of the Turing patterns induced by cross-diffusion based on self-diffusion stability. Next, we analyze the pattern selection by using the amplitude equation and obtain the exact parameter ranges of different types of patterns, including stripe patterns, hexagonal patterns and mixed states. Finally, numerical simulations confirm the theoretical results.
Abstract: In a competitive digital age where data volumes keep growing, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on that knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The publicly available 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), used as the data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among those available in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 indicates an overall significant model that encompasses almost all the user’s requirements, unlike the classical metrics, which use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model's parameters. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take the knowledge of domain experts into account, as understanding of the business domain has a significant impact.
Abstract: This paper summarizes the configuration and performance of large gas turbines, and of the combined cycle power plants built around them, designed and produced by four large, renowned gas turbine manufacturers in the world, providing a reference for the relevant sectors and enterprises in importing advanced gas turbines and technologies.
Abstract: This study addresses the challenges of big data visualization by using data reduction based on feature selection methods, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We apply the embedded “Select from model (SFM)” method, using the “Random forest Importance algorithm (RFI)”, and compare it with the filter “Select percentile (SP)” method, based on the chi-square “Chi2” tool, for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of the KNN approach in Python on eight data sets to see which method produces the best rating when feature selection methods are applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR: using SFM based on the RFI algorithm for feature selection, with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
Funding: Supported by the Natural Science Foundation of Anhui Education Committee.
Abstract: Based on the theory of parameter estimation, this paper gives a selection method that we consider very reasonable in the sense that it yields parameter estimates with good properties. Moreover, we offer a method for calculating the selection statistic and an applied example.
Funding: Project (T14JB00200) supported by the Fundamental Research Funds for the Central Universities, China; Projects (RCS2012ZZ002, RCS2012ZT003) supported by the State Key Laboratory of Rail Traffic Control and Safety, China.
Abstract: An improved social force model based on exit selection is proposed to simulate pedestrians' microscopic behaviors in a subway station. The modification lies in considering three factors: spatial distance, occupant density and exit width. In addition, the problem of pedestrians switching exits too frequently is addressed as follows: pedestrians do not change to other exits while inside the affected area of one exit, a probability of keeping the previously chosen exit is used, and the exit selection function is invoked only after several simulation steps. Pedestrians in a subway station have some special characteristics, such as explicit destinations and different degrees of familiarity with the station. Finally, Beijing Zoo Subway Station is taken as an example, and the feasibility of the model results is verified by comparing actual data with simulation data. The simulation results show that the improved model can depict the microscopic behaviors of pedestrians in a subway station.
Funding: Partly supported by the National Basic Research Program of China (973 Program, 2011CB707802, 2013CB910200) and the National Science Foundation of China (11201466).
Abstract: Traditional model selection criteria try to strike a balance between fitting error and model complexity. Assumptions on the distribution of the response or the noise, which may be misspecified, must be made before using them. In this article, we give a new model selection criterion, based on the assumption that the noise term in the model is independent of the explanatory variables, which minimizes the association strength between regression residuals and the response and requires fewer assumptions. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing the effect of a third controlling random variable. In addition, the definition of a general partial relationship is given.
Funding: Project supported by the Natural Science Foundation of Zhejiang Province of China (Grant No. Y7080041).
Abstract: We present Turing pattern selection in a reaction-diffusion epidemic model under zero-flux boundary conditions. The value of this study is twofold. First, it establishes the amplitude equations for the excited modes, which determine the stability of amplitudes towards uniform and inhomogeneous perturbations. Second, it illustrates all five categories of Turing patterns close to the onset of Turing bifurcation via numerical simulations, which indicate that the model dynamics exhibit complex pattern replication: on increasing the control parameter v, the sequence "H0 hexagons → H0-hexagon-stripe mixtures → stripes → Hπ-hexagon-stripe mixtures → Hπ hexagons" is observed. This may enrich the pattern dynamics in a diffusive epidemic model.
Funding: Supported by the High Technology Research and Development Program of China (863 Program, No. 2006AA100301).
Abstract: The performance of six statistical approaches that can be used to select the best model for describing the growth of individual fish was analyzed using simulated and real length-at-age data. The six approaches are the coefficient of determination (R²), adjusted coefficient of determination (adj.-R²), root mean squared error (RMSE), Akaike's information criterion (AIC), bias-corrected AIC (AICc) and Bayesian information criterion (BIC). The simulation data were generated by five growth models with different numbers of parameters. Four sets of real data were taken from the literature. The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of an additive error structure for the data. The model best supported by the data was identified using each of the six approaches. The results show that R² and RMSE have the same properties and perform worst. The sample size has an effect on the performance of adj.-R², AIC, AICc and BIC. Adj.-R² does better in small samples than in large samples. AIC is not suitable for small samples and tends to select a more complex model when the sample size becomes large. AICc and BIC perform best in small and large samples, respectively. Use of AICc or BIC is recommended for selecting a fish growth model, according to the size of the length-at-age data set.
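As a rough illustration of the three information criteria compared in the abstract above, the following sketch computes AIC, AICc and BIC from a residual sum of squares under an additive Gaussian error model. The function name `information_criteria`, the convention of counting the error variance among the k estimated parameters, and the example numbers are assumptions made for illustration; they are not code or values from the study.

```python
import math


def information_criteria(rss, n, k):
    """Return (AIC, AICc, BIC) for a least-squares fit with additive Gaussian errors.

    rss: residual sum of squares of the fitted model (must be positive)
    n:   number of observations
    k:   number of estimated parameters (assumed here to include the error variance)
    """
    if rss <= 0:
        raise ValueError("rss must be positive")
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    # Maximized Gaussian log-likelihood, with the variance estimated as rss / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    aic = 2 * k - 2 * log_lik
    # Small-sample correction; it shrinks towards zero as n grows.
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_lik
    return aic, aicc, bic


# Hypothetical fit: 10 observations, 3 estimated parameters, RSS of 5.0.
aic, aicc, bic = information_criteria(rss=5.0, n=10, k=3)
```

For each criterion the candidate model with the lowest value is preferred. Because the AICc correction term depends on n - k - 1, it matters most for small data sets, which is in line with the sample-size dependence reported in the abstract.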
Funding: Funded by the Young Academic Leaders Supporting Project in Institutions of Higher Education of Shanxi Province, China.
Abstract: Covariance functions have been proposed as an alternative for modeling longitudinal data in animal breeding because of their various merits in comparison with classical analytical methods. In practical estimation, the models and polynomial orders fitted can influence the estimates of covariance functions and thus of genetic parameters. The objective of this study was to select a model for estimating covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects involved were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. Selection of the model and orders of fit was carried out by likelihood ratio test and 4 types of information criteria. The results showed that a model with 6 orders of polynomial fit for direct additive genetic and animal permanent environmental effects, and 4 and 5 orders for maternal genetic and maternal permanent environmental effects, respectively, was preferable for estimation of covariance functions. Whether or not maternal effects were included in the model greatly influenced the estimates of covariance functions. The maternal permanent environmental effect does not explain the variation of all permanent environments well, suggesting that different sources of permanent environmental effects also have a large influence on covariance function estimates.
Funding: Supported by the National Natural Science Foundation of China (51175502).
Abstract: Test selection and optimization (TSO) can improve the abilities of fault diagnosis, prognosis and health-state evaluation for prognostics and health management (PHM) systems. Traditionally, TSO mainly focuses on fault detection and isolation, but it cannot provide an effective guide for the design for testability (DFT) to improve the PHM performance level. To solve the problem, a model of TSO for PHM systems is proposed. Firstly, through integrating the characteristics of fault severity and propagation time, and analyzing the test timing and sensitivity, a testability model based on a failure evolution mechanism model (FEMM) for PHM systems is built up. This model describes the fault evolution-test dependency using the fault-symptom parameter matrix and symptom parameter-test matrix. Secondly, a novel method of inherent testability analysis for PHM systems is developed based on the above information. Having completed the analysis, a TSO model, whose objective is to maximize fault trackability and minimize the test cost, is proposed through the inherent testability analysis results, and an adaptive simulated annealing genetic algorithm (ASAGA) is introduced to solve the TSO problem. Finally, a case of a centrifugal pump system is used to verify the feasibility and effectiveness of the proposed models and methods. The results show that the proposed technology is important for PHM systems to select and optimize the test set in order to improve their performance level.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2012AA10A404), the National Natural Science Foundation of China (No. 31502161), and the Qingdao National Laboratory for Marine Science and Technology (No. 2015ASKJ02).
Abstract: Genomic selection (GS) can be used to accelerate genetic improvement by shortening the selection interval. The successful application of GS depends largely on the accuracy of the prediction of genomic estimated breeding value (GEBV). This study is a first attempt to understand the practicality of GS in Litopenaeus vannamei and aims to evaluate models for GS on growth traits. The performance of GS models in L. vannamei was evaluated in a population consisting of 205 individuals, which were genotyped for 6 359 single nucleotide polymorphism (SNP) markers by specific length amplified fragment sequencing (SLAF-seq) and phenotyped for body length and body weight. Three GS models (RR-BLUP, Bayes A, and Bayesian LASSO) were used to obtain the GEBV, and their predictive ability was assessed by the reliability of the GEBV and the bias of the predicted phenotypes. The mean reliability of the GEBVs for body length and body weight predicted by the different models was 0.296 and 0.411, respectively. For each trait, the performances of the three models were very similar to each other with respect to predictability. The regression coefficients estimated by the three models were close to one, suggesting near-zero bias for the predictions. Therefore, when GS was applied in a L. vannamei population for the studied scenarios, all three models appeared practicable. Further analyses suggested that improved estimation of the genomic prediction could be realized by increasing the size of the training population as well as the density of SNPs.
Funding: Supported in part by the National Natural Science Foundation of China (82072019), the Shenzhen Basic Research Program (JCYJ20210324130209023), the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019), the Mainland-Hong Kong Joint Funding Scheme (MHKJFS) (MHP/005/20), the Project of Strategic Importance Fund (P0035421) and the Projects of RISA (P0043001) from the Hong Kong Polytechnic University, the Natural Science Foundation of Jiangsu Province (BK20201441), the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038, SBGJ202102056), the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015), the Natural Science Foundation of Henan Province (222300420575), and the Henan Province Science and Technology Research (222102310322).
Abstract: Modern medicine is reliant on various medical imaging technologies for non-invasively observing patients’ anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Abstract: In this paper we reparameterize covariance structures in longitudinal data analysis through their modified Cholesky decomposition. Based on this decomposition, the within-subject covariance matrix is decomposed into a unit lower triangular matrix involving moving average coefficients and a diagonal matrix involving innovation variances, which are modeled as linear functions of covariates. Then, we propose a penalized maximum likelihood method for variable selection in joint mean and covariance models based on this decomposition. Under certain regularity conditions, we establish the consistency and asymptotic normality of the penalized maximum likelihood estimators of parameters in the models. Simulation studies are undertaken to assess the finite sample performance of the proposed variable selection procedure.
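The decomposition described in the abstract above can be sketched as follows, assuming it takes the form Σ = L D Lᵀ with L unit lower triangular (its below-diagonal entries playing the role of moving average coefficients) and D diagonal (the innovation variances). The function name `modified_cholesky` and the example matrix are hypothetical, and the sketch covers only the decomposition itself, not the covariate modeling or the penalized likelihood step.

```python
def modified_cholesky(sigma):
    """Decompose a symmetric positive-definite matrix as sigma = L * diag(d) * L^T.

    Returns (L, d): L is unit lower triangular (list of rows) and d holds the
    diagonal entries, interpreted here as innovation variances.
    """
    n = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Innovation variance: what remains of sigma[j][j] after earlier columns.
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] <= 0:
            raise ValueError("matrix is not positive definite")
        for i in range(j + 1, n):
            # Below-diagonal entry of column j, scaled by the innovation variance.
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    return L, d


# Hypothetical 3 x 3 within-subject covariance matrix.
sigma = [[4.0, 2.0, 2.0],
         [2.0, 5.0, 3.0],
         [2.0, 3.0, 6.0]]
L, d = modified_cholesky(sigma)
```

One appeal of this parameterization is that the entries of L are unconstrained and the entries of d only need to be positive, so a reconstructed covariance matrix stays positive definite whatever values a regression model assigns to them.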