Modern medicine relies on various medical imaging technologies for non-invasively observing patients’ anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Cleats are the dominant micro-fracture network controlling the macro-mechanical behavior of coal. Improved understanding of the spatial characteristics of cleat networks is therefore important to the coal mining industry. Discrete fracture networks (DFNs) are increasingly used in engineering analyses to spatially model fractures at various scales. The reliability of coal DFNs largely depends on the confidence in the input cleat statistics. Estimates of these parameters can be made from image-based three-dimensional (3D) characterization of coal cleats using X-ray micro-computed tomography (μCT). One key step in this process, after cleat extraction, is the separation of individual cleats, without which the cleats are a connected network and statistics for different cleat sets cannot be measured. In this paper, a feature extraction-based image processing method is introduced to identify and separate distinct cleat groups from 3D X-ray μCT images. Kernels (filters) representing explicit cleat features of coal are built, and cleat separation is successfully achieved by convolutional operations on 3D coal images. The new method is applied to a coal specimen, 80 mm in diameter and 100 mm in length, acquired from an Anglo American Steelmaking Coal mine in the Bowen Basin, Queensland, Australia. It is demonstrated that the new method produces reliable cleat separation capable of defining individual cleats and preserving 3D topology after separation. Bedding-parallel fractures, which have historically been challenging to delineate and are rarely reported, are also identified and separated. A variety of cleat/fracture statistics are measured, which can not only quantitatively characterize the cleat/fracture system but also be used for DFN modeling. Finally, variability and heterogeneity with respect to the core axis are investigated. Significant heterogeneity is observed and suggests that the representative elementary volume (REV) of the cleat groups for engineering purposes may be a complex problem requiring careful consideration.
Medical Internet of Things (IoT) devices are becoming more and more common in healthcare. This has created a huge need for advanced predictive health modeling strategies that can make good use of the growing amount of multimodal data to find potential health risks early and help individuals in a personalized way. Existing methods, while useful, have limitations in predictive accuracy, delay, personalization, and user interpretability, requiring a more comprehensive and efficient approach to harness modern medical IoT devices. MAIPFE is a multimodal approach integrating pre-emptive analysis, personalized feature selection, and explainable AI for real-time health monitoring and disease detection. By using AI for early disease detection, personalized health recommendations, and transparency, healthcare will be transformed. The Multimodal Approach Integrating Pre-emptive Analysis, Personalized Feature Selection, and Explainable AI (MAIPFE) framework, which combines Firefly Optimizer, Recurrent Neural Network (RNN), Fuzzy C Means (FCM), and Explainable AI, improves disease detection precision over existing methods. Comprehensive metrics show the model’s superiority in real-time health analysis. The proposed framework outperformed existing models by 8.3% in disease detection classification precision, 8.5% in accuracy, 5.5% in recall, 2.9% in specificity, 4.5% in AUC (Area Under the Curve), and 4.9% in delay reduction. Disease prediction precision increased by 4.5%, accuracy by 3.9%, recall by 2.5%, specificity by 3.5%, AUC by 1.9%, and delay levels decreased by 9.4%. MAIPFE can revolutionize healthcare with preemptive analysis, personalized health insights, and actionable recommendations. The research shows that this innovative approach improves patient outcomes and healthcare efficiency in the real world.
This work highlights the unparalleled efficiency of the “nth-Order Function/Feature Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (nth-FASAM-N) by considering the well-known Nordheim-Fuchs reactor dynamics/safety model. This model describes a short-time self-limiting power excursion in a nuclear reactor system having a negative temperature coefficient in which a large amount of reactivity is suddenly inserted, either intentionally or by accident. This nonlinear paradigm model is sufficiently complex to model realistically self-limiting power excursions for short times yet admits closed-form exact expressions for the time-dependent neutron flux, temperature distribution and energy released during the transient power burst. The nth-FASAM-N methodology is compared to the extant “nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (nth-CASAM-N), showing that: (i) the 1st-FASAM-N and the 1st-CASAM-N methodologies are equally efficient for computing the first-order sensitivities; each methodology requires a single large-scale computation for solving the “First-Level Adjoint Sensitivity System” (1st-LASS); (ii) the 2nd-FASAM-N methodology is considerably more efficient than the 2nd-CASAM-N methodology for computing the second-order sensitivities since the number of feature-functions is much smaller than the number of primary parameters; specifically for the Nordheim-Fuchs model, the 2nd-FASAM-N methodology requires 2 large-scale computations to obtain all of the exact expressions of the 28 distinct second-order response sensitivities with respect to the model parameters, while the 2nd-CASAM-N methodology requires 7 large-scale computations for obtaining these 28 second-order sensitivities; (iii) the 3rd-FASAM-N methodology is even more efficient than the 3rd-CASAM-N methodology: only 2 large-scale computations are needed to obtain the exact expressions of the 84 distinct third-order response sensitivities with respect to the Nordheim-Fuchs model’s parameters when applying the 3rd-FASAM-N methodology, while the application of the 3rd-CASAM-N methodology requires at least 22 large-scale computations for computing the same 84 distinct third-order sensitivities. Together, the nth-FASAM-N and the nth-CASAM-N methodologies are the most practical methodologies for computing response sensitivities of any order comprehensively and accurately, overcoming the curse of dimensionality in sensitivity analysis.
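The counts quoted in the abstract above (28 distinct second-order and 84 distinct third-order sensitivities) agree with the number of distinct k-th order partial derivatives of a response with respect to n parameters, C(n + k - 1, k), if the Nordheim-Fuchs model is taken to have n = 7 parameters. That parameter count is an assumption inferred from the quoted figures; the abstract does not state it. A minimal sketch of the arithmetic, with `distinct_sensitivities` as a hypothetical helper name:

```python
from math import comb


def distinct_sensitivities(n_params: int, order: int) -> int:
    """Count distinct mixed partial derivatives of a given order.

    Mixed partials that differ only in the order of differentiation are
    counted once, so this is the number of multisets of size `order`
    drawn from `n_params` parameters.
    """
    return comb(n_params + order - 1, order)


# Assumption: 7 parameters, inferred from the quoted counts (not stated in the abstract).
N_PARAMS = 7

second_order = distinct_sensitivities(N_PARAMS, 2)
third_order = distinct_sensitivities(N_PARAMS, 3)
print(second_order, third_order)
```

Under this assumption the count grows polynomially with the order, which is why the number of large-scale adjoint computations per order (2 versus 7 or 22 in the abstract) matters more than the raw number of sensitivities.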
This work presents the “nth-Order Feature Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (abbreviated as “nth-FASAM-N”), which will be shown to be the most efficient methodology for computing exact expressions of sensitivities, of any order, of model responses with respect to features of model parameters and, subsequently, with respect to the model’s uncertain parameters, boundaries, and internal interfaces. The unparalleled efficiency and accuracy of the nth-FASAM-N methodology stem from the maximal reduction of the number of adjoint computations (which are considered to be “large-scale” computations) for computing high-order sensitivities. When applying the nth-FASAM-N methodology to compute the second- and higher-order sensitivities, the number of large-scale computations is proportional to the number of “model features” as opposed to being proportional to the number of model parameters (which are considerably more numerous than the features). When a model has no “feature” functions of parameters, but only comprises primary parameters, the nth-FASAM-N methodology becomes identical to the extant nth-CASAM-N (“nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems”) methodology. Both the nth-FASAM-N and the nth-CASAM-N methodologies are formulated in linearly increasing higher-dimensional Hilbert spaces, as opposed to exponentially increasing parameter-dimensional spaces, thus overcoming the curse of dimensionality in sensitivity analysis of nonlinear systems. Both the nth-FASAM-N and the nth-CASAM-N are incomparably more efficient and more accurate than any other methods (statistical, finite differences, etc.) for computing exact expressions of response sensitivities of any order with respect to the model’s features and/or primary uncertain parameters, boundaries, and internal interfaces.
Since leaks in high-pressure pipelines transporting crude oil can cause severe economic losses, a reliable leak risk assessment can assist in developing an effective pipeline maintenance plan and avoiding unexpected incidents. Fast and accurate leak detection methods are essential for maintaining pipeline safety in pipeline reliability engineering. Current oil pipeline leakage signals are insufficient for feature extraction, while the training time for traditional leakage prediction models is too long. A new leak detection method is proposed based on time-frequency features and the Genetic Algorithm-Levenberg Marquardt (GA-LM) classification model for predicting the leakage status of oil pipelines. The processed signal is transformed to the time and frequency domains, allowing full expression of the original signal. The traditional Back Propagation (BP) neural network is optimized by the Genetic Algorithm (GA) and Levenberg Marquardt (LM) algorithms. The results show that the recognition effect of a combined feature parameter is superior to that of a single feature parameter. The Accuracy, Precision, Recall, and F1 score of the GA-LM model are 95%, 93.5%, 96.7%, and 95.1%, respectively, which shows that the GA-LM model has a good predictive effect and excellent stability for positive and negative samples. The proposed GA-LM model can markedly reduce training time and improve recognition efficiency. In addition, considering that a large number of samples are required for model training, a wavelet threshold method is proposed to generate sample data with higher reliability. The research results can provide an effective theoretical and technical reference for the leakage risk assessment of actual oil pipelines.
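The four metrics reported in the abstract above are related: the F1 score is the harmonic mean of precision and recall, so the quoted 95.1% can be checked against the quoted 93.5% and 96.7%. The sketch below uses the standard definitions of these metrics; the function names and the confusion-matrix counts in the second example are hypothetical illustrations, not values from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted leaks that are real leaks."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of real leaks that are detected."""
    return tp / (tp + fn)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# Consistency check using the precision and recall quoted in the abstract.
reported_f1 = round(100 * f1_score(0.935, 0.967), 1)
print(reported_f1)

# Hypothetical confusion-matrix counts, for illustration only.
tp, fp, fn, tn = 90, 10, 5, 95
print(precision(tp, fp), recall(tp, fn), accuracy(tp, fp, fn, tn))
```

The first check reproduces the reported F1 score to one decimal place, which suggests the quoted precision, recall, and F1 values are mutually consistent.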
ORB-SLAM2, which is based on the constant velocity model, has difficulty determining the search window for the reprojection of map points when objects move at variable velocity, which leads to false matching and, in turn, inaccurate pose estimation or failed tracking. To address this challenge, a new method of feature point matching is proposed in this paper, which combines a variable velocity model with the reverse optical flow method. First, the constant velocity model is extended to a new variable velocity model, and the extended variable velocity model is used to provide the initial pixel shift for the reverse optical flow method. Then the search range of feature points is accurately determined according to the results of the reverse optical flow method, thereby improving the accuracy and reliability of feature matching and strengthening interframe tracking. Finally, we tested the method on the TUM data set, based on an RGB-D camera. Experimental results show that this method can reduce the probability of tracking failure and improve localization accuracy in SLAM (Simultaneous Localization and Mapping) systems. Compared with the traditional ORB-SLAM2, the test error of this method on each sequence in the TUM data set is significantly reduced, and the root mean square error is only 63.8% of that of the original system under the optimal condition.
In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The publicly available 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), used as the data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among the available models in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 identifies the overall significant model that encompasses almost all of the user’s requirements, unlike the classical metrics that use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model’s parameters. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take domain experts into account, as the understanding of the business domain has a significant impact.
Background: Early singular nodular hepatocellular carcinoma (HCC) is an ideal surgical indication in clinical practice. However, almost half of the patients have tumor recurrence, and there is no reliable prognostic prediction tool. Besides, it is unclear whether preoperative neoadjuvant therapy is necessary for patients with early singular nodular HCC and which patients need it. It is critical to identify the patients with a high risk of recurrence and to treat them preoperatively with neoadjuvant therapy, and thus to improve their outcomes. The present study aimed to develop two prognostic models to preoperatively predict the recurrence-free survival (RFS) and overall survival (OS) in patients with singular nodular HCC by integrating the clinical data and radiological features. Methods: We retrospectively recruited 211 patients with singular nodular HCC from December 2009 to January 2019 at Eastern Hepatobiliary Surgery Hospital (EHBH). They all met the surgical indications and underwent radical resection. We randomly divided the patients into the training cohort (n=132) and the validation cohort (n=79). We established and validated multivariate Cox proportional hazard models using the preoperative clinicopathologic factors and radiological features for association with RFS and OS. By analyzing the receiver operating characteristic (ROC) curve, the discrimination accuracy of the models was compared with that of the traditional predictive models. Results: Our RFS model was based on HBV-DNA score, cirrhosis, tumor diameter and tumor capsule in imaging. The RFS nomogram had fine calibration and discrimination capabilities, with a C-index of 0.74 (95% CI: 0.68-0.80). The OS nomogram, based on cirrhosis, tumor diameter and tumor capsule in imaging, had fine calibration and discrimination capabilities, with a C-index of 0.81 (95% CI: 0.74-0.87). The area under the receiver operating characteristic curve (AUC) of our model was larger than that of the traditional liver cancer staging system, the Korea model and Nomograms in Hepatectomy Patients with Hepatitis B Virus-Related Hepatocellular Carcinoma, indicating better discrimination capability. According to the models, we fitted the linear prediction equations. These results were validated in the validation cohort. Conclusions: Compared with the previous radiography model, the newly developed predictive model was concise and applicable for predicting the postoperative survival of patients with singular nodular HCC. Our models may preoperatively identify patients with a high risk of recurrence. These patients may benefit from neoadjuvant therapy, which may improve their outcomes.
Congenital heart defects, accounting for about 30% of congenital defects, are the most common type. Data show that congenital heart defects have seriously affected the birth rate of healthy newborns. In Fetal and Neonatal Cardiology, medical imaging technology (2D ultrasound, MRI) has been shown to help detect congenital defects of the fetal heart and to assist sonographers in prenatal diagnosis. Manually recognizing the 2D fetal heart ultrasonic standard plane (FHUSP) is a highly complex task. Compared with manual identification, automatic identification through artificial intelligence can save a lot of time, ensure the efficiency of diagnosis, and improve the accuracy of diagnosis. In this study, a feature extraction method based on texture features (Local Binary Pattern, LBP, and Histogram of Oriented Gradient, HOG) combined with the Bag of Words (BOW) model is carried out, and then feature fusion is performed. Finally, a Support Vector Machine (SVM) is adopted to realize automatic recognition and classification of FHUSP. The data include 788 standard plane data sets and 448 normal and abnormal plane data sets. Compared with some other methods and the single-method model, the classification accuracy of our model is clearly improved, with the highest accuracy reaching 87.35%. Similarly, we also verify the performance of the model on normal and abnormal planes, and the average accuracy in classifying abnormal and normal planes is 84.92%. The experimental results show that this method can effectively classify and predict different FHUSP and can provide some assistance for sonographers in diagnosing fetal congenital heart disease.
It is common for datasets to contain both categorical and continuous variables. However, many feature screening methods designed for high-dimensional classification assume that the variables are continuous. This limits the applicability of existing methods in handling this complex scenario. To address this issue, we propose a model-free feature screening approach for ultra-high-dimensional multi-classification that can handle both categorical and continuous variables. Our proposed feature screening method utilizes the Maximal Information Coefficient to assess the predictive power of the variables. Under certain regularity conditions, we prove that our screening procedure possesses the sure screening and ranking consistency properties. To validate the effectiveness of our approach, we conduct simulation studies and provide real data analysis examples to demonstrate its performance in finite samples. In summary, our proposed method offers a solution for effectively screening features in ultra-high-dimensional datasets with a mixture of categorical and continuous covariates.
In ultra-high-dimensional data, it is common for the response variable to be multi-classified. Therefore, this paper proposes a model-free screening method for variables whose response variable is multi-classified, introducing the Jensen-Shannon divergence to measure the importance of covariates. The idea of the method is to calculate the Jensen-Shannon divergence between the conditional probability distribution of a covariate given each value of the response variable and the unconditional probability distribution of the covariate, and then to use the probabilities of the response variable as weights to calculate the weighted Jensen-Shannon divergence, where a larger weighted Jensen-Shannon divergence means that the covariate is more important. Additionally, we investigated an adapted version of the method for the case in which the number of categories varies across covariates; it measures the relationship between the covariates and the response variable using the weighted Jensen-Shannon divergence adjusted by the logarithmic factor of the number of categories. Then, through both theoretical analysis and simulation experiments, it was demonstrated that the proposed methods have the sure screening and ranking consistency properties. Finally, the results from simulation and real-dataset experiments show that, in feature screening, the proposed methods are robust in performance and faster in computation than an existing method.
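The screening statistic described in the abstract above can be sketched for a single categorical covariate as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the covariate is assumed to be categorical (or already discretized), probabilities are estimated by simple relative frequencies, and the "logarithmic factor" adjustment is assumed to be division by the log of the number of covariate categories.

```python
from collections import Counter
from math import log


def _kl(p, q):
    """Kullback-Leibler divergence; terms with p_i = 0 contribute nothing."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def weighted_js(x, y, adjust=False):
    """Weighted JS divergence between P(X | Y = r) and P(X), weighted by P(Y = r).

    x: covariate values (categorical), y: class labels, same length.
    adjust=True divides by log(number of categories of x); this form of the
    adjustment is an assumption, not taken from the abstract.
    """
    n = len(x)
    levels = sorted(set(x))
    x_counts = Counter(x)
    marginal = [x_counts[v] / n for v in levels]
    score = 0.0
    for cls, n_cls in Counter(y).items():
        cond_counts = Counter(xi for xi, yi in zip(x, y) if yi == cls)
        conditional = [cond_counts[v] / n_cls for v in levels]
        score += (n_cls / n) * js_divergence(conditional, marginal)
    if adjust and len(levels) > 1:
        score /= log(len(levels))
    return score


# Toy data: one covariate that determines the class, one unrelated to it.
y = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
x_informative = [0, 0, 0, 1, 1, 1, 2, 2, 2]
x_noise = [0, 1, 2, 0, 1, 2, 0, 1, 2]

informative_score = weighted_js(x_informative, y)
noise_score = weighted_js(x_noise, y)
print(informative_score, noise_score)
```

In a screening setting, one would compute this score for every covariate and keep those with the largest values; the toy covariate that determines the class scores higher than the one whose conditional distributions match its marginal.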
This paper proposes an approach to developing a feature-based parametric product modeling system suitable for integrated engineering design in a CIMS environment. The architecture of ZD-MCADII and the characteristics of each of its modules are introduced in detail. ZD-MCADII’s product data is managed by an object-oriented database management system, OSCAR, and the product model is built according to the STEP standard. The product design is established on a unified product model, and all the product data are globally associated in ZD-MCADII. ZD-MCADII provides various design features to facilitate product design, and supports the integration of CAD, CAPP and CAM.
Sanduao is an important sea-breeding bay in Fujian, South China, and holds a high economic status in aquaculture. Quickly and accurately obtaining information including the distribution area, quantity, and aquaculture area is important for breeding area planning, production value estimation, ecological survey, and storm surge prevention. However, as the aquaculture area expands, the seawater background becomes increasingly complex and spectral characteristics differ dramatically, making it difficult to determine the aquaculture area. In this study, we used a high-resolution remote-sensing satellite GF-2 image and introduced a deep-learning Richer Convolutional Features (RCF) network model to extract the aquaculture area. Then we used the density of aquaculture as an assessment index to assess the vulnerability of aquaculture areas in Sanduao. The results demonstrate that this method does not require land and water separation of the area in advance, and good extraction can be achieved in areas with more sediment and waves, with an extraction accuracy >93%, which makes it suitable for large-scale aquaculture area extraction. Vulnerability assessment results indicate that the density of aquaculture in the eastern part of Sanduao is considerably high, reaching a higher vulnerability level than other parts.
The present paper deals with the problem of assessing the local influence in a growth curve model with Rao’s simple covariance structure. Based on the likelihood displacement, the curvature measure is employed to evaluate the effects of some minor perturbations on the statistical inference, thus leading to the large curvature direction, which is the most critical diagnostic statistic in the context of the local influence analysis. As an application, the common covariance-weighted perturbation scheme is thoroughly considered.
Based on the features extracted from generalized autoregressive (GAR) model parameters of the received waveform, and the use of a multilayer perceptron (MLP) neural network classifier, a new digital modulation recognition method is proposed in this paper. Because of the better noise suppression ability of the GAR model and the powerful pattern classification capacity of the MLP neural network classifier, the new method can significantly improve the recognition performance at lower SNR with better robustness. To assess the performance of the new method, computer simulations are also performed.
Software Product Line (SPL) is a group of software-intensive systems that share common and variable resources for developing a particular system. The feature model is a tree-type structure used to manage an SPL’s common and variable features, with their different relations and the problem of Crosstree Constraints (CTC). CTC problems exist in groups of common and variable features among the sub-trees of feature models, and they are more diverse in Internet of Things (IoT) devices because different Internet devices and protocols communicate with one another. Therefore, managing the CTC problem to achieve valid product configuration in IoT-based SPL is more complex, time-consuming, and hard. However, the CTC problem still needs to be considered in previously proposed approaches such as Commonality Variability Modeling of Features (COVAMOF) and the Genarch+ tool; therefore, invalid products are generated. This research proposes a novel approach, Binary Oriented Feature Selection Crosstree Constraints (BOFS-CTC), to find all possible valid products by selecting the features according to cardinality constraints and cross-tree constraint problems in the feature model of SPL. BOFS-CTC removes the invalid products at the early stage of feature selection for the product configuration. Furthermore, this research developed the BOFS-CTC algorithm and applied it to IoT-based feature models. The findings of this research are that no relationship constraint or CTC violations occur, and that valid feature product configurations for application development are derived by removing the invalid product configurations. The accuracy of BOFS-CTC is measured by the integration sampling technique, where different valid product configurations are compared with the product configurations derived by BOFS-CTC and found to be 100% correct. Using BOFS-CTC eliminates the testing cost and development effort of invalid SPL products.
This study focuses on meeting the challenges of big data visualization by using data reduction methods based on feature selection, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We contributed to meeting the challenges of big data visualization by using the embedded “Select from model (SFM)” method with the “Random forest Importance (RFI)” algorithm and comparing it with the filter “Select percentile (SP)” method based on the chi-square (“Chi2”) tool for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of the KNN approach in Python on eight data sets to see which method produces the best rating when feature selection methods are applied. Consequently, the study concluded that the feature selection methods have a significant impact on the analysis and visualization of the data after removing the repetitive data and the data that do not affect the goal. After making several comparisons, the study suggests SFMLR, which uses SFM based on the RFI algorithm for feature selection with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
In this paper,the influence of the El NioSouthern Oscillation (ENSO) cycle on the sensitivity of nonlinear factors in the numerical simulation is investigated by conducting numerical experiments in a simple air-sea co...In this paper,the influence of the El NioSouthern Oscillation (ENSO) cycle on the sensitivity of nonlinear factors in the numerical simulation is investigated by conducting numerical experiments in a simple air-sea coupled model for ENSO prediction.Two sets of experiments are conducted in which zonal nonlinear factors,meridional nonlinear factors,or both are incorporated into the governing equations for the atmosphere or ocean.The results suggest that the ENSO cycle is very sensitive to the nonlinear factor of the governing equation for the atmosphere or ocean.Thus,incorporating nonlinearity into air-sea coupled models is of exclusive importance for improving ENSO simulation.展开更多
Funding: Supported in part by the National Natural Science Foundation of China (82072019); the Shenzhen Basic Research Program (JCYJ20210324130209023); the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019); the Mainland-Hong Kong Joint Funding Scheme (MHKJFS) (MHP/005/20), the Project of Strategic Importance Fund (P0035421), and the Projects of RISA (P0043001) from the Hong Kong Polytechnic University; the Natural Science Foundation of Jiangsu Province (BK20201441); the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038, SBGJ202102056); the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015); the Natural Science Foundation of Henan Province (222300420575); and the Henan Province Science and Technology Research (222102310322).
Abstract: Modern medicine relies on various medical imaging technologies for non-invasively observing patients’ anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored in clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of the images and prediction of various clinical endpoints. Studies have reported that radiomics performs promisingly in diagnosis and in predicting treatment response and prognosis, demonstrating its potential as a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase, as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we discuss the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Abstract: Cleats are the dominant micro-fracture network controlling the macro-mechanical behavior of coal. Improved understanding of the spatial characteristics of cleat networks is therefore important to the coal mining industry. Discrete fracture networks (DFNs) are increasingly used in engineering analyses to spatially model fractures at various scales. The reliability of coal DFNs largely depends on the confidence in the input cleat statistics. Estimates of these parameters can be made from image-based three-dimensional (3D) characterization of coal cleats using X-ray micro-computed tomography (μCT). One key step in this process, after cleat extraction, is the separation of individual cleats, without which the cleats form a connected network and statistics for different cleat sets cannot be measured. In this paper, a feature extraction-based image processing method is introduced to identify and separate distinct cleat groups from 3D X-ray μCT images. Kernels (filters) representing explicit cleat features of coal are built, and cleat separation is achieved by convolutional operations on 3D coal images. The new method is applied to a coal specimen, 80 mm in diameter and 100 mm in length, acquired from an Anglo American Steelmaking Coal mine in the Bowen Basin, Queensland, Australia. It is demonstrated that the new method produces reliable cleat separation capable of defining individual cleats and preserving 3D topology after separation. Bedding-parallel fractures, which have historically been challenging to delineate and are rarely reported, are also identified and separated. A variety of cleat/fracture statistics are measured, which can both quantitatively characterize the cleat/fracture system and serve as input for DFN modeling. Finally, variability and heterogeneity with respect to the core axis are investigated. Significant heterogeneity is observed, which suggests that the representative elementary volume (REV) of the cleat groups for engineering purposes may be a complex problem requiring careful consideration.
Abstract: Medical Internet of Things (IoT) devices are becoming increasingly common in healthcare. This has created a strong need for advanced predictive health modeling strategies that can make good use of the growing amount of multimodal data to find potential health risks early and help individuals in a personalized way. Existing methods, while useful, have limitations in predictive accuracy, delay, personalization, and user interpretability, so a more comprehensive and efficient approach is required to harness modern medical IoT devices. The Multimodal Approach Integrating Pre-emptive Analysis, Personalized Feature Selection, and Explainable AI (MAIPFE) is a framework for real-time health monitoring and disease detection. By using AI for early disease detection, personalized health recommendations, and transparency, healthcare will be transformed. The MAIPFE framework, which combines the Firefly Optimizer, a Recurrent Neural Network (RNN), Fuzzy C Means (FCM), and Explainable AI, improves disease detection precision over existing methods. Comprehensive metrics show the model’s superiority in real-time health analysis. In disease detection classification, the proposed framework outperformed existing models by 8.3% in precision, 8.5% in accuracy, 5.5% in recall, 2.9% in specificity, 4.5% in AUC (Area Under the Curve), and 4.9% in delay reduction. In disease prediction, precision increased by 4.5%, accuracy by 3.9%, recall by 2.5%, specificity by 3.5%, and AUC by 1.9%, and delay levels decreased by 9.4%. MAIPFE can revolutionize healthcare with pre-emptive analysis, personalized health insights, and actionable recommendations. The research shows that this approach improves patient outcomes and healthcare efficiency in the real world.
Abstract: This work highlights the unparalleled efficiency of the “nth-Order Function/Feature Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (nth-FASAM-N) by considering the well-known Nordheim-Fuchs reactor dynamics/safety model. This model describes a short-time self-limiting power excursion in a nuclear reactor system having a negative temperature coefficient in which a large amount of reactivity is suddenly inserted, either intentionally or by accident. This nonlinear paradigm model is sufficiently complex to model realistically self-limiting power excursions for short times yet admits closed-form exact expressions for the time-dependent neutron flux, temperature distribution and energy released during the transient power burst. The nth-FASAM-N methodology is compared to the extant “nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (nth-CASAM-N), showing that: (i) the 1st-FASAM-N and the 1st-CASAM-N methodologies are equally efficient for computing the first-order sensitivities; each methodology requires a single large-scale computation for solving the “First-Level Adjoint Sensitivity System” (1st-LASS); (ii) the 2nd-FASAM-N methodology is considerably more efficient than the 2nd-CASAM-N methodology for computing the second-order sensitivities since the number of feature-functions is much smaller than the number of primary parameters; specifically for the Nordheim-Fuchs model, the 2nd-FASAM-N methodology requires 2 large-scale computations to obtain all of the exact expressions of the 28 distinct second-order response sensitivities with respect to the model parameters, while the 2nd-CASAM-N methodology requires 7 large-scale computations for obtaining these 28 second-order sensitivities; (iii) the 3rd-FASAM-N methodology is even more efficient than the 3rd-CASAM-N methodology: only 2 large-scale computations are needed to obtain the exact expressions of the 84 distinct third-order response sensitivities with respect to the Nordheim-Fuchs model’s parameters when applying the 3rd-FASAM-N methodology, while the application of the 3rd-CASAM-N methodology requires at least 22 large-scale computations for computing the same 84 distinct third-order sensitivities. Together, the nth-FASAM-N and the nth-CASAM-N methodologies are the most practical methodologies for computing response sensitivities of any order comprehensively and accurately, overcoming the curse of dimensionality in sensitivity analysis.
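The sensitivity counts quoted in this abstract can be cross-checked with a short calculation. The number of distinct k-th order partial derivatives with respect to n parameters (counting symmetric mixed partials once) is C(n + k - 1, k). The sketch below assumes n = 7 primary parameters; that value is an inference from the quoted counts of 28 and 84 and from the 7 computations cited for the 2nd-CASAM-N, and it is not stated explicitly in the abstract. The function name is illustrative only.

```python
from math import comb


def distinct_sensitivities(num_params: int, order: int) -> int:
    """Count distinct partial derivatives of a given order.

    Mixed partials that differ only in the order of differentiation
    are counted once (multiset coefficient).
    """
    return comb(num_params + order - 1, order)


# Assumption: 7 primary parameters, inferred from the quoted counts.
N_PARAMS = 7

for order in (1, 2, 3):
    print(order, distinct_sensitivities(N_PARAMS, order))
```

With n = 7 the second- and third-order counts come out to 28 and 84, matching the figures in the abstract.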
Abstract: This work presents the “nth-Order Feature Adjoint Sensitivity Analysis Methodology for Nonlinear Systems” (abbreviated as “nth-FASAM-N”), which will be shown to be the most efficient methodology for computing exact expressions of sensitivities, of any order, of model responses with respect to features of model parameters and, subsequently, with respect to the model’s uncertain parameters, boundaries, and internal interfaces. The unparalleled efficiency and accuracy of the nth-FASAM-N methodology stems from the maximal reduction of the number of adjoint computations (which are considered to be “large-scale” computations) for computing high-order sensitivities. When applying the nth-FASAM-N methodology to compute the second- and higher-order sensitivities, the number of large-scale computations is proportional to the number of “model features” as opposed to being proportional to the number of model parameters (which are considerably more than the number of features). When a model has no “feature” functions of parameters, but only comprises primary parameters, the nth-FASAM-N methodology becomes identical to the extant nth-CASAM-N (“nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Nonlinear Systems”) methodology. Both the nth-FASAM-N and the nth-CASAM-N methodologies are formulated in linearly increasing higher-dimensional Hilbert spaces, as opposed to exponentially increasing parameter-dimensional spaces, thus overcoming the curse of dimensionality in sensitivity analysis of nonlinear systems. Both the nth-FASAM-N and the nth-CASAM-N are incomparably more efficient and more accurate than any other methods (statistical, finite differences, etc.) for computing exact expressions of response sensitivities of any order with respect to the model’s features and/or primary uncertain parameters, boundaries, and internal interfaces.
Funding: The National Key Research and Development Program of China: Design and Key Technology Research of Non-metallic Flexible Risers for Deep Sea Mining (2022YFC2803701); the General Program of the National Natural Science Foundation of China (52071336, 52374022).
Abstract: Since leaks in high-pressure pipelines transporting crude oil can cause severe economic losses, a reliable leak risk assessment can assist in developing an effective pipeline maintenance plan and avoiding unexpected incidents. Fast and accurate leak detection methods are essential for maintaining pipeline safety in pipeline reliability engineering. Feature extraction from current oil pipeline leakage signals is insufficient, while the training time for traditional leakage prediction models is too long. A new leak detection method is proposed based on time-frequency features and the Genetic Algorithm-Levenberg Marquardt (GA-LM) classification model for predicting the leakage status of oil pipelines. The processed signal is transformed to the time and frequency domains, allowing full expression of the original signal. The traditional Back Propagation (BP) neural network is optimized by the Genetic Algorithm (GA) and Levenberg Marquardt (LM) algorithms. The results show that the recognition effect of a combined feature parameter is superior to that of a single feature parameter. The accuracy, precision, recall, and F1 score of the GA-LM model are 95%, 93.5%, 96.7%, and 95.1%, respectively, which shows that the GA-LM model has a good predictive effect and excellent stability for positive and negative samples. The proposed GA-LM model markedly reduces training time and improves recognition efficiency. In addition, considering that a large number of samples are required for model training, a wavelet threshold method is proposed to generate sample data with higher reliability. The research results can provide an effective theoretical and technical reference for the leakage risk assessment of actual oil pipelines.
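As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall. The minimal sketch below uses only the precision and recall values quoted in the abstract; the function name is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the GA-LM model.
f1 = f1_score(0.935, 0.967)
print(f"{f1:.3f}")
```

The result rounds to 0.951, which agrees with the 95.1% F1 score stated in the abstract.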
Funding: This work was supported by the National Natural Science Foundation of China under Grant No. 61304205 and No. 61502240; the Natural Science Foundation of Jiangsu Province under Grant No. BK20191401 and No. BK20201136; and the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant No. SJCX21_0364 and No. SJCX21_0363.
Abstract: ORB-SLAM2, which is based on a constant velocity model, has difficulty determining the search window for the reprojection of map points when objects are in variable velocity motion. This leads to false matches and, in turn, to inaccurate pose estimation or failed tracking. To address this challenge, a new feature point matching method is proposed in this paper, which combines a variable velocity model with the reverse optical flow method. First, the constant velocity model is extended to a new variable velocity model, and the extended model is used to provide the initial pixel shift for the reverse optical flow method. Then the search range of feature points is accurately determined according to the results of the reverse optical flow method, thereby improving the accuracy and reliability of feature matching and strengthening interframe tracking. Finally, we tested the method on the TUM data set using an RGB-D camera. Experimental results show that this method can reduce the probability of tracking failure and improve localization accuracy in SLAM (Simultaneous Localization and Mapping) systems. Compared with the traditional ORB-SLAM2, the test error of this method on each sequence in the TUM data set is significantly reduced, and the root mean square error is only 63.8% of that of the original system under the optimal condition.
Abstract: In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The publicly available 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), used as the data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among the available models in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. Its higher score of HMM(m, r) = 0.47 indicates an overall better model that meets almost all of the user’s requirements, unlike the classical metrics, which use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and can hamper the interpretability of the model’s parameters. The efficiency of the predictive model could be improved with other feature selection algorithms (especially hybrid metrics) that take the input of domain experts into account, as the understanding of the business domain has a significant impact.
Funding: Supported by grants from the Shanghai Rising-Star Program (19QA1408700), the National Natural Science Foundation of China (81972575 and 81521091), and the Clinical Research Plan of SHDC (SHDC2020CR5007).
Abstract: Background: Early singular nodular hepatocellular carcinoma (HCC) is an ideal surgical indication in clinical practice. However, almost half of the patients have tumor recurrence, and there is no reliable prognostic prediction tool. In addition, it is unclear whether preoperative neoadjuvant therapy is necessary for patients with early singular nodular HCC and which patients need it. It is critical to identify the patients at high risk of recurrence and to treat them preoperatively with neoadjuvant therapy, thereby improving their outcomes. The present study aimed to develop two prognostic models to preoperatively predict recurrence-free survival (RFS) and overall survival (OS) in patients with singular nodular HCC by integrating clinical data and radiological features. Methods: We retrospectively recruited 211 patients with singular nodular HCC from December 2009 to January 2019 at Eastern Hepatobiliary Surgery Hospital (EHBH). They all met the surgical indications and underwent radical resection. We randomly divided the patients into a training cohort (n=132) and a validation cohort (n=79). We established and validated multivariate Cox proportional hazard models using the preoperative clinicopathologic factors and radiological features associated with RFS and OS. By analyzing the receiver operating characteristic (ROC) curve, the discrimination accuracy of the models was compared with that of traditional predictive models. Results: Our RFS model was based on HBV-DNA score, cirrhosis, tumor diameter, and tumor capsule in imaging. The RFS nomogram had fine calibration and discrimination capabilities, with a C-index of 0.74 (95% CI: 0.68-0.80). The OS nomogram, based on cirrhosis, tumor diameter, and tumor capsule in imaging, had fine calibration and discrimination capabilities, with a C-index of 0.81 (95% CI: 0.74-0.87). The area under the receiver operating characteristic curve (AUC) of our model was larger than that of the traditional liver cancer staging system, the Korea model, and Nomograms in Hepatectomy Patients with Hepatitis B Virus-Related Hepatocellular Carcinoma, indicating better discrimination capability. According to the models, we fitted the linear prediction equations. These results were validated in the validation cohort. Conclusions: Compared with the previous radiography model, the newly developed predictive model was concise and applicable for predicting the postoperative survival of patients with singular nodular HCC. Our models may preoperatively identify patients at high risk of recurrence. These patients may benefit from neoadjuvant therapy, which may improve their outcomes.
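The C-index reported for the nomograms measures how often a model ranks patients correctly by risk. The following is a minimal pure-Python sketch of Harrell's concordance index under right censoring; it is not the study's code, and the function name and toy data are hypothetical. A pair is treated as comparable when the patient with the shorter follow-up time had an observed event; pairs with tied times are ignored for simplicity.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. recurrence) was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier patient in a pair must have an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.4, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.74 (RFS) and 0.81 (OS).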
Funding: Supported by the Fujian Provincial Science and Technology Major Project (No. 2020HZ02014), by grants from the National Natural Science Foundation of Fujian (2021J01133, 2021J011404), and by the Quanzhou Scientific and Technological Planning Projects (Nos. 2018C113R, 2019C028R, 2019C029R, 2019C076R and 2019C099R).
Abstract: Congenital heart defects, which account for about 30% of congenital defects, are the most common type. Data show that congenital heart defects have seriously affected the birth rate of healthy newborns. In fetal and neonatal cardiology, medical imaging technology (2D ultrasound, MRI) has proved helpful in detecting congenital defects of the fetal heart and assists sonographers in prenatal diagnosis. Recognizing the 2D fetal heart ultrasonic standard plane (FHUSP) manually is a highly complex task. Compared with manual identification, automatic identification through artificial intelligence can save a lot of time, ensure the efficiency of diagnosis, and improve its accuracy. In this study, a feature extraction method based on texture features (Local Binary Pattern, LBP, and Histogram of Oriented Gradient, HOG) combined with a Bag of Words (BOW) model is applied, and then feature fusion is performed. Finally, a Support Vector Machine (SVM) is adopted to realize automatic recognition and classification of FHUSP. The data include 788 standard plane data sets and 448 normal and abnormal plane data sets. Compared with some other methods and with single-method models, the classification accuracy of our model is clearly improved, with the highest accuracy reaching 87.35%. We also verify the performance of the model on normal and abnormal planes, where the average classification accuracy is 84.92%. The experimental results show that this method can effectively classify and predict different FHUSP and can provide some assistance for sonographers in diagnosing fetal congenital heart disease.
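To illustrate one of the texture features named above, here is a minimal pure-Python sketch of the basic 3x3 Local Binary Pattern. It is a generic textbook variant, not the paper's implementation; the neighbour ordering, the `>=` threshold convention, and the function names are assumptions made for this example.

```python
def lbp_code(patch):
    """8-bit LBP code for the centre pixel of a 3x3 patch.

    Each neighbour contributes one bit: 1 if it is >= the centre value.
    Neighbours are visited clockwise starting at the top-left corner
    (an assumed convention; implementations differ).
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a 2D list."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Hypothetical 3x3 patch: only the top row is at least as bright as the centre.
print(lbp_code([[9, 9, 9], [0, 5, 0], [0, 0, 0]]))
```

A histogram such as the one returned by `lbp_histogram` is the kind of texture descriptor that could then be fused with other features and passed to a classifier such as an SVM.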
Abstract: It is common for datasets to contain both categorical and continuous variables. However, many feature screening methods designed for high-dimensional classification assume that the variables are continuous, which limits their applicability in this mixed setting. To address this issue, we propose a model-free feature screening approach for ultra-high-dimensional multi-classification that can handle both categorical and continuous variables. Our proposed feature screening method utilizes the Maximal Information Coefficient to assess the predictive power of the variables. Under certain regularity conditions, we prove that our screening procedure possesses the sure screening and ranking consistency properties. To validate the effectiveness of our approach, we conduct simulation studies and provide real data analysis examples to demonstrate its performance in finite samples. In summary, our proposed method offers a solution for effectively screening features in ultra-high-dimensional datasets with a mixture of categorical and continuous covariates.
Abstract: In ultra-high-dimensional data, it is common for the response variable to be multi-class. This paper therefore proposes a model-free screening method for variables with a multi-class response, which uses the Jensen-Shannon divergence to measure the importance of covariates. The idea of the method is to calculate the Jensen-Shannon divergence between the conditional probability distribution of a covariate given each value of the response variable and the unconditional probability distribution of the covariate, and then to use the probabilities of the response classes as weights to calculate the weighted Jensen-Shannon divergence, where a larger weighted Jensen-Shannon divergence means that the covariate is more important. Additionally, we investigated an adapted version of the method for the case where the number of categories varies across covariates, which measures the relationship between the covariates and the response variable using the weighted Jensen-Shannon divergence adjusted by the logarithmic factor of the number of categories. Then, through both theoretical analysis and simulation experiments, it was demonstrated that the proposed methods have the sure screening and ranking consistency properties. Finally, the results from simulation and real-dataset experiments show that, in feature screening, the proposed methods are robust in performance and computationally faster than an existing method.
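The screening statistic described above can be sketched for a single categorical covariate as follows. This is a plug-in illustration using empirical frequencies, written for this summary rather than taken from the paper; the function names and toy data are hypothetical, base-2 logarithms are an arbitrary choice, and the adjusted version with the logarithmic factor is not shown.

```python
from collections import Counter
from math import log2


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def weighted_js(x, y):
    """Weighted JS divergence of covariate x with respect to class labels y.

    For each class r, compare the empirical distribution of x within the
    class against the marginal distribution of x, weighting by P(y = r).
    """
    n = len(x)
    levels = sorted(set(x))
    marginal = [x.count(level) / n for level in levels]
    score = 0.0
    for cls, n_cls in Counter(y).items():
        x_cls = [xi for xi, yi in zip(x, y) if yi == cls]
        conditional = [x_cls.count(level) / n_cls for level in levels]
        score += (n_cls / n) * js_divergence(conditional, marginal)
    return score


# Hypothetical toy data: one informative covariate and one pure-noise covariate.
y = [0, 0, 1, 1]
x_informative = [0, 0, 1, 1]
x_noise = [0, 1, 0, 1]
print(weighted_js(x_informative, y), weighted_js(x_noise, y))
```

Covariates would then be ranked by this score and the top-scoring ones retained; the noise covariate scores zero here because its conditional and marginal distributions coincide.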
Abstract: This paper proposes an approach to developing a feature-based parametric product modeling system that is suitable for integrated engineering design in a CIMS environment. The architecture of ZD-MCADII and the characteristics of each of its modules are introduced in detail. ZD-MCADII’s product data is managed by OSCAR, an object-oriented database management system, and the product model is built according to the STEP standard. The product design is established on a unified product model, and all the product data are globally associated in ZD-MCADII. ZD-MCADII provides various design features to facilitate product design, and supports the integration of CAD, CAPP and CAM.
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFC1402003), the National Natural Science Foundation of China (No. 41671436), and the Innovation Project of LREIS (No. O88RAA01YA).
Abstract: Sanduao is an important sea-breeding bay in Fujian, South China, and holds a high economic status in aquaculture. Quickly and accurately obtaining information such as the distribution area, quantity, and aquaculture area is important for breeding area planning, production value estimation, ecological surveys, and storm surge prevention. However, as the aquaculture area expands, the seawater background becomes increasingly complex and spectral characteristics differ dramatically, making it difficult to determine the aquaculture area. In this study, we applied a deep-learning Richer Convolutional Features (RCF) network model to a high-resolution GF-2 remote-sensing satellite image to extract the aquaculture area. We then used the density of aquaculture as an assessment index to assess the vulnerability of aquaculture areas in Sanduao. The results demonstrate that this method does not require land and water to be separated in advance, and that good extraction can be achieved in areas with more sediment and waves, with an extraction accuracy >93%, which makes it suitable for large-scale aquaculture area extraction. Vulnerability assessment results indicate that the density of aquaculture in the eastern part of Sanduao is considerably high, reaching a higher vulnerability level than other parts.
Abstract: The present paper deals with the problem of assessing local influence in a growth curve model with Rao’s simple covariance structure. Based on the likelihood displacement, the curvature measure is employed to evaluate the effects of some minor perturbations on the statistical inference, thus leading to the large curvature direction, which is the most critical diagnostic statistic in the context of local influence analysis. As an application, the common covariance-weighted perturbation scheme is thoroughly considered.
Abstract: A new digital modulation recognition method is proposed in this paper, based on features extracted from the generalized autoregressive (GAR) model parameters of the received waveform and on a multilayer perceptron (MLP) neural network classifier. Because of the better noise suppression ability of the GAR model and the powerful pattern classification capacity of the MLP neural network classifier, the new method can significantly improve recognition performance at lower SNR with better robustness. To assess the performance of the new method, computer simulations are also performed.
Abstract: A Software Product Line (SPL) is a group of software-intensive systems that share common and variable resources for developing a particular system. The feature model is a tree-type structure used to manage an SPL’s common and variable features, their different relations, and the problem of Crosstree Constraints (CTC). CTC problems exist in groups of common and variable features among the sub-trees of feature models, and they are more diverse in Internet of Things (IoT) devices because different Internet devices and protocols communicate with one another. Therefore, managing the CTC problem to achieve valid product configuration in IoT-based SPL is more complex, time-consuming, and hard. However, previously proposed approaches such as Commonality Variability Modeling of Features (COVAMOF) and the Genarch+ tool do not adequately address the CTC problem; therefore, invalid products are generated. This research proposes a novel approach, Binary Oriented Feature Selection Crosstree Constraints (BOFS-CTC), to find all possible valid products by selecting the features according to cardinality constraints and cross-tree constraint problems in the feature model of an SPL. BOFS-CTC removes invalid products at the early stage of feature selection for the product configuration. Furthermore, this research developed the BOFS-CTC algorithm and applied it to IoT-based feature models. The findings are that no relationship constraint or CTC violations occur, and that valid feature product configurations for application development are derived by removing the invalid product configurations. The accuracy of BOFS-CTC is measured by the integration sampling technique, in which different valid product configurations are compared with the product configurations derived by BOFS-CTC and found to be 100% correct. Using BOFS-CTC eliminates the testing cost and development effort of invalid SPL products.
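The general idea of encoding feature selections as binary values and discarding configurations that violate cardinality or cross-tree constraints can be sketched with a brute-force enumeration. This is not the BOFS-CTC algorithm, whose details are not given in the abstract; the feature names and all constraints below are hypothetical and chosen only to illustrate the kinds of rules involved.

```python
from itertools import product

# Hypothetical IoT feature model (illustrative only).
FEATURES = ["gateway", "wifi", "zigbee", "mqtt", "coap"]


def is_valid(cfg):
    """Check one binary configuration against the assumed constraints."""
    # Mandatory root feature.
    if not cfg["gateway"]:
        return False
    # Group cardinality [1..2]: at least one connectivity option.
    if not 1 <= cfg["wifi"] + cfg["zigbee"] <= 2:
        return False
    # Cross-tree constraint: mqtt requires wifi.
    if cfg["mqtt"] and not cfg["wifi"]:
        return False
    # Cross-tree constraint: coap excludes mqtt.
    if cfg["coap"] and cfg["mqtt"]:
        return False
    return True


def valid_products():
    """Enumerate every 0/1 assignment and keep only the valid ones."""
    products = []
    for bits in product((0, 1), repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, bits))
        if is_valid(cfg):
            products.append(cfg)
    return products


for cfg in valid_products():
    print([name for name, selected in cfg.items() if selected])
```

Exhaustive enumeration grows as 2^n in the number of features, so it is only practical for small models; pruning invalid partial selections early, as the abstract describes for BOFS-CTC, is the usual motivation for a more targeted algorithm.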
Abstract: This study focuses on meeting the challenges of big data visualization by using data reduction methods based on feature selection, with the aim of reducing the volume of big data and minimizing model training time (Tt) while maintaining data quality. We address these challenges using the embedded “Select from model (SFM)” method with the “Random forest Importance algorithm (RFI)” and compare it with the filter method “Select percentile (SP)” based on the chi-square (“Chi2”) test for selecting the most important features, which are then fed into a classification process using the logistic regression (LR) algorithm and the k-nearest neighbor (KNN) algorithm. The classification accuracy (AC) of LR is also compared with that of the KNN approach in Python on eight data sets to see which method produces the best result when feature selection methods are applied. The study concludes that feature selection methods have a significant impact on the analysis and visualization of the data after removing repetitive data and data that do not affect the goal. After several comparisons, the study suggests SFMLR, which uses SFM based on the RFI algorithm for feature selection together with the LR algorithm for data classification. The proposal proved its efficacy when its results were compared with recent literature.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 40676016).
Abstract: In this paper, the influence of the El Niño-Southern Oscillation (ENSO) cycle on the sensitivity of nonlinear factors in the numerical simulation is investigated by conducting numerical experiments in a simple air-sea coupled model for ENSO prediction. Two sets of experiments are conducted in which zonal nonlinear factors, meridional nonlinear factors, or both are incorporated into the governing equations for the atmosphere or ocean. The results suggest that the ENSO cycle is very sensitive to the nonlinear factor of the governing equation for the atmosphere or ocean. Thus, incorporating nonlinearity into air-sea coupled models is of particular importance for improving ENSO simulation.