Artificial rabbits optimization (ARO) is a recently proposed biology-based optimization algorithm inspired by the detour foraging and random hiding behavior of rabbits in nature. However, for solving optimization problems, the ARO algorithm shows slow convergence speed and can fall into local minima. To overcome these drawbacks, this paper proposes chaotic opposition-based learning ARO (COARO), an improved version of the ARO algorithm that incorporates opposition-based learning (OBL) and chaotic local search (CLS) techniques. Adding OBL to ARO increases the convergence speed of the algorithm and helps it explore the search space better. The chaotic maps in CLS provide rapid convergence by scanning the search space efficiently, owing to their ergodicity and non-repetition properties. The proposed COARO algorithm has been tested on thirty-three distinct benchmark functions, and the outcomes have been compared with those of the most recent optimization algorithms. Additionally, the COARO algorithm's problem-solving capability has been evaluated on six different engineering design problems and compared with various other algorithms. This study also introduces a binary variant of the continuous COARO algorithm, named BCOARO, whose performance was evaluated on the breast cancer dataset and compared with different feature selection algorithms. According to the findings obtained for real applications, the proposed BCOARO outperforms the alternative algorithms in terms of accuracy and fitness value. Extensive experiments show that the COARO and BCOARO algorithms achieve promising results compared to other metaheuristic algorithms.
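As context for the OBL and CLS mechanisms named in this abstract, here is a minimal sketch of the two ideas in isolation (an illustrative assumption, not the authors' COARO implementation; the bounds, test point, and shifted-sphere objective below are made up):

```python
def opposite(x, lb, ub):
    """Opposition-based learning: reflect a candidate across the
    midpoint of the search interval in every dimension."""
    return [lb[i] + ub[i] - x[i] for i in range(len(x))]

def chaotic_local_search(x, lb, ub, fitness, steps=20, c=0.7):
    """Chaotic local search: follow a logistic-map trajectory (ergodic,
    non-repetitive) blended toward the current best; keep improvements."""
    best, best_f = list(x), fitness(x)
    for _ in range(steps):
        c = 4.0 * c * (1.0 - c)  # logistic map with r = 4 (chaotic regime)
        cand = [0.5 * (best[i] + lb[i] + c * (ub[i] - lb[i]))
                for i in range(len(x))]
        f = fitness(cand)
        if f < best_f:
            best, best_f = cand, f
    return best, best_f

# toy usage: minimize a shifted sphere f(v) = sum((v_i - 1)^2) on [-5, 5]^2
fitness = lambda v: sum((t - 1.0) ** 2 for t in v)
lb, ub = [-5.0, -5.0], [5.0, 5.0]
x = [3.0, -4.0]
x_opp = opposite(x, lb, ub)              # reflected point: [-3.0, 4.0]
better = min([x, x_opp], key=fitness)    # keep the fitter of the pair
refined, refined_f = chaotic_local_search(better, lb, ub, fitness)
```

Evaluating both a candidate and its opposite roughly doubles the chance of starting near the optimum, which is the intuition behind OBL's convergence-speed benefit.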
The flying foxes optimization (FFO) algorithm, a newly introduced metaheuristic algorithm, is inspired by the survival tactics of flying foxes in heat-wave environments. FFO preferentially selects the best-performing individuals, a tendency that keeps newly generated solutions closely tied to the current candidate optimum in the search area. To address this issue, this paper introduces an opposition-based learning search mechanism for the FFO algorithm (IFFO). Firstly, niching techniques are introduced to improve the survival-list method, which not only focuses on the adaptability of individuals but also considers the population's crowding degree, enhancing the global search capability. Secondly, an opposition-based learning initialization strategy is used to perturb the initial population and elevate its quality. Finally, to verify the superiority of the improved search mechanism, IFFO, FFO, and cutting-edge metaheuristic algorithms are compared and analyzed on a set of test functions. The results show that, compared with the other algorithms, IFFO is characterized by rapid convergence, precise results, and robust stability.
The travel time of rock compressional waves is an essential parameter used for estimating important rock properties, such as porosity, permeability, and lithology. Current methods, like wireline logging tests, provide broad measurements but lack finer resolution. Laboratory-based rock core measurements offer higher resolution but are resource-intensive. Conventionally, wireline logging and rock core measurements have been used independently. This study introduces a novel approach that integrates both data sources. The method leverages the detailed features from limited core data to enhance the resolution of wireline logging data. By combining machine learning with random field theory, the method allows for probabilistic predictions in regions with sparse data sampling. In this framework, 12 parameters from wireline tests are used to predict trends in the rock core data, and the residuals are modeled using random field theory. The outcomes are high-resolution predictions that combine both the predicted trend and probabilistic realizations of the residual. By utilizing unconditional and conditional random field theories, the method enables unconditional and conditional simulations of the underlying high-resolution rock compressional wave travel time profile and provides uncertainty estimates. This integrated approach optimizes the use of existing core and logging data. Its applicability is confirmed in an oil project in West China.
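The trend-plus-residual idea in this abstract can be illustrated with a toy unconditional simulation (a hedged sketch under assumed parameters, not the study's method or data): draw a correlated Gaussian residual by factoring an exponential covariance matrix and add it to a predicted trend.

```python
import math
import random

def exp_cov(n, sigma2=1.0, corr_len=2.0):
    """Covariance matrix C[i][j] = sigma^2 * exp(-|i-j| / corr_len)
    for n equally spaced depths (exponential covariance model)."""
    return [[sigma2 * math.exp(-abs(i - j) / corr_len) for j in range(n)]
            for i in range(n)]

def cholesky(C):
    """Lower-triangular L with L L^T = C (textbook Cholesky factorization)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L

def simulate(trend, L, rng):
    """One unconditional realization: trend + L @ eps, with eps ~ N(0, I)."""
    eps = [rng.gauss(0.0, 1.0) for _ in trend]
    return [trend[i] + sum(L[i][k] * eps[k] for k in range(i + 1))
            for i in range(len(trend))]

# hypothetical ML-predicted travel-time trend at five depths (us/ft)
rng = random.Random(42)
trend = [55.0, 56.0, 58.0, 57.5, 56.5]
L = cholesky(exp_cov(len(trend)))
realization = simulate(trend, L, rng)
```

Repeating `simulate` many times yields an ensemble of high-resolution profiles whose spread serves as the uncertainty estimate; conditional simulation additionally forces each realization through the measured core values.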
As a new bionic algorithm, Spider Monkey Optimization (SMO) has been widely used for various complex optimization problems in recent years. However, SMO's ability to explore new regions of the search space is limited, and the diversity of its population is not abundant. This paper therefore focuses on how to reconstruct SMO to improve its performance, and a novel spider monkey optimization algorithm with opposition-based learning and orthogonal experimental design (SMO^(3)) is developed. A position updating method based on the historical optimal domain and particle swarm for the Local Leader Phase (LLP) and Global Leader Phase (GLP) is presented to improve the diversity of the SMO population. Moreover, an opposition-based learning strategy based on self-extremum is proposed to avoid premature convergence and getting stuck at locally optimal values. Also, a local worst-individual elimination method based on orthogonal experimental design is used to help the SMO algorithm eliminate poor individuals in time. Furthermore, an extended SMO^(3) named CSMO^(3) is investigated to deal with constrained optimization problems. The proposed algorithm is applied to both unconstrained and constrained functions, including the CEC2006 benchmark set and three engineering problems. Experimental results show that the performance of the proposed algorithm is better than that of three well-known SMO variants and other evolutionary algorithms on both unconstrained and constrained problems.
A machine learning (ML)-based random forest (RF) classification model was employed to investigate the main factors affecting the formation of the core-shell structure of BaTiO_(3)-based ceramics, and its interpretability was analyzed using Shapley additive explanations (SHAP). With an optimal set of only 5 features, the RF classifier improved the F1-score from 0.8795 to 0.9310, accuracy from 0.8450 to 0.9070, precision from 0.8714 to 0.9000, and recall from 0.8929 to 0.9643, and achieved a ROC/AUC value of 0.97±0.03, demonstrating the high accuracy and robustness of the model. The interpretability analysis found that the electronegativity, melting point, and sintering temperature of the dopant contribute strongly to the formation of the core-shell structure. Based on these characteristics, specific feature ranges were delineated, and twelve elements meeting all the requirements were finally obtained, namely Si, Sc, Mn, Fe, Co, Ni, Pd, Er, Tm, Lu, Pa, and Cm. In exploring core-shell structures, candidate doping elements can thus be localized efficiently by screening against these feature ranges.
BACKGROUND Liver cancer is one of the most prevalent malignant tumors worldwide, and its early detection and treatment are crucial for enhancing patient survival rates and quality of life. However, the early symptoms of liver cancer are often not obvious, resulting in late-stage diagnoses in many patients, which significantly reduces the effectiveness of treatment. Developing a highly targeted, widely applicable, and practical risk prediction model for liver cancer is therefore crucial for enhancing early diagnosis and long-term survival rates among affected individuals. AIM To develop a liver cancer risk prediction model employing machine learning techniques, and to assess its performance. METHODS In this study, a total of 550 patients were enrolled, with 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients serving as the training cohort, and 83 HCC and 82 cirrhosis patients forming the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort, and model performance was assessed in the validation cohort. Additionally, this study compared the diagnostic efficacy of the ASAP model and the models developed here using receiver operating characteristic curves, calibration curves, and decision curve analysis (DCA) to determine the optimal predictive model for assessing liver cancer risk. RESULTS Six variables, including age, white blood cell count, red blood cell count, platelet count, alpha-fetoprotein, and protein induced by vitamin K absence or antagonist II levels, were used to develop the LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination, with areas under the curve (AUC) of 0.969 and 0.858 for the training and validation sets, respectively. These values significantly surpassed those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration and DCA indicated that the RF model exhibited robust calibration and clinical validity. CONCLUSION The RF model demonstrated excellent prediction capability for HCC and can facilitate early diagnosis of HCC in clinical practice.
As the global demand for renewable energy grows, solar energy is gaining attention as a clean, sustainable energy source. Accurate assessment of solar energy resources is crucial for the siting and design of photovoltaic power plants. This study proposes an integrated deep learning-based photovoltaic resource assessment method, fusing ensemble learning and deep learning for photovoltaic resource assessment for the first time. The proposed method combines random forest, gated recurrent unit, and long short-term memory models to effectively improve the accuracy and reliability of photovoltaic resource assessment, and it retains strong adaptability and high accuracy even over complex terrain and landscapes. The experimental results show that the proposed method outperforms the comparison algorithms on all evaluation indexes, indicating higher accuracy, reliability, and generalization performance than a traditional single algorithm.
Manual investigation of chest radiography (CXR) images by physicians is crucial for effective decision-making in COVID-19 diagnosis. However, the high demand during the pandemic necessitates auxiliary help through image analysis and machine learning techniques. This study presents a multi-threshold-based segmentation technique to probe high pixel-intensity regions in CXR images of various pathologies, including normal cases. Texture information is extracted using gray-level co-occurrence matrix (GLCM)-based features, while vessel-like features are obtained using Frangi, Sato, and Meijering filters. Machine learning models employing Decision Tree (DT) and Random Forest (RF) approaches are designed to categorize CXR images into common lung infections, lung opacity (LO), COVID-19, and viral pneumonia (VP). The results demonstrate that the fusion of texture- and vessel-based features provides an effective ML model for aiding diagnosis. Model validation using performance measures, including an accuracy of approximately 91.8% with an RF-based classifier, supports the usefulness of the feature set and classifier model in categorizing the four pathologies. Furthermore, the study investigates the importance of the devised features in identifying the underlying pathology and incorporates histogram-based analysis. This analysis reveals differing pixel distributions in CXR images belonging to the normal, COVID-19, LO, and VP groups, motivating the incorporation of additional features such as the mean, standard deviation, skewness, and percentiles of the filtered images. Notably, the study achieves a considerable improvement in distinguishing COVID-19 from LO, with a true positive rate of 97%, further substantiating the effectiveness of the implemented methodology.
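As background on the GLCM texture features this abstract relies on, here is a minimal pure-Python sketch of a gray-level co-occurrence matrix and one derived statistic (illustrative only; real pipelines typically use a library such as scikit-image, and the tiny quantized patch below is a made-up stand-in for a CXR region):

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of gray levels (i, j) for pixel pairs
    separated by the offset (dx, dy)."""
    h, w = len(image), len(image[0])
    M = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                M[image[y][x]][image[ny][nx]] += 1
    return M

def contrast(M):
    """GLCM contrast: (i - j)^2 weighted by the normalized counts,
    large when neighboring pixels differ sharply."""
    total = sum(sum(row) for row in M)
    return sum((i - j) ** 2 * M[i][j] / total
               for i in range(len(M)) for j in range(len(M)))

# tiny 3-level quantized patch; dx=1 counts horizontal neighbor pairs
patch = [
    [0, 0, 1],
    [1, 2, 2],
    [2, 2, 0],
]
M = glcm(patch, levels=3)
c = contrast(M)  # -> 1.0 for this patch
```

Statistics such as contrast, energy, and homogeneity computed from `M` form the texture feature vector fed to the DT/RF classifiers.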
The freshness of fruit is considered one of the essential characteristics by which consumers judge quality, flavor, and nutritional value. The primary need for identifying rotten fruits is to ensure that only fresh, high-quality fruits are sold to consumers: rotten fruits can harbor harmful bacteria, molds, and other microorganisms that cause food poisoning and other illnesses. The overall purpose of this study is to classify rotten fruits, which can affect the taste, texture, and appearance of neighboring fresh fruits and thereby reduce their shelf life. The agriculture and food industries are increasingly adopting computer vision technology to detect rotten fruits and forecast their shelf life. Hence, this research work mainly focuses on a Convolutional Neural Network (CNN) deep learning model for the classification of rotten fruits. The proposed methodology involves real-time analysis of a dataset covering various types of fruits, including apples, bananas, oranges, papayas, and guavas. In parallel, machine learning models such as Gaussian Naïve Bayes (GNB) and random forest are used to predict fruit shelf life. The results obtained from the various pre-trained models for rotten fruit detection are analyzed by accuracy score to determine the best model. In comparison to the other pre-trained models, the visual geometry group 16 (VGG16) model obtained the highest accuracy score of 95%. Likewise, the random forest model delivers a better accuracy score of 88% than GNB in forecasting fruit shelf life. By developing an accurate classification model, only fresh and safe fruits reach consumers, reducing the risks associated with contaminated produce. The proposed approach will thereby have a significant impact on the food industry through efficient fruit distribution and will also benefit customers purchasing fresh fruits.
The application of machine learning (ML) algorithms in various fields of hepatology is an issue of interest. However, we must be cautious with the results. In this letter, based on a published ML prediction model for acute kidney injury after liver surgery, we discuss some limitations of ML models and how they may be addressed in the future. Although the future faces significant challenges, it also holds great potential.
Survival rates following radical surgery for gastric neuroendocrine neoplasms (g-NENs) are low, with high recurrence rates. This impacts patient prognosis and complicates postoperative management. Traditional prognostic models, including the Cox proportional hazards (CoxPH) model, have shown limited power for predicting postoperative survival in g-NEN patients. Machine learning methods offer a unique opportunity to analyze complex relationships within datasets, providing tools and methodologies to assess the large volumes of high-dimensional, multimodal data generated by the biological sciences, and they show promise in predicting outcomes across various medical disciplines. In the context of g-NENs, using machine learning to predict survival outcomes holds potential for personalized postoperative management strategies. This editorial reviews a study exploring the advantages and effectiveness of the random survival forest (RSF) model, using the lymph node ratio (LNR), in predicting disease-specific survival (DSS) in postoperative g-NEN patients stratified into low-risk and high-risk groups. The findings demonstrate that the RSF model incorporating LNR outperformed the CoxPH model in predicting DSS and constitutes an important step toward precision medicine.
Machine learning is currently one of the research hotspots in the field of landslide prediction. To clarify and evaluate the differences in characteristics and prediction performance of different machine learning models, Conghua District, the district of Guangzhou most prone to landslide disasters, was selected for landslide susceptibility evaluation. Evaluation factors were selected using correlation analysis and the variance inflation factor method. Landslide models were constructed by applying four machine learning methods, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machines (SVM), and Extreme Gradient Boosting (XGB). Comparative analysis and evaluation of the models were conducted using statistical indices and receiver operating characteristic (ROC) curves. The results showed that the LR, RF, SVM, and XGB models all have good predictive performance for landslide susceptibility, with area under the curve (AUC) values of 0.752, 0.965, 0.996, and 0.998, respectively. The XGB model had the highest predictive ability, followed by the RF, SVM, and LR models. The frequency ratio (FR) accuracy of the LR, RF, SVM, and XGB models was 0.775, 0.842, 0.759, and 0.822, respectively. The RF and XGB models were superior to the LR and SVM models, indicating that ensemble algorithms have better predictive ability than single classification algorithms in regional landslide classification problems.
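The AUC values reported by studies like this one can be computed directly from predicted scores and binary labels; a minimal sketch using the rank-sum (Mann-Whitney) formulation, with made-up scores rather than the study's data:

```python
def roc_auc(labels, scores):
    """AUC = probability that a randomly chosen positive outranks a
    randomly chosen negative (ties count 1/2), i.e. the normalized
    Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# toy susceptibility scores: 1 = landslide cell, 0 = stable cell
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)  # 8 of 9 pairs ranked correctly -> 0.888...
```

An AUC of 0.5 means the model ranks no better than chance, while values near the study's 0.998 mean almost every landslide cell is scored above every stable cell.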
The power transformer is one of the most crucial devices in a power grid, and it is important to identify incipient faults of power transformers quickly and accurately. Input features play a critical role in fault diagnosis accuracy. To further improve the fault diagnosis performance for power transformers, this study presents a random forest feature selection method coupled with an optimized kernel extreme learning machine. Firstly, the random forest feature selection approach is adopted to rank 42 related input features derived from gas concentrations, gas ratios, and energy-weighted dissolved gas analysis. Afterwards, a kernel extreme learning machine tuned by the Aquila optimization algorithm is implemented to adjust crucial parameters and select the optimal feature subsets, with diagnosis accuracy used to assess the fault diagnosis capability of each candidate feature subset. Finally, the optimal feature subsets are applied to establish the fault diagnosis model. Experimental results on two public datasets, compared against 5 conventional approaches, show that the average accuracy of the proposed method reaches 94.5%, superior to that of the conventional approaches. These results verify that the optimal feature subset obtained by the presented method can dramatically improve power transformer fault diagnosis accuracy.
Early stroke prediction is vital to prevent damage. A stroke happens when blood flow to the brain is disrupted by a clot or bleeding, resulting in brain death or injury; however, early diagnosis and treatment reduce long-term needs and lower health costs. We aim for this research to provide a machine-learning method for forecasting early warning signs of stroke. Our methodology employed feature selection techniques and multiple algorithms. Using the XGBoost algorithm, the research findings indicate that the proposed model achieved an accuracy rate of 96.45%. This research shows that machine learning can effectively predict early warning signs of stroke, which can help reduce long-term treatment and rehabilitation needs and lower health costs.
The performance of state-of-the-art deep reinforcement learning algorithms such as Proximal Policy Optimization, Twin Delayed Deep Deterministic Policy Gradient, and Soft Actor-Critic for generating a quadruped walking gait in a virtual environment was presented in previous research work titled "A Comparison of PPO, TD3, and SAC Reinforcement Algorithms for Quadruped Walking Gait Generation". We demonstrated that the Soft Actor-Critic reinforcement algorithm had the best performance generating the walking gait for a quadruped in certain sensor configurations in the virtual environment. In this work, we present a performance analysis of the same deep reinforcement learning algorithms for quadruped walking gait generation in a physical environment. Performance in the physical environment is determined via transfer learning augmented by real-time reinforcement learning for gait generation on a physical quadruped. The performance is analyzed on a quadruped equipped with a range of sensors: position tracking using a stereo camera, contact sensing of each robot leg through force-resistive sensors, and proprioceptive information of the robot body and legs from nine inertial measurement units. The performance comparison is presented using metrics associated with the walking gait: average forward velocity (m/s), average forward velocity variance, average lateral velocity (m/s), average lateral velocity variance, and quaternion root mean square deviation. The strengths and weaknesses of each algorithm for the given task on the physical quadruped are discussed.
Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I tackle class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector Machine, Random Forest, XGBoost, and Light Gradient-Boosting Machine, to classify transactions as fraud or genuine. Rigorous evaluation metrics, such as AUC, PRAUC, F1, KS, Recall, and Precision, identify the Random Forest as the best performer in detecting fraudulent activities. The Random Forest model successfully identifies approximately 92% of transactions scoring 90 and above as fraudulent, equating to a detection rate of over 70% for all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability, with the SHAP summary plot highlighting the global importance of individual features, such as "V12" and "V14". SHAP force plots offer local interpretability, revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigate financial losses and protect consumers.
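To illustrate the SMOTE step this abstract mentions, here is a bare-bones sketch of the core interpolation idea (not the imbalanced-learn implementation the study likely used; the two-feature minority samples and k=2 neighborhood are illustrative assumptions):

```python
import random

def smote(minority, n_new, k=2, rng=None):
    """SMOTE core idea: pick a random minority sample, pick one of its
    k nearest minority neighbors, and synthesize a new point somewhere
    on the segment between them."""
    rng = rng or random.Random(0)

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbors = sorted((m for m in minority if m is not x),
                           key=lambda m: dist2(x, m))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic

# toy fraud-like minority class in a 2-D feature space
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=3, rng=random.Random(42))
```

Because synthetic points are interpolated rather than duplicated, the oversampled class occupies a region of feature space instead of a few repeated points, which tends to reduce overfitting relative to naive oversampling.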
Customer churn poses a significant challenge for the banking and finance industry in the United States, directly affecting profitability and market share. This study conducts a comprehensive comparative analysis of machine learning models for customer churn prediction, focusing on the U.S. context. The research evaluates the performance of logistic regression, random forest, and neural networks using industry-specific datasets, considering the economic impact and practical implications of the findings. The exploratory data analysis reveals unique patterns and trends in the U.S. banking and finance industry, such as the age distribution of customers and the prevalence of dormant accounts. The study incorporates macroeconomic factors to capture the potential influence of external conditions on customer churn behavior. The findings highlight the importance of leveraging advanced machine learning techniques and comprehensive customer data to develop effective churn prevention strategies in the U.S. context. By accurately predicting customer churn, financial institutions can proactively identify at-risk customers, implement targeted retention strategies, and optimize resource allocation. The study discusses the limitations and potential future improvements, serving as a roadmap for researchers and practitioners to further advance the field of customer churn prediction in the evolving landscape of the U.S. banking and finance industry.
The increasing amount and intricacy of network traffic in the modern digital era have worsened the difficulty of identifying abnormal behaviours that may indicate potential security breaches or operational interruptions. Conventional detection approaches struggle to keep up with the ever-changing strategies of cyber-attacks, resulting in heightened susceptibility and significant harm to network infrastructures. To tackle this urgent issue, this project focused on developing an effective anomaly detection system that utilizes machine learning technology. The proposed model uses contemporary machine learning algorithms and frameworks to autonomously detect deviations from typical network behaviour, promptly identifying anomalous activities that may indicate security breaches or performance difficulties. The solution entails a multi-faceted approach encompassing data collection, preprocessing, feature engineering, model training, and evaluation. The model is trained on a wide range of datasets that include both regular and abnormal network traffic patterns, ensuring that it can adapt to numerous scenarios. The main priority is to ensure that the system is functional and efficient, with a particular emphasis on reducing false positives to avoid unwanted alerts. Additionally, efforts are directed toward improving anomaly detection accuracy so that the model can consistently distinguish between potentially harmful and benign activity. This project aims to greatly strengthen network security against emerging cyber threats and to improve the resilience and reliability of network infrastructures.
Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation with standard sequential methods. This work analyzes the performance gains from parallel versus sequential hyperparameter optimization. Using scikit-learn's RandomizedSearchCV, this project tuned a Random Forest classifier for fake news detection via randomized grid search. Setting n_jobs to -1 enabled full parallelization across CPU cores. Results show the parallel implementation achieved over 5× faster CPU times and 3× faster total run times compared to sequential tuning. However, test accuracy slightly dropped from 99.26% sequentially to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the significant computational gains allow more extensive hyperparameter exploration within reasonable timeframes, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
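The parallel random-search pattern this abstract describes can be sketched with the standard library alone (an assumption-laden toy, not the project's scikit-learn pipeline: the hyperparameter space and scoring function below are made up, and a thread pool stands in for the process-based workers that n_jobs=-1 would launch):

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sample_params(rng):
    """Draw one random hyperparameter configuration (illustrative space)."""
    return {
        "n_estimators": rng.choice([50, 100, 200, 400]),
        "max_depth": rng.choice([4, 8, 16, None]),
        "min_samples_leaf": rng.choice([1, 2, 4]),
    }

def evaluate(params):
    """Stand-in for cross-validated scoring of one configuration.
    A real run would fit and score a RandomForestClassifier here;
    this toy score just rewards more trees and moderate depth."""
    depth = params["max_depth"] or 32
    return params["n_estimators"] / 400 - abs(depth - 8) / 32

def random_search(n_iter=20, workers=4, seed=0):
    rng = random.Random(seed)
    candidates = [sample_params(rng) for _ in range(n_iter)]
    # score all candidates concurrently, mirroring n_jobs=-1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(evaluate, candidates))
    best_score, best_idx = max(zip(scores, range(n_iter)))
    return candidates[best_idx], best_score

best_params, best_score = random_search()
```

Because each configuration is scored independently, the search is embarrassingly parallel; for CPU-bound model fitting, real speedups require process-based workers (which scikit-learn's n_jobs=-1 provides), since Python threads share the GIL.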
Funding: This work was funded by the Firat University Scientific Research Projects Management Unit, scientific research project of Feyza Altunbey Özbay, numbered MF.23.49.
Funding: supported by the Ningxia Natural Science Foundation Project (2023AAC03361).
Abstract: The flying foxes optimization (FFO) algorithm, a newly introduced metaheuristic, is inspired by the survival tactics of flying foxes in heat-wave environments. FFO preferentially selects the best-performing individuals; this tendency causes newly generated solutions to remain closely tied to the current best candidate in the search area. To address this issue, this paper introduces an opposition-based learning search mechanism for the FFO algorithm (IFFO). Firstly, niching techniques are introduced to improve the survival-list method, which not only considers the fitness of individuals but also the population's crowding degree, enhancing the global search capability. Secondly, an opposition-based learning initialization strategy is used to perturb the initial population and elevate its quality. Finally, to verify the superiority of the improved search mechanism, IFFO, FFO and cutting-edge metaheuristic algorithms are compared and analyzed on a set of test functions. The results show that, compared with the other algorithms, IFFO is characterized by rapid convergence, precise results and robust stability.
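The opposition-based initialization described above has a simple generic form. This is a hedged sketch of the idea, not the published IFFO code: generate a random population, form its opposite population, and keep the fitter half of the union. The Rastrigin objective and all sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
lb, ub, dim, n = -5.12, 5.12, 3, 10

def rastrigin(x):
    # Classic multimodal test function; lower is better.
    return 10 * x.shape[-1] + np.sum(x**2 - 10 * np.cos(2 * np.pi * x), axis=-1)

rand_pop = rng.uniform(lb, ub, size=(n, dim))
opp_pop = lb + ub - rand_pop                        # opposite of each candidate
union = np.vstack([rand_pop, opp_pop])
init_pop = union[np.argsort(rastrigin(union))][:n]  # fitter half survives
print(init_pop.shape)
```

By construction, the selected population's mean fitness is never worse than that of the purely random population, which is the quality boost the abstract refers to.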
Funding: supported by the Australian Government through the Australian Research Council's Discovery Projects funding scheme (Project DP190101592) and the National Natural Science Foundation of China (Grant Nos. 41972280 and 52179103).
Abstract: The travel time of rock compressional waves is an essential parameter used for estimating important rock properties, such as porosity, permeability, and lithology. Current methods, like wireline logging tests, provide broad measurements but lack finer resolution. Laboratory-based rock core measurements offer higher resolution but are resource-intensive. Conventionally, wireline logging and rock core measurements have been used independently. This study introduces a novel approach that integrates both data sources. The method leverages the detailed features from limited core data to enhance the resolution of wireline logging data. By combining machine learning with random field theory, the method allows for probabilistic predictions in regions with sparse data sampling. In this framework, 12 parameters from wireline tests are used to predict trends in rock core data, and the residuals are modeled using random field theory. The outcomes are high-resolution predictions that combine both the predicted trend and probabilistic realizations of the residual. By utilizing unconditional and conditional random field theories, this method enables unconditional and conditional simulations of the underlying high-resolution rock compressional wave travel time profile and provides uncertainty estimates. This integrated approach optimizes the use of existing core and logging data. Its applicability is confirmed in an oil project in West China.
Funding: supported by the First Batch of Teaching Reform Projects of Zhejiang Higher Education "14th Five-Year Plan" (jg20220434), the Special Scientific Research Project for Space Debris and Near-Earth Asteroid Defense (KJSP2020020202), the Natural Science Foundation of Zhejiang Province (LGG19F030010), and the National Natural Science Foundation of China (61703183).
Abstract: As a new bionic algorithm, Spider Monkey Optimization (SMO) has been widely applied to various complex optimization problems in recent years. However, the space exploration power of SMO is limited and the diversity of its population is not abundant. Thus, this paper focuses on how to reconstruct SMO to improve its performance, and a novel spider monkey optimization algorithm with opposition-based learning and orthogonal experimental design (SMO^(3)) is developed. A position updating method based on the historical optimal domain and particle swarm for the Local Leader Phase (LLP) and Global Leader Phase (GLP) is presented to improve the diversity of the SMO population. Moreover, an opposition-based learning strategy based on the self-extremum is proposed to avoid premature convergence and getting stuck at locally optimal values. Also, a local worst-individual elimination method based on orthogonal experimental design is used to help the SMO algorithm eliminate poor individuals in time. Furthermore, an extended SMO^(3) named CSMO^(3) is investigated to deal with constrained optimization problems. The proposed algorithm is applied to both unconstrained and constrained functions, including the CEC2006 benchmark set and three engineering problems. Experimental results show that the performance of the proposed algorithm is better than three well-known SMO variants and other evolutionary algorithms on both unconstrained and constrained problems.
Funding: funded by the National Key Research and Development Program of China (No. 2023YFB3812200).
Abstract: A machine learning (ML)-based random forest (RF) classification model was employed to investigate the main factors affecting the formation of the core-shell structure of BaTiO₃-based ceramics, and its interpretability was analyzed using Shapley additive explanations (SHAP). With an optimal feature set containing only 5 features, the RF classification improved the F1-score from 0.8795 to 0.9310, accuracy from 0.8450 to 0.9070, precision from 0.8714 to 0.9000, and recall from 0.8929 to 0.9643, and achieved an ROC/AUC value of 0.97±0.03, demonstrating the high accuracy and robustness of our model. During the interpretability analysis of the model, it was found that the electronegativity, melting point, and sintering temperature of the dopant contribute strongly to the formation of the core-shell structure. Based on these characteristics, specific ranges were delineated and twelve elements meeting all the requirements were finally obtained, namely Si, Sc, Mn, Fe, Co, Ni, Pd, Er, Tm, Lu, Pa, and Cm. In exploring the core-shell structure, candidate doping elements can thus be effectively shortlisted by choosing the range of features.
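The evaluation protocol above (RF classifier scored by F1, accuracy, precision, and recall) can be sketched as follows. The synthetic 5-feature dataset and hyperparameters are illustrative assumptions; the ceramic-composition features and the SHAP analysis from the study are not reproduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 5-feature dopant dataset.
X, y = make_classification(n_samples=400, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
scores = {
    "f1": f1_score(y_te, pred),
    "accuracy": accuracy_score(y_te, pred),
    "precision": precision_score(y_te, pred),
    "recall": recall_score(y_te, pred),
}
print(scores)
```

In the study, these scores were compared before and after feature selection to justify the reduced 5-feature set.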
Funding: supported by the Cuiying Scientific and Technological Innovation Program of the Second Hospital, Clinical Medical School of Lanzhou University (No. CY2021-BJ-A16 and No. CY2022-QN-A18) and the Lanzhou Science and Technology Development Guidance Plan Project (No. 2023-ZD-85).
Abstract: BACKGROUND: Liver cancer is one of the most prevalent malignant tumors worldwide, and its early detection and treatment are crucial for enhancing patient survival rates and quality of life. However, the early symptoms of liver cancer are often not obvious, resulting in a late-stage diagnosis in many patients, which significantly reduces the effectiveness of treatment. Developing a highly targeted, widely applicable, and practical risk prediction model for liver cancer is crucial for enhancing early diagnosis and long-term survival rates among affected individuals. AIM: To develop a liver cancer risk prediction model employing machine learning techniques, and subsequently assess its performance. METHODS: In this study, a total of 550 patients were enrolled, with 190 hepatocellular carcinoma (HCC) and 195 cirrhosis patients serving as the training cohort, and 83 HCC and 82 cirrhosis patients forming the validation cohort. Logistic regression (LR), support vector machine (SVM), random forest (RF), and least absolute shrinkage and selection operator (LASSO) regression models were developed in the training cohort, and model performance was assessed in the validation cohort. Additionally, this study conducted a comparative evaluation of the diagnostic efficacy between the ASAP model and the models developed in this study, using receiver operating characteristic curve, calibration curve, and decision curve analysis (DCA) to determine the optimal predictive model for assessing liver cancer risk. RESULTS: Six variables, including age, white blood cell, red blood cell, and platelet counts, and alpha-fetoprotein and protein induced by vitamin K absence or antagonist II levels, were used to develop the LR, SVM, RF, and LASSO regression models. The RF model exhibited superior discrimination: the areas under the curve for the training and validation sets were 0.969 and 0.858, respectively, significantly surpassing those of the LR (0.850 and 0.827), SVM (0.860 and 0.803), LASSO regression (0.845 and 0.831), and ASAP (0.866 and 0.813) models. Furthermore, calibration and DCA indicated that the RF model exhibited robust calibration and clinical validity. CONCLUSION: The RF model demonstrated excellent prediction capability for HCC and can facilitate its early diagnosis in clinical practice.
Funding: funded by the Key-Area Research and Development Program Project of Guangdong Province (2021B0101230003) and the China Southern Power Grid Science and Technology Project (ZBKJXM20220004).
Abstract: As global demand for renewable energy grows, solar energy is gaining attention as a clean, sustainable energy source. Accurate assessment of solar energy resources is crucial for the siting and design of photovoltaic power plants. This study proposes an integrated deep learning-based photovoltaic resource assessment method, fusing ensemble learning and deep learning for photovoltaic resource assessment for the first time. The proposed method combines the random forest, gated recurrent unit, and long short-term memory models to effectively improve the accuracy and reliability of photovoltaic resource assessment, and it retains strong adaptability and high accuracy even over complex terrain and landscapes. The experimental results show that the proposed method outperforms the comparison algorithms on all evaluation indexes, indicating higher accuracy and reliability in photovoltaic resource assessment, with improved generalization performance over a traditional single algorithm.
Abstract: Manual investigation of chest radiography (CXR) images by physicians is crucial for effective decision-making in COVID-19 diagnosis. However, the high demand during the pandemic necessitates auxiliary help through image analysis and machine learning techniques. This study presents a multi-threshold-based segmentation technique to probe high-pixel-intensity regions in CXR images of various pathologies, including normal cases. Texture information is extracted using gray-level co-occurrence matrix (GLCM)-based features, while vessel-like features are obtained using Frangi, Sato, and Meijering filters. Machine learning models employing Decision Tree (DT) and Random Forest (RF) approaches are designed to categorize CXR images into common lung infections, lung opacity (LO), COVID-19, and viral pneumonia (VP). The results demonstrate that the fusion of texture and vessel-based features provides an effective ML model for aiding diagnosis. Model validation using performance measures, including an accuracy of approximately 91.8% with an RF-based classifier, supports the usefulness of the feature set and classifier model in categorizing the four different pathologies. Furthermore, the study investigates the importance of the devised features in identifying the underlying pathology and incorporates histogram-based analysis. This analysis reveals varying natural pixel distributions in CXR images belonging to the normal, COVID-19, LO, and VP groups, motivating the incorporation of additional features such as the mean, standard deviation, skewness, and percentiles of the filtered images. Notably, the study achieves a considerable improvement in distinguishing COVID-19 from LO, with a true positive rate of 97%, further substantiating the effectiveness of the implemented methodology.
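To make the GLCM texture descriptor concrete, here is a tiny hand-rolled illustration for one offset (each pixel and its right neighbour) on a toy 4-level image; a real CXR pipeline would use a library such as scikit-image rather than this counting loop, and the image values are assumptions for the demo.

```python
import numpy as np

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
levels = 4
glcm = np.zeros((levels, levels), dtype=int)
for i in range(img.shape[0]):
    for j in range(img.shape[1] - 1):      # offset (0, 1): pixel and right neighbour
        glcm[img[i, j], img[i, j + 1]] += 1

# Normalize to co-occurrence probabilities and derive one classic
# Haralick texture feature: contrast = sum_{i,j} (i - j)^2 * p(i, j).
p = glcm / glcm.sum()
idx = np.arange(levels)
contrast = np.sum((idx[:, None] - idx[None, :]) ** 2 * p)
print(glcm)
print(float(contrast))
```

Off-diagonal mass in the GLCM corresponds to abrupt gray-level transitions, so high contrast indicates coarse texture; features like this, computed over many offsets and angles, feed the DT/RF classifiers.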
Abstract: The freshness of fruits is considered one of the essential characteristics by which consumers judge their quality, flavor and nutritional value. The primary need for identifying rotten fruits is to ensure that only fresh, high-quality fruits are sold to consumers: rotten fruits can harbor harmful bacteria, molds and other microorganisms that can cause food poisoning and other illnesses. The overall purpose of the study is to classify rotten fruits, which can affect the taste, texture, and appearance of other fresh fruits and thereby reduce their shelf life. The agriculture and food industries are increasingly adopting computer vision technology to detect rotten fruits and forecast their shelf life. Hence, this research work mainly focuses on a Convolutional Neural Network (CNN) deep learning model for the classification of rotten fruits. The proposed methodology involves real-time analysis of a dataset of various types of fruits, including apples, bananas, oranges, papayas and guavas. Similarly, machine learning models such as Gaussian Naïve Bayes (GNB) and random forest are used to predict the fruit's shelf life. The results obtained from the various pre-trained models for rotten fruit detection are analysed based on accuracy score to determine the best model. In comparison with other pre-trained models, Visual Geometry Group 16 (VGG16) obtained a higher accuracy score of 95%. Likewise, the random forest model delivers a better accuracy score of 88% than GNB in forecasting fruit shelf life. By developing an accurate classification model, only fresh and safe fruits reach consumers, reducing the risks associated with contaminated produce. The proposed approach will therefore have a significant impact on the food industry through efficient fruit distribution and will also benefit customers purchasing fresh fruits.
Abstract: The application of machine learning (ML) algorithms in various fields of hepatology is an issue of interest. However, we must be cautious with the results. In this letter, based on a published ML prediction model for acute kidney injury after liver surgery, we discuss some limitations of ML models and how they may be addressed in the future. Although the field faces significant challenges, it also holds great potential.
Abstract: Survival rates following radical surgery for gastric neuroendocrine neoplasms (g-NENs) are low, with high recurrence rates. This fact impacts patient prognosis and complicates postoperative management. Traditional prognostic models, including the Cox proportional hazards (CoxPH) model, have shown limited predictive power for postoperative survival in g-NEN patients. Machine learning methods offer a unique opportunity to analyze complex relationships within datasets, providing tools and methodologies to assess the large volumes of high-dimensional, multimodal data generated by the biological sciences. These methods show promise in predicting outcomes across various medical disciplines. In the context of g-NENs, utilizing machine learning to predict survival outcomes holds potential for personalized postoperative management strategies. This editorial reviews a study exploring the advantages and effectiveness of the random survival forest (RSF) model, using the lymph node ratio (LNR), in predicting disease-specific survival (DSS) in postoperative g-NEN patients stratified into low-risk and high-risk groups. The findings demonstrate that the RSF model incorporating the LNR outperformed the CoxPH model in predicting DSS and constitutes an important step toward precision medicine.
Funding: supported by projects of the China Geological Survey (DD20221729, DD20190291) and the Zhuhai Urban Geological Survey (including informatization) (MZCD-2201-008).
Abstract: Machine learning is currently one of the research hotspots in the field of landslide prediction. To clarify and evaluate the differences in characteristics and prediction effects of different machine learning models, Conghua District, the district of Guangzhou most prone to landslide disasters, was selected for landslide susceptibility evaluation. The evaluation factors were selected using correlation analysis and the variance inflation factor method. Landslide models were constructed with four machine learning methods, namely Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGB), and comparative analysis and evaluation of the models were conducted through statistical indices and receiver operating characteristic (ROC) curves. The results showed that the LR, RF, SVM, and XGB models all have good predictive performance for landslide susceptibility, with area under the curve (AUC) values of 0.752, 0.965, 0.996, and 0.998, respectively. The XGB model had the highest predictive ability, followed by the RF, SVM, and LR models. The frequency ratio (FR) accuracy of the LR, RF, SVM, and XGB models was 0.775, 0.842, 0.759, and 0.822, respectively. The RF and XGB models were superior to the LR and SVM models, indicating that integrated algorithms have better predictive ability than single classification algorithms in regional landslide classification problems.
Funding: supported by the National Natural Science Foundation of China (No. 52067021), the Natural Science Foundation of Xinjiang (2022D01C35), the Excellent Youth Scientific and Technological Talents Plan of Xinjiang (No. 2019Q012), and the Major Science and Technology Special Project of Xinjiang Uygur Autonomous Region (2022A01002-2).
Abstract: The power transformer is one of the most crucial devices in a power grid, so it is significant to determine incipient faults of power transformers fast and accurately. Input features play a critical role in fault diagnosis accuracy. In order to further improve the fault diagnosis performance of power transformers, a random forest feature selection method coupled with an optimized kernel extreme learning machine is presented in this study. Firstly, the random forest feature selection approach is adopted to rank 42 related input features derived from gas concentration, gas ratio and energy-weighted dissolved gas analysis. Afterwards, a kernel extreme learning machine tuned by the Aquila optimization algorithm is implemented to adjust crucial parameters and select the optimal feature subsets, with diagnosis accuracy used to assess the fault diagnosis capability of the candidate feature subsets. Finally, the optimal feature subsets are applied to establish the fault diagnosis model. According to experimental results on two public datasets and a comparison with five conventional approaches, the average accuracy of the proposed method is up to 94.5%, superior to that of the other conventional approaches. The fault diagnosis performance verifies that the optimal feature subset obtained by the presented method can dramatically improve power transformer fault diagnosis accuracy.
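The first stage of the pipeline (ranking input features by random-forest importance and keeping a top-k subset) can be sketched as follows. The synthetic data stands in for the 42 dissolved-gas features; the kernel extreme learning machine and Aquila optimizer stages are not reproduced here, and the choice of k is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the dissolved-gas-analysis feature matrix.
X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank features by impurity-based importance, most important first.
ranking = np.argsort(rf.feature_importances_)[::-1]
top_k = ranking[:5]          # candidate feature subset passed to the next stage
print(top_k.tolist())
```

In the full method, several such candidate subsets would be evaluated by the tuned classifier's diagnosis accuracy to pick the final optimal subset.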
Abstract: Early stroke prediction is vital to prevent damage. A stroke happens when the blood flow to the brain is disrupted by a clot or bleeding, resulting in brain death or injury. Early diagnosis and treatment reduce long-term care needs and lower health costs. We aim for this research to provide a machine-learning method for forecasting early warning signs of stroke. Our methodology employed feature selection techniques and multiple algorithms. Utilizing the XGBoost algorithm, the proposed model achieved an accuracy rate of 96.45%. This research shows that machine learning can effectively predict early warning signs of stroke, which can help reduce long-term treatment and rehabilitation needs and lower health costs.
Abstract: The performance of state-of-the-art deep reinforcement learning algorithms such as Proximal Policy Optimization, Twin Delayed Deep Deterministic Policy Gradient, and Soft Actor-Critic for generating a quadruped walking gait in a virtual environment was presented in previous research work titled “A Comparison of PPO, TD3, and SAC Reinforcement Algorithms for Quadruped Walking Gait Generation”. We demonstrated that the Soft Actor-Critic reinforcement algorithm had the best performance generating the walking gait for a quadruped in certain instances of sensor configurations in the virtual environment. In this work, we present a performance analysis of the same deep reinforcement learning algorithms for quadruped walking gait generation in a physical environment. The performance is determined in the physical environment by transfer learning augmented by real-time reinforcement learning for gait generation on a physical quadruped. The performance is analyzed on a quadruped equipped with a range of sensors: position tracking using a stereo camera, contact sensing of each of the robot legs through force-resistive sensors, and proprioceptive information of the robot body and legs using nine inertial measurement units. The performance comparison is presented using metrics associated with the walking gait: average forward velocity (m/s), average forward velocity variance, average lateral velocity (m/s), average lateral velocity variance, and quaternion root mean square deviation. The strengths and weaknesses of each algorithm for the given task on the physical quadruped are discussed.
Abstract: Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I tackle class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector Machine, Random Forest, XGBoost, and Light Gradient-Boosting Machine, to classify transactions as fraudulent or genuine. Rigorous evaluation metrics, such as AUC, PRAUC, F1, KS, Recall, and Precision, identify the Random Forest as the best performer in detecting fraudulent activities. The Random Forest model successfully identifies approximately 92% of transactions scoring 90 and above as fraudulent, equating to a detection rate of over 70% for all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability, with the SHAP summary plot highlighting the global importance of individual features, such as “V12” and “V14”. SHAP force plots offer local interpretability, revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigating financial losses and protecting consumers.
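The SMOTE idea used above can be illustrated with a simplified, hand-rolled version: synthesize minority-class samples by interpolating between a minority point and one of its minority-class nearest neighbours. Production code would use imbalanced-learn's `SMOTE`; the class sizes, `k`, and Gaussian clusters here are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(3)
majority = rng.normal(0.0, 1.0, size=(95, 2))   # "genuine" transactions
minority = rng.normal(3.0, 0.5, size=(5, 2))    # rare "fraud" transactions
k, n_new = 3, 90                                # neighbours considered, samples to add

synthetic = np.empty((n_new, 2))
for i in range(n_new):
    a = minority[rng.integers(len(minority))]
    d = np.linalg.norm(minority - a, axis=1)
    nbrs = np.argsort(d)[1:k + 1]               # k nearest minority neighbours (skip self)
    b = minority[rng.choice(nbrs)]
    synthetic[i] = a + rng.random() * (b - a)   # random point on the segment a -> b

balanced_minority = np.vstack([minority, synthetic])
print(len(majority), len(balanced_minority))    # classes now roughly balanced
```

Because every synthetic point lies on a segment between two real minority points, the oversampled class stays inside the minority region instead of duplicating exact records.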
Abstract: Customer churn poses a significant challenge for the banking and finance industry in the United States, directly affecting profitability and market share. This study conducts a comprehensive comparative analysis of machine learning models for customer churn prediction, focusing on the U.S. context. The research evaluates the performance of logistic regression, random forest, and neural networks using industry-specific datasets, considering the economic impact and practical implications of the findings. The exploratory data analysis reveals unique patterns and trends in the U.S. banking and finance industry, such as the age distribution of customers and the prevalence of dormant accounts. The study incorporates macroeconomic factors to capture the potential influence of external conditions on customer churn behavior. The findings highlight the importance of leveraging advanced machine learning techniques and comprehensive customer data to develop effective churn prevention strategies in the U.S. context. By accurately predicting customer churn, financial institutions can proactively identify at-risk customers, implement targeted retention strategies, and optimize resource allocation. The study discusses the limitations and potential future improvements, serving as a roadmap for researchers and practitioners to further advance the field of customer churn prediction in the evolving landscape of the U.S. banking and finance industry.
Abstract: The increasing amount and intricacy of network traffic in the modern digital era have worsened the difficulty of identifying abnormal behaviours that may indicate potential security breaches or operational interruptions. Conventional detection approaches struggle to keep up with the ever-changing strategies of cyber-attacks, resulting in heightened susceptibility and significant harm to network infrastructures. To tackle this urgent issue, this project focused on developing an effective anomaly detection system that utilizes machine learning. The suggested model uses contemporary machine learning algorithms and frameworks to autonomously detect deviations from typical network behaviour, promptly identifying anomalous activities that may indicate security breaches or performance difficulties. The solution entails a multi-faceted approach encompassing data collection, preprocessing, feature engineering, model training, and evaluation. The model is trained on a wide range of datasets that include both regular and abnormal network traffic patterns, ensuring that it can adapt to numerous scenarios. The main priority is to ensure that the system is functional and efficient, with a particular emphasis on reducing false positives to avoid unwanted alerts. Additionally, efforts are directed at improving anomaly detection accuracy so that the model can consistently distinguish between potentially harmful and benign activity. This project aims to greatly strengthen network security by addressing emerging cyber threats and improving resilience and reliability.
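One common concrete realization of such a detector is an Isolation Forest trained on normal traffic features, which then flags points that deviate from the learned profile. The abstract does not name a specific algorithm, so this is a hedged sketch; the feature columns, cluster parameters, and contamination rate are all assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(4)
# Synthetic "normal" traffic features (e.g. scaled packet rate, size, duration)
# and a small batch of clearly out-of-profile traffic.
normal = rng.normal(0.0, 1.0, size=(500, 3))
attack = rng.normal(6.0, 1.0, size=(10, 3))

detector = IsolationForest(contamination=0.02, random_state=0).fit(normal)
pred = detector.predict(np.vstack([normal, attack]))   # +1 = normal, -1 = anomaly

flagged_attacks = int((pred[-10:] == -1).sum())
print(flagged_attacks)   # how many injected anomalies were caught
```

Tuning `contamination` directly trades detection rate against the false-positive rate that the text identifies as the main operational concern.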
Abstract: Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation using standard sequential methods. This work analyzes the performance gains from parallel versus sequential hyperparameter optimization. Using scikit-learn's RandomizedSearchCV, this project tuned a Random Forest classifier for fake news detection via randomized grid search. Setting n_jobs to -1 enabled full parallelization across CPU cores. Results show the parallel implementation achieved over 5× faster CPU times and 3× faster total run times compared to sequential tuning. However, test accuracy slightly dropped from 99.26% sequentially to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the significant computational gains allow more extensive hyperparameter exploration within reasonable timeframes, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
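The setup described above can be reproduced minimally: a RandomizedSearchCV over a RandomForestClassifier with `n_jobs=-1` for full CPU parallelism. The toy dataset and the small search space are assumptions standing in for the fake-news features used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy stand-in for the fake-news feature matrix.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

param_dist = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
    "max_features": ["sqrt", "log2"],
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=5,          # candidates sampled from the grid
    cv=3,
    n_jobs=-1,         # -1 = use every available CPU core
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Switching `n_jobs` between `1` and `-1` (with everything else fixed) is exactly the sequential-versus-parallel comparison whose timing and accuracy trade-off the project measured.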