Funding: Supported partially by the Post Doctoral Natural Science Foundation of China (2013M532118, 2015T81082), the National Natural Science Foundation of China (61573364, 61273177, 61503066), the State Key Laboratory of Synthetical Automation for Process Industries, the National High Technology Research and Development Program of China (2015AA043802), and the Scientific Research Fund of Liaoning Provincial Education Department (L2013272).
Abstract: The strong mechanical vibration and acoustic signals of the grinding process contain useful information about load parameters in ball mills. Extracting latent features and constructing a soft sensor model from the high-dimensional frequency spectra of these signals is challenging. This paper develops a selective ensemble modeling approach based on nonlinear latent frequency spectral feature extraction for accurate measurement of the material-to-ball volume ratio. Latent features are first extracted from different vibration and acoustic spectral segments by kernel partial least squares. Bootstrap resampling and least squares support vector machines are then used to produce candidate sub-models with these latent features as inputs. Ensemble sub-models are selected using a genetic algorithm optimization toolbox. Partial least squares regression combines the selected sub-models to eliminate collinearity among their prediction outputs. Results indicate that the proposed modeling approach has better prediction performance than previous ones.
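The selection step described above can be illustrated with a small sketch. It assumes the candidate sub-models have already been trained and that their predictions on a validation set are available. A plain average stands in for the partial least squares combination, and a hand-written genetic algorithm stands in for the optimization toolbox. All names (`ga_select`, `ensemble_rmse`, the synthetic data) are hypothetical and not taken from the paper.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def ensemble_rmse(mask, preds, y_true):
    """RMSE of the simple average of the sub-models switched on in `mask`."""
    chosen = [p for keep, p in zip(mask, preds) if keep]
    if not chosen:
        return float("inf")  # an empty ensemble is never acceptable
    avg = [sum(col) / len(chosen) for col in zip(*chosen)]
    return rmse(y_true, avg)


def ga_select(preds, y_true, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search for a 0/1 mask over sub-models that minimizes validation RMSE."""
    rng = random.Random(seed)
    n = len(preds)

    def fitness(mask):
        return ensemble_rmse(mask, preds, y_true)

    # Start from the full ensemble plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # elitism: the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)


# Synthetic validation data: six candidate sub-models with different noise levels.
data_rng = random.Random(1)
y_val = [0.1 * i for i in range(30)]
noise_levels = [0.05, 0.05, 0.5, 0.1, 1.0, 0.05]
val_preds = [[y + data_rng.gauss(0, s) for y in y_val] for s in noise_levels]

best_mask = ga_select(val_preds, y_val)
print("selected sub-models:", best_mask)
print("RMSE, all sub-models:", ensemble_rmse([1] * len(val_preds), val_preds, y_val))
print("RMSE, selected subset:", ensemble_rmse(best_mask, val_preds, y_val))
```

Because the full ensemble is in the initial population and the better half always survives, the selected subset is never worse on the validation set than averaging every candidate.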
Funding: This research was financially supported by the Ministry of Small and Medium-sized Enterprises (SMEs) and Startups (MSS), Korea, under the “Regional Specialized Industry Development Program (R&D, S2855401)” supervised by the Korea Institute for Advancement of Technology (KIAT).
Abstract: Despite the advances in smart grids over the last decades, energy consumption forecasting from meteorological features remains challenging. This paper proposes a genetic algorithm-based adaptive error curve learning ensemble (GA-ECLE) model. The proposed technique copes with stochastic variations to improve energy consumption forecasting using a machine learning-based ensemble approach. A modified ensemble model that uses the model error as a feature is employed to improve forecast accuracy. This approach combines three models, namely CatBoost (CB), Gradient Boost (GB), and Multilayer Perceptron (MLP). Internally, the ensembled CB-GB-MLP model generates meta-data from the Gradient Boosting and CatBoost models, and the Multilayer Perceptron network uses it to compute the final predictions. A genetic algorithm is used to obtain the optimal features for the model. To demonstrate the proposed model's effectiveness, we use a four-phase technique on real energy consumption data from Jeju island. In the first phase, we obtain results by applying the CB-GB-MLP model. In the second phase, we use a GA-ensembled model with optimal features. The third phase compares the energy forecasting result with the proposed ECL-based model. In the fourth and final phase, we apply the GA-ECLE model. We obtained a mean absolute error of 3.05 and a root mean square error of 5.05. Extensive experimental results are provided, demonstrating the superiority of the proposed GA-ECLE model over traditional ensemble models.
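For readers unfamiliar with the two error measures reported above, the following minimal sketch shows how mean absolute error and root mean square error are computed from a forecast. The consumption values are made up for illustration and are unrelated to the Jeju island data.

```python
import math


def mean_absolute_error(actual, forecast):
    """Average magnitude of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def root_mean_square_error(actual, forecast):
    """Square root of the average squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Hypothetical hourly consumption values and forecasts (illustrative only).
actual = [100.0, 120.0, 90.0, 110.0]
forecast = [98.0, 125.0, 90.0, 104.0]

mae = mean_absolute_error(actual, forecast)      # errors 2, 5, 0, 6 give 13 / 4 = 3.25
rmse = root_mean_square_error(actual, forecast)  # sqrt((4 + 25 + 0 + 36) / 4) = sqrt(16.25)
print(f"MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few errors are much larger than the rest, which is one reason the two are usually reported together.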
Funding: Supported by the National Natural Science Foundation of China (Nos. 62071079 and 61803065).
Abstract: N^(6)-Methyladenine (m6A) is a dynamic and reversible post-transcriptional modification, which plays an essential role in various biological processes. Because of the current inability to identify m6A-containing mRNAs, computational approaches have been developed to identify m6A sites in DNA sequences. Aiming to improve prediction performance, we introduced a novel ensemble computational approach based on three hybrid deep neural networks, including a convolutional neural network, a capsule network, and a bidirectional gated recurrent unit (BiGRU) with the self-attention mechanism, to identify m6A sites in four tissues of three species. Across a total of 11 datasets, we selected different feature subsets, optimized from 4933-dimensional features, as input for the deep hybrid neural networks. In addition, to reduce the bias caused by the relatively small number of experimentally verified samples, we constructed an ensemble model by integrating five sub-classifiers trained on different training datasets. In 5-fold cross-validation and independent tests, our model outperformed the previous methods im6A-TS-CNN and iRNA-m6A.
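The evaluation protocol and the integration of five sub-classifiers can be sketched in a few lines. The abstract does not state how the sub-classifier outputs are combined, so majority voting is an assumption here, and the helper names `k_fold_indices` and `majority_vote` are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def majority_vote(votes):
    """Combine 0/1 labels from an odd number of sub-classifiers."""
    return int(sum(votes) * 2 > len(votes))


splits = list(k_fold_indices(23, k=5))
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} training samples, {len(test)} test samples")

# Five hypothetical sub-classifier outputs for one candidate site.
print("ensemble label:", majority_vote([1, 1, 0, 0, 1]))
```

Each sample lands in exactly one test fold, so every sample is used for testing once and for training four times.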
Funding: This research work was funded by Institutional Fund Projects under Grant No. (IFPIP: 1614-611-1442) from the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.
Abstract: Breast cancer is one of the leading cancers among women. It has the second-highest mortality rate in women after lung cancer. Timely detection, especially in the early stages, can help increase survival rates. However, manual diagnosis of breast cancer is a tedious and time-consuming process, and the accuracy of detection depends on the quality of the images and the radiologist's experience. Computer-aided medical diagnosis has recently shown promising results, motivating the development of an efficient system that can aid radiologists in diagnosing breast cancer in its early stages. The research presented in this paper focuses on the multi-class classification of breast cancer. A deep transfer learning approach is used to train the deep learning models, and a pre-processing technique is used to improve the quality of the ultrasound dataset. The proposed technique combines two deep learning models, MobileNetV2 and DenseNet201, into a deep ensemble model. The deep learning models are fine-tuned, together with hyperparameter tuning, to achieve better results. Subsequently, entropy-based feature selection is used. Breast cancer identification using the proposed classification approach attained an accuracy of 97.04%, while the sensitivity and F1 score were 96.87% and 96.76%, respectively. The proposed model outperforms other state-of-the-art techniques presented in the literature.
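As a rough illustration of how two classifiers can be combined and scored in a multi-class setting, the sketch below averages class probabilities (soft voting) and computes macro-averaged sensitivity and F1. Both the combination rule and the macro averaging are assumptions, since the abstract does not specify either. The labels and probabilities are invented.

```python
def soft_vote(probs_a, probs_b):
    """Average two models' class probabilities and return the winning class index."""
    avg = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return max(range(len(avg)), key=avg.__getitem__)


def macro_sensitivity_f1(y_true, y_pred, n_classes):
    """Macro-averaged sensitivity (recall) and F1 over all classes."""
    recalls, f1s = [], []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        f1s.append(f1)
    return sum(recalls) / n_classes, sum(f1s) / n_classes


# Hypothetical three-class example (class indices 0, 1, 2).
label = soft_vote([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])  # averaged: [0.4, 0.5, 0.1]
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
sensitivity, f1 = macro_sensitivity_f1(y_true, y_pred, n_classes=3)
print(f"soft-vote label = {label}, sensitivity = {sensitivity:.4f}, F1 = {f1:.4f}")
```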
Abstract: A metamaterial antenna is a subclass of antenna that uses metamaterials to improve performance. Metamaterial antennas can overcome the bandwidth constraint associated with small antennas. Machine learning is attracting considerable interest for optimizing solutions in a variety of areas. Machine learning methods are already a significant component of ongoing research and are anticipated to play a critical role in today's technology. The accuracy of a forecast is largely determined by the model used. The purpose of this article is to provide an optimal ensemble model for predicting the bandwidth and gain of the metamaterial antenna. Support Vector Machines (SVM), Random Forest, K-Neighbors Regressor, and Decision Tree Regressor were used as the base models. The Adaptive Dynamic Polar Rose Guided Whale Optimization method, named AD-PRS-Guided WOA, was used to pick the optimal features from the datasets. The suggested model is compared to models based on five variables and to the average ensemble model. The findings indicate that the presented model using Random Forest achieves a Root Mean Squared Error (RMSE) of 0.0102 for bandwidth and 0.0891 for gain. This is superior to the other models, and the model can accurately predict antenna bandwidth and gain.
Abstract: Heuristic or clustering-based time series aggregation methods are often used to reduce the temporal complexity of energy system models by selecting representative days. However, these methods potentially neglect relevant information in the time series (e.g., distribution parameters). To identify relevant time series parameters, feature selection algorithms can be applied. The present research contributes by (a) developing a new feature selection approach based on clustering, nested modeling and regression (CNR), which is designed for applications requiring high selectivity and using different data sets, (b) comparing and evaluating CNR against feature selection methods available from the literature (e.g., LASSO), and (c) identifying relevant information in the time series applied in energy system models, in particular those of demand, photovoltaic and wind. Results show that CNR achieves on average up to 101% lower mean absolute errors when methods are directly compared. Thus, CNR better identifies relevant information when the number of selected features is restricted. The disadvantage of CNR, however, is its high computational effort. A potential remedy is to combine it with another method (e.g., as pre-feature selection). In terms of relevant information, energy systems including photovoltaic are mainly characterized by the correlation between the demand and photovoltaic time series as well as the range and the 35% quantile of demand. When energy systems include wind power, the minimum and mean of wind as well as the correlation between the demand and wind time series are relevant characteristics. The implications of these findings are discussed.
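The time series characteristics named above (range, 35% quantile, and correlation between demand and photovoltaic output) are simple to compute. The sketch below assumes linear interpolation for the quantile and the Pearson coefficient for the correlation. The abstract does not say which definitions were used, and the six-point series are invented.

```python
import math


def quantile(values, q):
    """Quantile with linear interpolation between sorted neighbors (0 <= q <= 1)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical demand and photovoltaic series (illustrative units).
demand = [50, 60, 80, 90, 70, 55]
pv = [0, 10, 40, 45, 20, 5]

features = {
    "demand_range": max(demand) - min(demand),
    "demand_q35": quantile(demand, 0.35),
    "corr_demand_pv": pearson(demand, pv),
}
print(features)
```

A feature selection method such as the one described would receive a vector like `features` for each candidate aggregation and decide which entries matter for the energy system model.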
Funding: Sponsored in part by the National Natural Science Foundation of China (Nos. 61872217 and 61527812), the Industrial Internet Innovation & Development Project of the Ministry of Industry and Information Technology of China, the National Science and Technology Major Project (No. 2016ZX01038101), MIIT IT funds (Research and Application of TCN Key Technologies) of China, and the National Key Technology R&D Program (No. 2015BAG14B01-02).
Abstract: Railway transportation plays an important role in modern society. As China's massive railway transportation network continues to grow in total mileage and operation density, the energy consumption of trains has become a serious concern. For any given route, the geographic characteristics are known a priori, but the parameters of trains (e.g., loading and marshaling) vary from one trip to another. An extensive analysis of train operation data suggests that the control gear operation of trains is the most important factor affecting energy consumption. This observation implies that the problem of energy-efficient train driving has to be addressed by considering both the geographic information and the trip parameters. However, the problem is difficult to solve due to its high dimension, nonlinearity, complex constraints, and time-varying characteristics. Faced with these difficulties, we propose an energy-efficient train control framework based on a hierarchical ensemble learning approach. Through hierarchical refinement, we learn prediction models of speed and gear. The learned models can be used to derive optimized driving operations under real-time requirements. This study uses random forest as the classification algorithm and bagging with REPTree as the regression algorithm. We conduct an extensive study on the potential of bagging, decision trees, random forest, and feature selection to design an effective hierarchical ensemble learning framework. The proposed framework was verified through simulation. The average energy consumption of the proposed method is over 7% lower than that of human drivers.
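Abstract: Features extracted for rolling bearing fault diagnosis may include weakly relevant and redundant ones, so a wrapper-based distance evaluation technique (DET) is used for feature selection. For classifier design, a classification method called robust regression-variable predictive model based class discriminate (RRVPMCD) is proposed to reduce the influence of outliers on parameter estimation, and is thus expected to yield a more accurate predictive model. Following the wrapper scheme, the DET method first computes the sensitivity of each feature to the classes and, in combination with the RRVPMCD classifier, the most sensitive features are selected to form the feature vector matrix. The RRVPMCD method is then trained to build the predictive model, which is finally used for pattern recognition. Experimental results show that combining wrapper-based feature selection with RRVPMCD classification can effectively identify the working condition and fault type of rolling bearings.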