The amount of oxygen blown into the converter is one of the key parameters for the control of the converter blowing process, which directly affects the tap-to-tap time of the converter. In this study, a hybrid model based on an oxygen balance mechanism (OBM) and a deep neural network (DNN) was established for predicting the oxygen blowing time in the converter. A three-step method was utilized in the hybrid model. First, the oxygen consumption volume was predicted by the OBM model and the DNN model, respectively. Second, a more accurate oxygen consumption volume was obtained by integrating the OBM and DNN models. Finally, the converter oxygen blowing time was calculated from the oxygen consumption volume and the oxygen supply intensity of each heat. The proposed hybrid model was verified using actual data collected from an integrated steel plant in China and compared with a multiple linear regression model, the OBM model, and neural network models including the extreme learning machine, the back-propagation neural network, and the DNN. The test results indicate that the hybrid model with a network structure of 3 hidden layers, 32-16-8 neurons per hidden layer, and a learning rate of 0.1 has the best prediction accuracy and stronger generalization ability compared with the other models. The predicted hit ratio of the oxygen consumption volume within an error of ±300 m^3 is 96.67%; the determination coefficient (R^2) and root mean square error (RMSE) are 0.6984 and 150.03 m^3, respectively. The oxygen blowing time prediction hit ratio within an error of ±0.6 min is 89.50%; R^2 and RMSE are 0.9486 and 0.3592 min, respectively. As a result, the proposed model can effectively predict the oxygen consumption volume and oxygen blowing time in the converter.
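The three-step procedure can be sketched as follows. This is a minimal illustration only: the 0.5/0.5 blending weight, the volume estimates, and the supply intensity below are hypothetical stand-ins for the paper's OBM output, DNN output, and per-heat measurement.

```python
def predict_blowing_time(v_obm, v_dnn, supply_intensity, w_obm=0.5):
    """Blend the mechanism (OBM) and data-driven (DNN) volume estimates,
    then convert the blended volume into a blowing time."""
    v_blend = w_obm * v_obm + (1.0 - w_obm) * v_dnn   # step 2: integration
    return v_blend, v_blend / supply_intensity        # step 3: time = volume / intensity

# Hypothetical heat: OBM predicts 9000 m^3, DNN predicts 9200 m^3,
# oxygen supply intensity 600 m^3/min.
volume, minutes = predict_blowing_time(9000.0, 9200.0, 600.0)
```

A fixed average is only one way to integrate the two predictors; a regression over held-out heats could learn the weight instead.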
This article focuses on dynamic event-triggered mechanism (DETM)-based model predictive control (MPC) for T-S fuzzy systems. A hybrid dynamic variables-dependent DETM is carefully devised, which includes a multiplicative dynamic variable and an additive dynamic variable. The addressed DETM-based fuzzy MPC issue is described as a "min-max" optimization problem (OP). To facilitate the co-design of the MPC controller and the weighting matrix of the DETM, an auxiliary OP is proposed based on a new Lyapunov function and a new robust positive invariant (RPI) set that contain the membership functions and the hybrid dynamic variables. A dynamic event-triggered fuzzy MPC algorithm is developed accordingly, whose recursive feasibility is analysed by employing the RPI set. With the designed controller, the involved fuzzy system is ensured to be asymptotically stable. Two examples show that the new DETM and the DETM-based MPC algorithm have the advantage of reducing resource consumption while yielding the anticipated performance.
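A schematic sketch of a hybrid dynamic event-trigger of this kind: transmit only when the squared measurement error exceeds a threshold built from a multiplicative dynamic variable (scaling the state energy) plus an additive dynamic variable. The quadratic test, the update laws, and all constants below are illustrative assumptions, not the paper's exact triggering condition.

```python
def detm_step(err_sq, state_sq, eta_mul, eta_add, sigma=0.2, lam=0.9, rho=0.05):
    """One evaluation of a toy hybrid dynamic event-trigger.
    Fires when err_sq > sigma * eta_mul * state_sq + eta_add."""
    fire = err_sq > sigma * eta_mul * state_sq + eta_add
    # Dynamic variables decay each step and recover when no event fires,
    # so the trigger adapts to recent transmission activity.
    eta_mul = lam * eta_mul + (0.0 if fire else rho)
    eta_add = lam * eta_add + (0.0 if fire else rho)
    return fire, eta_mul, eta_add

fire, m, a = detm_step(err_sq=1.0, state_sq=1.0, eta_mul=1.0, eta_add=1.0)
```

Because the threshold grows while the channel is quiet, small errors are tolerated for longer, which is how such mechanisms cut transmissions relative to a static trigger.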
Ethylene glycol (EG) plays a pivotal role as a primary raw material in the polyester industry, and the syngas-to-EG route has become a significant technical route in production. The carbon monoxide (CO) gas-phase catalytic coupling to synthesize dimethyl oxalate (DMO) is a crucial process in the syngas-to-EG route, whereby the composition of the reactor outlet influences the ultimate quality of the EG product and the energy consumption of the subsequent separation process. However, measuring product quality in real time or establishing accurate dynamic mechanism models is challenging. To effectively model the DMO synthesis process, this study proposes a hybrid modeling strategy that integrates process mechanisms and data-driven approaches. The CO gas-phase catalytic coupling mechanism model is developed based on intrinsic kinetics and material balance, while a long short-term memory (LSTM) neural network is employed to predict the macroscopic reaction rate by leveraging temporal relationships derived from archived measurements. The proposed model is trained semi-supervised to accommodate limited-label data scenarios, leveraging historical data. By integrating these predictions with the mechanism model, the hybrid modeling approach provides reliable and interpretable forecasts of mass fractions. Empirical investigations validate the superiority of the proposed hybrid modeling approach over conventional data-driven models (DDMs) and other hybrid modeling techniques.
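The division of labor in such a hybrid model, a mechanistic balance integrated with a learned rate, can be sketched with a toy one-component balance. Here the "learned" rate predictor is a stub standing in for the LSTM, and the kinetics, residence time, and rate value are all hypothetical.

```python
def hybrid_outlet_fraction(c_in, rate_predictor, residence_time, n_steps=100):
    """Integrate a toy material balance dC/dt = -r(C) * C by explicit Euler,
    where rate_predictor(C) stands in for the data-driven (LSTM) rate estimate
    and the balance itself is the mechanistic part of the hybrid."""
    c, dt = c_in, residence_time / n_steps
    for _ in range(n_steps):
        c -= rate_predictor(c) * c * dt  # mechanism consumes the learned rate
    return c

# Stand-in learned rate: constant 0.5 (1/s); inlet fraction 1.0, residence 2 s.
c_out = hybrid_outlet_fraction(1.0, lambda c: 0.5, residence_time=2.0)
```

The interpretability claim comes from this structure: the network only supplies the rate term, while mass conservation is enforced by the balance equation.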
Forecasting river flow is crucial for the optimal planning, management, and sustainable use of freshwater resources. Many machine learning (ML) approaches have been enhanced to improve streamflow prediction. Hybrid techniques have been viewed as a viable method for enhancing the accuracy of univariate streamflow estimation compared to standalone approaches, and current researchers have also emphasised using hybrid models to improve forecast accuracy. Accordingly, this paper conducts an updated literature review of applications of hybrid models to streamflow estimation over the last five years, summarising data preprocessing, univariate machine learning modelling strategies, the advantages and disadvantages of standalone ML techniques, hybrid models, and performance metrics. This study focuses on two types of hybrid models: parameter optimisation-based hybrid models (OBH) and the hybridisation of parameter optimisation-based and preprocessing-based hybrid models (HOPH). Overall, this research supports the idea that meta-heuristic approaches precisely improve ML techniques. It is also one of the first efforts to comprehensively examine the efficiency of various meta-heuristic approaches (classified into four primary classes) hybridised with ML techniques. This study revealed that previous research applied swarm, evolutionary, physics, and hybrid metaheuristics in 77%, 61%, 12%, and 12% of cases, respectively. Finally, there is still room for improving OBH and HOPH models by examining different data preprocessing techniques and metaheuristic algorithms.
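The OBH pattern, a metaheuristic tuning an ML model's parameters against validation loss, can be illustrated with the simplest possible optimiser. Random search below is only a stand-in for the swarm/evolutionary/physics metaheuristics surveyed, and the quadratic "validation loss" is hypothetical.

```python
import random

def tune(loss, bounds, n_trials=200, seed=0):
    """Toy parameter-optimisation-based hybrid (OBH): a random-search
    'metaheuristic' picks the hyper-parameter minimising validation loss."""
    rng = random.Random(seed)
    lo, hi = bounds
    best_p, best_l = None, float("inf")
    for _ in range(n_trials):
        p = rng.uniform(lo, hi)
        l = loss(p)
        if l < best_l:
            best_p, best_l = p, l
    return best_p, best_l

# Stand-in validation loss with its optimum at p = 2.5.
p, l = tune(lambda p: (p - 2.5) ** 2, bounds=(0.0, 5.0))
```

A real OBH model would replace the lambda with a full train-and-validate cycle of the streamflow model, and the random search with PSO, a genetic algorithm, or similar.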
The Indian Himalayan region is frequently experiencing climate change-induced landslides. Thus, landslide susceptibility assessment assumes greater significance for lessening the impact of a landslide hazard. This paper attempts to assess landslide susceptibility in the Shimla district of the northwest Indian Himalayan region. It examined the effectiveness of random forest (RF), multilayer perceptron (MLP), sequential minimal optimization regression (SMOreg) and bagging ensemble (B-RF, B-SMOreg, B-MLP) models. A landslide inventory map comprising 1052 locations of past landslide occurrences was classified into training (70%) and testing (30%) datasets. The site-specific influencing factors were selected by employing a multicollinearity test. The relationship between past landslide occurrences and influencing factors was established using the frequency ratio method. The effectiveness of the machine learning models was verified through performance assessors. The landslide susceptibility maps were validated by the area under the receiver operating characteristic curve (ROC-AUC), accuracy, precision, recall and F1-score. The key performance metrics and map validation demonstrated that the B-RF model (correlation coefficient: 0.988, mean absolute error: 0.010, root mean square error: 0.058, relative absolute error: 2.964, ROC-AUC: 0.947, accuracy: 0.778, precision: 0.819, recall: 0.917 and F1-score: 0.865) outperformed the single classifiers and the other bagging ensemble models for landslide susceptibility. The results show that the largest area falls under the very high susceptibility zone (33.87%), followed by the low (27.30%), high (20.68%) and moderate (18.16%) susceptibility zones. The factors of average annual rainfall, slope, lithology, soil texture and earthquake magnitude have been identified as the influencing factors for very high landslide susceptibility. Soil texture, lineament density and elevation have been attributed to high and moderate susceptibility. Thus, the study calls for devising suitable landslide mitigation measures in the study area. Structural measures, an immediate response system, community participation and coordination among stakeholders may help lessen the detrimental impact of landslides. The findings from this study could aid decision-makers in mitigating future catastrophes and devising suitable strategies in other geographical regions with similar geological characteristics.
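The frequency ratio method used above relates landslide occurrence to each class of an influencing factor: FR is the share of landslide points falling in a class divided by the share of the study area that class covers, with FR > 1 indicating a positive association. The counts in the example are hypothetical.

```python
def frequency_ratio(landslides_in_class, total_landslides, pixels_in_class, total_pixels):
    """FR = (share of landslide points in a factor class) /
            (share of the study area covered by that class).
    FR > 1: the class is positively associated with landsliding."""
    return (landslides_in_class / total_landslides) / (pixels_in_class / total_pixels)

# Hypothetical slope class: 30% of landslides on 10% of the area -> FR = 3.
fr = frequency_ratio(300, 1000, 10_000, 100_000)
```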
Class Title: Radiological imaging methods, a comprehensive overview. Purpose: This GPT paper provides an overview of the different forms of radiological imaging, the potential diagnostic capabilities they offer, and recent advances in the field. Materials and Methods: This paper provides an overview of conventional radiography, digital radiography, panoramic radiography, computed tomography and cone-beam computed tomography. Additionally, recent advances in radiological imaging are discussed, such as imaging diagnosis and modern computer-aided diagnosis systems. Results: This paper details the differences between the imaging techniques, the benefits of each, and the current advances in the field that aid in the diagnosis of medical conditions. Conclusion: Radiological imaging is an extremely important tool in modern medicine for assisting in medical diagnosis. This work provides an overview of the types of imaging techniques used, the recent advances made, and their potential applications.
A previously developed hybrid coupled model (HCM) is composed of an intermediate tropical Pacific Ocean model and a global atmospheric general circulation model (AGCM), denoted as HCMAGCM. In this study, different El Niño flavors, namely the Eastern-Pacific (EP) and Central-Pacific (CP) types, and the associated global atmospheric teleconnections are examined in a 1000-yr control simulation of the HCMAGCM. The HCMAGCM indicates profoundly different characteristics between EP and CP El Niño events in terms of related oceanic and atmospheric variables in the tropical Pacific, including the amplitude and spatial patterns of sea surface temperature (SST), zonal wind stress, and precipitation anomalies. An SST budget analysis indicates that the thermocline feedback and the zonal advective feedback dominantly contribute to the growth of EP and CP El Niño events, respectively. Corresponding to the shifts in tropical rainfall and deep convection during EP and CP El Niño events, the model also reproduces the differences in the extratropical atmospheric responses during the boreal winter. In particular, the EP El Niño tends to be dominant in exciting a poleward wave train pattern into the Northern Hemisphere, while the CP El Niño tends to preferably produce a wave train similar to the Pacific North American (PNA) pattern. As a result, different climatic impacts exist in North American regions, with a warm-north and cold-south pattern during an EP El Niño and a warm-northeast and cold-southwest pattern during a CP El Niño, respectively. This modeling result highlights the importance of internal natural processes within the tropical Pacific as they relate to the genesis of ENSO diversity, because active ocean-atmosphere coupling is allowed only in the tropical Pacific within the framework of the HCMAGCM.
Effort estimation plays a crucial role in software development projects, aiding in resource allocation, project planning, and risk management. Traditional estimation techniques often struggle to provide accurate estimates due to the complex nature of software projects. In recent years, machine learning approaches have shown promise in improving the accuracy of effort estimation models. This study proposes a hybrid model that combines Long Short-Term Memory (LSTM) and Random Forest (RF) algorithms to enhance software effort estimation. The proposed hybrid model takes advantage of the strengths of both the LSTM and RF algorithms. To evaluate the performance of the hybrid model, an extensive set of software development projects is used as the experimental dataset. The experimental results demonstrate that the proposed hybrid model outperforms traditional estimation techniques in terms of accuracy and reliability. The integration of LSTM and RF enables the model to efficiently capture temporal dependencies and non-linear interactions in the software development data. The hybrid model enhances estimation accuracy, enabling project managers and stakeholders to make more precise predictions of the effort needed for upcoming software projects.
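One common way to realise such an LSTM+RF combination is to blend a temporal predictor with a nonlinear one. The sketch below is a deliberately simplified stand-in: an exponentially weighted moving average plays the role of the LSTM (temporal dependence) and a fixed number plays the role of the RF output; the efforts, weights, and blending scheme are all hypothetical.

```python
def ewma(history, alpha=0.5):
    """Stand-in for the temporal (LSTM-like) component: exponentially
    weighted average of past project efforts (person-days)."""
    s = history[0]
    for x in history[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

def hybrid_effort(history, rf_like_estimate, w_temporal=0.5):
    """Blend the temporal estimate with a nonlinear (RF-like) estimate."""
    return w_temporal * ewma(history) + (1 - w_temporal) * rf_like_estimate

est = hybrid_effort([100.0, 120.0, 110.0], rf_like_estimate=115.0)
```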
A hybrid identification model based on multilayer artificial neural networks (ANNs) and the particle swarm optimization (PSO) algorithm is developed to improve the efficiency of simultaneously identifying the thermal conductivity and effective absorption coefficient of semitransparent materials. For the direct model, the spherical harmonic method and the finite volume method are used to solve the coupled conduction-radiation heat transfer problem in an absorbing, emitting, and non-scattering 2D axisymmetric gray medium in the setting of the laser flash method. For the identification part, the temperature field and the incident radiation field at different positions are first chosen as observables. Then, a traditional identification model based on the PSO algorithm is established. Finally, multilayer ANNs are built to fit and replace the direct model in the traditional identification model to speed up the identification process. The results show that, compared with the traditional identification model, the time cost of the hybrid identification model is reduced by about 1000 times. Besides, the hybrid identification model maintains a high level of accuracy even with measurement errors.
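The identification loop, a PSO searching parameter space to minimise the misfit between the (surrogate) direct model and the observations, can be sketched in one dimension. The misfit function, bounds, and PSO coefficients below are illustrative assumptions; in the paper the objective would call the ANN surrogate of the conduction-radiation solver.

```python
import random

def pso_min(f, lo, hi, n=20, iters=60, seed=1):
    """Minimal 1-D particle swarm optimisation: each particle tracks its
    personal best, and all particles are attracted to the global best."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    vs = [0.0] * n
    pbest = xs[:]
    gbest = min(xs, key=f)
    for _ in range(iters):
        for i in range(n):
            r1, r2 = rng.random(), rng.random()
            vs[i] = 0.7 * vs[i] + 1.5 * r1 * (pbest[i] - xs[i]) + 1.5 * r2 * (gbest - xs[i])
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i]
        gbest = min(pbest, key=f)
    return gbest

# Recover a hypothetical "thermal conductivity" of 3.0 from a quadratic misfit.
k = pso_min(lambda x: (x - 3.0) ** 2, 0.0, 10.0)
```

The speed-up reported in the abstract comes from the objective `f` being a cheap ANN evaluation instead of a full PDE solve inside this loop.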
In response to escalating challenges in energy conservation and emission reduction, this study delves into the complexities of heat transfer in two-phase flows and adjustments to combustion processes within coal-fired boilers. Utilizing a fusion of hybrid modeling and automation technologies, we develop soft measurement models for key combustion parameters, such as the net calorific value of coal, flue gas oxygen content, and fly ash carbon content, within the Distributed Control System (DCS). Validated with performance test data, these models exhibit controlled root mean square error (RMSE) and maximum absolute error (MAXE) values, both within the range of 0.203. Integrated into their respective automatic control systems, these models optimize two-phase flow heat transfer, fine-tune combustion conditions, and mitigate incomplete combustion. Furthermore, this paper conducts an in-depth exploration of the generation mechanism of nitrogen oxides (NOx) and low-oxygen emission reduction technology in coal-fired boilers, demonstrating a substantial reduction in furnace exit NOx generation of 30% to 40% and a decrease in power supply coal consumption of 1.62 g/(kW·h). The research outcomes highlight the model's rapid responsiveness, enabling prompt reflection of transient variations in various economic indicator parameters. This provides a more effective means for real-time monitoring of crucial variables in coal-fired boilers and facilitates timely combustion adjustments, underscoring notable achievements in boiler combustion. The research not only provides valuable and practical insights into the intricacies of two-phase flow heat transfer and heat exchange but also establishes a pioneering methodology for tackling industry challenges.
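The two validation metrics quoted above are standard. RMSE is the square root of the mean squared prediction error and MAXE is the largest absolute error; the sample oxygen-content values below are hypothetical.

```python
def rmse(pred, obs):
    """Root mean square error of predictions against observations."""
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5

def maxe(pred, obs):
    """Maximum absolute error: the worst single-point deviation."""
    return max(abs(p - o) for p, o in zip(pred, obs))

# Hypothetical soft-sensor predictions vs. performance-test measurements (% O2).
p, o = [20.1, 20.3, 19.8], [20.0, 20.0, 20.0]
r, m = rmse(p, o), maxe(p, o)
```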
Spatial heterogeneity refers to the variation or differences in characteristics or features across different locations or areas in space. Spatial data refers to information that explicitly or indirectly belongs to a particular geographic region or location, also known as geo-spatial data or geographic information. Focusing on spatial heterogeneity, we present a hybrid machine learning model combining two competitive algorithms: the Random Forest Regressor and a CNN. The model is fine-tuned using cross-validation for hyper-parameter adjustment and performance evaluation, ensuring robustness and generalization. Our approach integrates global Moran's I for examining global autocorrelation and local Moran's I for assessing local spatial autocorrelation in the residuals. To validate our approach, we implemented the hybrid model on a real-world dataset and compared its performance with that of traditional machine learning models. Results indicate superior performance, with an R-squared of 0.90, outperforming RF (0.84) and CNN (0.74). This study contributes to a detailed understanding of spatial variations in data, considering the geographical information (longitude and latitude) present in the dataset. Our results, also assessed using the Root Mean Squared Error (RMSE), indicated that the hybrid model yielded lower errors, showing a deviation of 53.65% from the RF model and 63.24% from the CNN model. Additionally, the global Moran's I index was observed to be 0.10. This study underscores that the hybrid model was able to correctly predict house prices both in clusters and in dispersed areas.
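Global Moran's I, used above to check residual autocorrelation, is I = (n/W) · Σᵢⱼ wᵢⱼ(xᵢ − x̄)(xⱼ − x̄) / Σᵢ(xᵢ − x̄)², where W is the sum of all spatial weights. A direct implementation, with a hypothetical four-site example whose alternating values give strong negative autocorrelation:

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a symmetric spatial weight
    matrix `weights` (list of lists, with weights[i][i] = 0)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Four sites on a line with rook adjacency; alternating high/low values.
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
i_stat = morans_i([10.0, 0.0, 10.0, 0.0], w)
```

Values near 0 (such as the 0.10 reported above) indicate little remaining spatial structure in the residuals; strongly positive or negative values indicate clustering or dispersion.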
The paper presents our contribution to the full 3D finite element modelling of a hybrid stepping motor using COMSOL Multiphysics software. This type of four-phase motor has a permanent magnet interposed between the two identical and coaxial half stators. The calculation of the field with or without current in the windings (respectively with or without the permanent magnet) is done using a mixed formulation with strong coupling. In addition, the local high saturation of the ferromagnetic material and the radial and axial components of the magnetic flux are taken into account. The results obtained make it possible to clearly observe, as a function of the intensity of the bus current or the remanent induction, the saturation zones, the lines, the orientations and the magnetic flux densities. 3D finite element modelling provides more accurate numerical data on the magnetic field through multiphysics analysis. This analysis considers the actual operating conditions and leads to the design of an optimized machine structure, with or without current in the windings and/or the permanent magnet.
Photovoltaic (PV) power generation is characterized by randomness and intermittency due to weather changes. Consequently, large-scale PV power connections to the grid can threaten the stable operation of the power system. An effective method to resolve this problem is to accurately predict PV power. In this study, an innovative short-term hybrid prediction model (HKSL) of PV power is established. The model combines K-means++, an optimal similar day approach, and a long short-term memory (LSTM) network, utilizing historical power data and meteorological factors. The model searches for the best similar day based on the results of classifying weather types; the data of the similar day are then inputted into the LSTM network to predict PV power. The validity of the hybrid model is verified on datasets from a PV power station in Shandong Province, China. Four evaluation indices, mean absolute error, root mean square error (RMSE), normalized RMSE, and mean absolute deviation, are employed to assess the performance of the HKSL model. The RMSE of the proposed model compared with those of Elman, LSTM, HSE (a hybrid model combining the similar day approach and Elman), HSL (a hybrid model combining the similar day approach and LSTM), and HKSE (a hybrid model combining K-means++, the similar day approach, and Elman) decreases by 66.73%, 70.22%, 65.59%, 70.51%, and 18.40%, respectively. This demonstrates the reliability and excellent performance of the proposed hybrid model in predicting power.
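The similar-day step can be sketched as a nearest-neighbour search over meteorological features: pick the historical day whose features are closest to the forecast day, then feed that day's power series to the LSTM. The feature triple and values below are hypothetical, and Euclidean distance is an assumed similarity measure, not necessarily the paper's.

```python
def best_similar_day(today, history):
    """Return the index of the historical day whose meteorological
    feature vector is closest (Euclidean) to today's."""
    def dist(feat):
        return sum((a - b) ** 2 for a, b in zip(feat, today)) ** 0.5
    return min(range(len(history)), key=lambda i: dist(history[i]))

# Hypothetical features: (irradiance W/m^2, temperature C, relative humidity).
days = [(800.0, 25.0, 0.40), (300.0, 15.0, 0.90), (796.0, 24.0, 0.45)]
idx = best_similar_day((795.0, 24.5, 0.42), days)
```

In the HKSL pipeline this search would be restricted to days in the same K-means++ weather cluster as the target day.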
This study proposes a combined hybrid energy storage system (HESS) and transmission grid (TG) model, and a corresponding time series operation simulation (TSOS) model is established to relieve the peak-shaving pressure of power systems under the integration of renewable energy. First, a linear model for the optimal operation of the HESS is established, which considers the different power-efficiency characteristics of the pumped storage system, the electrochemical storage system, and a new type of liquid compressed air energy storage. Second, a TSOS simulation model for peak shaving is built to maximize the power entering the grid from the wind farms and the HESS. Based on the proposed model, this study considers the transmission capacity of a TG: by adding the power-flow constraints of the TG, a TSOS-based HESS and TG combination model for peak shaving is established. Finally, the improved IEEE-39 and IEEE-118 bus systems were considered as examples to verify the effectiveness and feasibility of the proposed model.
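The peak-shaving role of storage in such a simulation can be illustrated with a greedy heuristic: discharge when load exceeds a threshold, charge in the valleys, within energy and power limits. This is a toy stand-in for the paper's linear optimisation model, and all numbers are hypothetical; a real TSOS model would also enforce the grid power-flow constraints.

```python
def peak_shave(load, capacity, power_limit, threshold):
    """Greedy storage dispatch over a load series (MW per period).
    Returns the shaved load profile and the final state of charge."""
    soc, shaved = capacity / 2.0, []   # start half full
    for l in load:
        if l > threshold:                              # discharge to cut the peak
            p = min(power_limit, l - threshold, soc)
            soc -= p
        else:                                          # charge from the valley
            p = -min(power_limit, threshold - l, capacity - soc)
            soc += -p
        shaved.append(l - p)
    return shaved, soc

shaved, soc = peak_shave([5.0, 12.0, 6.0, 14.0],
                         capacity=4.0, power_limit=2.0, threshold=10.0)
```

Here the 14 MW peak is cut to 12 MW while the valleys absorb the charging energy, which is the pressure-relief effect the TSOS model quantifies at system scale.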
Accurate soil moisture (SM) prediction is critical for understanding hydrological processes. Physics-based (PB) models exhibit large uncertainties in SM predictions arising from uncertain parameterizations and insufficient representation of land-surface processes. In addition to PB models, deep learning (DL) models have recently been widely used in SM prediction. However, few pure DL models have notably high success rates, due to their lack of physical information. Thus, we developed hybrid models to effectively integrate the outputs of PB models into DL models to improve SM predictions. To this end, we first developed a hybrid model based on the attention mechanism to take advantage of PB models at each forecast time scale (the attention model). We further built an ensemble model that combined the advantages of different hybrid schemes (the ensemble model). We utilized SM forecasts from the Global Forecast System to enhance the convolutional long short-term memory (ConvLSTM) model for 1–16 days of SM predictions. The performances of the proposed hybrid models were investigated and compared with two existing hybrid models. The results showed that the attention model could leverage the benefits of PB models and achieved the best predictability of drought events among the different hybrid models. Moreover, the ensemble model performed best among all hybrid models at all forecast time scales and different soil conditions. Notably, the ensemble model outperformed the pure DL model over 79.5% of in situ stations for 16-day predictions. These findings suggest that our proposed hybrid models can adequately exploit the benefits of PB model outputs to aid DL models in making SM predictions.
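The attention idea, weighting PB outputs against the DL forecast with softmax-normalised relevance scores, can be sketched as follows. The scores here are fixed and hypothetical; in the actual model they would be learned per forecast lead time.

```python
import math

def attention_blend(pb_forecasts, dl_forecast, scores):
    """Softmax-attention combination of physics-based (PB) forecasts with a
    DL forecast. `scores` are (hypothetical) learned relevance scores, one
    per input; the softmax turns them into weights summing to 1."""
    inputs = pb_forecasts + [dl_forecast]
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return sum(w * x for w, x in zip(weights, inputs)), weights

# Two PB soil-moisture forecasts and one DL forecast (m^3/m^3), equal scores.
sm, w = attention_blend([0.30, 0.34], 0.26, scores=[1.0, 1.0, 1.0])
```

With equal scores this reduces to a plain average; learned scores let the model lean on the PB forecast at lead times where physics is more reliable.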
Recently, researchers have shown increasing interest in combining more than one programming model in systems running on high-performance computing (HPC) systems to achieve exascale performance by applying parallelism at multiple levels. Combining different programming paradigms, such as the Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Open Accelerators (OpenACC), can increase computation speed and improve performance. During the integration of multiple models, the probability of runtime errors increases, making their detection difficult, especially in the absence of testing techniques that can detect these errors. Numerous studies have been conducted to identify such errors, but no technique exists for detecting errors in three-level programming models: despite the increasing research that integrates MPI, OpenMP, and OpenACC, a testing technology to detect the runtime errors that can arise from this integration, such as deadlocks and race conditions, has not been developed. Therefore, this paper begins with a definition and explanation of the runtime errors resulting from integrating the three programming models that compilers cannot detect. For the first time, this paper presents a classification of the operational errors that can result from the integration of the three models. This paper also proposes a parallel hybrid testing technique for detecting runtime errors in systems built in the C++ programming language that use the triple programming models MPI, OpenMP, and OpenACC. This hybrid technology combines static and dynamic techniques, given that some errors can be detected statically, whereas others can be detected only dynamically. The hybrid technique can detect more errors because it combines the two distinct technologies: the proposed static technique detects a wide range of error types in less time, whereas the portion of potential errors that may or may not occur depending on the operating environment is left to the dynamic technique, which completes the validation.
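The static half of such a hybrid tester can be illustrated with a toy source-scanning pass. The check below flags one well-known pattern, an OpenMP `parallel for` that accumulates into a shared scalar without a `reduction` clause or `atomic`, and is purely illustrative: the pattern name, the fixed variable `sum`, and the three-line lookahead are assumptions, not the paper's actual rule set.

```python
import re

def static_check(source):
    """Toy static pass: flag an OpenMP parallel-for that appears to update a
    shared scalar 'sum' without reduction/atomic (a classic data race)."""
    findings = []
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if "#pragma omp parallel for" in line and "reduction" not in line:
            body = "\n".join(lines[i + 1 : i + 4])   # peek at the loop body
            if re.search(r"\bsum\s*\+=", body) and "atomic" not in body:
                findings.append((i + 1, "possible race on shared 'sum'"))
    return findings

code = """#pragma omp parallel for
for (int i = 0; i < n; i++)
    sum += a[i];
"""
issues = static_check(code)
```

Errors that depend on runtime scheduling, such as an MPI deadlock that only appears for certain message orderings, are exactly the cases such a static pass must hand off to the dynamic phase.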
In an era marked by escalating cybersecurity threats, our study addresses the challenge of malware variant detection, a significant concern for a multitude of sectors including petroleum and mining organizations. This paper presents an innovative Application Programming Interface (API)-based hybrid model designed to enhance the detection performance of malware variants. This model integrates eXtreme Gradient Boosting (XGBoost) and an Artificial Neural Network (ANN) classifier, offering a potent response to the sophisticated evasion and obfuscation techniques frequently deployed by malware authors. The model's design capitalizes on the benefits of both static and dynamic analysis to extract API-based features, providing a holistic and comprehensive view of malware behavior. From these features, we construct two XGBoost predictors, each of which contributes a valuable perspective on the malicious activities under scrutiny. The outputs of these predictors, interpreted as malicious scores, are then fed into an ANN-based classifier, which processes this data to derive a final decision. The strength of the proposed model lies in its capacity to leverage behavioral and signature-based features and, most importantly, in its ability to extract and analyze the hidden relations between these two types of features. The efficacy of our proposed API-based hybrid model is evident in its performance metrics. It outperformed other models in our tests, achieving an impressive accuracy of 95% and an F-measure of 93%. This significantly improved the detection performance of malware variants, underscoring the value and potential of our approach in the challenging field of cybersecurity.
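The two-stage decision, two predictors' malicious scores fused by a small network, can be sketched with a single logistic unit standing in for the ANN combiner. The weights, bias, and scores below are hypothetical; the real classifier is a trained multi-layer network over the two XGBoost outputs.

```python
import math

def combine_scores(static_score, dynamic_score, w=(2.0, 2.0), b=-1.8):
    """Stand-in for the ANN combiner: one logistic unit mapping the two
    XGBoost malicious scores (each in [0, 1]) to a final probability."""
    z = w[0] * static_score + w[1] * dynamic_score + b
    return 1.0 / (1.0 + math.exp(-z))

def is_malware(static_score, dynamic_score, threshold=0.5):
    return combine_scores(static_score, dynamic_score) >= threshold

verdict = is_malware(0.9, 0.8)    # both feature views look suspicious
benign = is_malware(0.1, 0.05)    # both feature views look clean
```

Fusing at the score level is what lets the model exploit interactions between the static (signature-like) and dynamic (behavioral) views that either predictor alone would miss.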
Nowadays, wood identification is performed by experts using hand lenses, wood atlases, and field manuals, which entail considerable cost and time for the training process. The quantity and species must be strictly set up, and accurate identification of the wood species must be made during exploitation to monitor trade and enforce regulations that stop illegal logging. With the development of science, wood identification should be supported by technology to enhance the perceived fairness of trade. An automatic wood identification system and a dataset of 50 commercial wood species from Asia are established; namely, wood anatomical images are collected and used to train the proposed model. In a convolutional neural network (CNN), the last layers are usually softmax functions with dense layers. These layers contain the most parameters, which affects the model's speed. To reduce the number of parameters in the last layers of the CNN model and enhance accuracy, the structure of the model should be optimized and developed. Therefore, a hybrid convolutional neural network and random forest model (CNN-RF model) is introduced for wood identification. The hybrid model's accuracy is more than 98%, and its processing speed is 3 times higher than that of the CNN model. The highest accuracy is 1.00 for some species, and the lowest is 0.92. These results show the excellent adaptability of the hybrid model to wood identification based on anatomical images. It also facilitates further investigations of wood cells and has implications for wood science.
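The head replacement can be pictured as follows: the CNN backbone produces a feature vector, and instead of dense+softmax layers, an ensemble of shallow trees votes on the species. The depth-1 "stumps", thresholds, and species names below are hypothetical toys; a real random forest grows deeper trees from data.

```python
def stump(feature_idx, threshold, left, right):
    """A depth-1 decision tree over a feature vector."""
    return lambda x: left if x[feature_idx] <= threshold else right

def forest_predict(features, stumps):
    """Stand-in for the RF head that replaces the CNN's dense+softmax
    layers: majority vote of shallow trees over backbone features."""
    votes = [s(features) for s in stumps]
    return max(set(votes), key=votes.count)

# Hypothetical 2-D feature vector from the CNN backbone.
trees = [stump(0, 0.5, "teak", "oak"),
         stump(1, 0.3, "oak", "teak"),
         stump(0, 0.8, "teak", "oak")]
species = forest_predict((0.9, 0.7), trees)
```

The speed claim in the abstract follows from this design: evaluating a forest of threshold tests is far cheaper than the large dense layers it replaces.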
In a competitive digital age where data volumes are increasing with time, the ability to extract meaningful knowledge from high-dimensional data using machine learning (ML) and data mining (DM) techniques, and to make decisions based on the extracted knowledge, is becoming increasingly important in all business domains. Nevertheless, high-dimensional data remains a major challenge for classification algorithms due to its high computational cost and storage requirements. The 2016 Demographic and Health Survey of Ethiopia (EDHS 2016), used as the publicly available data source for this study, contains several features that may not be relevant to the prediction task. In this paper, we developed a hybrid multidimensional metrics framework for predictive modeling, covering both model performance evaluation and feature selection, to overcome the feature selection challenges and select the best model among those available in DM and ML. The proposed hybrid metrics were used to measure the efficiency of the predictive models. Experimental results show that the decision tree algorithm is the most efficient model. The higher score of HMM (m, r) = 0.47 indicates an overall significant model that encompasses almost all of the user's requirements, unlike the classical metrics, which use a single criterion to select the most appropriate model. On the other hand, the ANNs were found to be the most computationally intensive for our prediction task. Moreover, the type of data and the class size of the dataset (unbalanced data) have a significant impact on the efficiency of the model, especially on the computational cost, and the interpretability of the model's parameters can be hampered. The efficiency of the predictive model could be further improved with other feature selection algorithms (especially hybrid metrics) that involve experts of the knowledge domain, as understanding of the business domain has a significant impact.
The majority of spatial data reveal some degree of spatial dependence. The term "spatial dependence" refers to the tendency for phenomena to be more similar when they occur close together than when they occur far apart in space. This property is ignored in machine learning (ML) for spatial domains of application, and most classical machine learning algorithms are generally inappropriate unless modified in some way to account for it. In this study, we proposed an approach that aimed to improve an ML model to detect the dependence without incorporating any spatial features in the learning process. To detect this dependence while also improving performance, a hybrid model (HM) was used based on two representative algorithms. In addition, cross-validation was used to make the model stable. Furthermore, global Moran's I and local Moran's I were used to capture the spatial dependence in the residuals. The results show that the HM performs significantly better, with an R2 of 99.91%, compared with the RBFNN and RF, which have R2 values of 74.22% and 82.26%, respectively. With lower errors, the HM was able to achieve an average test error of 0.033% and a positive global Moran's I of 0.12. We concluded that as the R2 value increases, the models become weaker in terms of capturing the dependence.
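The residual diagnostic used here, global Moran's I, can be computed directly from a set of residuals and a spatial weight matrix. A minimal sketch (the rook-adjacency weights over four locations on a line are a hypothetical example, not the study's data):

```python
def morans_i(values, weights):
    """Global Moran's I.

    values  : list of residuals, one per location
    weights : dict mapping (i, j) -> spatial weight w_ij
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]               # deviations from the mean
    w_sum = sum(weights.values())
    num = sum(w * z[i] * z[j] for (i, j), w in weights.items())
    den = sum(zi * zi for zi in z)
    return (n / w_sum) * (num / den)

# Four locations on a line, symmetric rook adjacency
vals = [1.0, 2.0, 3.0, 4.0]
w = {(i, j): 1.0 for i in range(4) for j in range(4) if abs(i - j) == 1}
print(morans_i(vals, w))  # positive: similar residuals cluster in space
```

A value near zero suggests the model has absorbed the spatial structure; a clearly positive value, as in the study's residuals, signals remaining spatial dependence.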
Funding: financially supported by the National Natural Science Foundation of China (Nos. 51974023 and 52374321) and by the State Key Laboratory of Advanced Metallurgy, University of Science and Technology Beijing, China (No. 41620007).
Abstract: The amount of oxygen blown into the converter is one of the key parameters for the control of the converter blowing process, and it directly affects the tap-to-tap time of the converter. In this study, a hybrid model based on an oxygen balance mechanism (OBM) and a deep neural network (DNN) was established for predicting the oxygen blowing time in the converter. A three-step method was utilized in the hybrid model. First, the oxygen consumption volume was predicted by the OBM model and the DNN model, respectively. Second, a more accurate oxygen consumption volume was obtained by integrating the OBM model and the DNN model. Finally, the converter oxygen blowing time was calculated from the oxygen consumption volume and the oxygen supply intensity of each heat. The proposed hybrid model was verified using actual data collected from an integrated steel plant in China and compared with a multiple linear regression model, the OBM model, and neural network models including an extreme learning machine, a back propagation neural network, and a DNN. The test results indicate that the hybrid model with a network structure of 3 hidden layers, 32-16-8 neurons per hidden layer, and a learning rate of 0.1 has the best prediction accuracy and stronger generalization ability compared with the other models. The predicted hit ratio of oxygen consumption volume within an error of ±300 m³ is 96.67%; the determination coefficient (R²) and root mean square error (RMSE) are 0.6984 and 150.03 m³, respectively. The oxygen blowing time prediction hit ratio within an error of ±0.6 min is 89.50%; R² and RMSE are 0.9486 and 0.3592 min, respectively. As a result, the proposed model can effectively predict the oxygen consumption volume and oxygen blowing time in the converter.
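The three-step calculation can be sketched as follows, assuming a simple weighted average as the integration step (the integration scheme, the alpha weight, and the volume and supply-intensity numbers are hypothetical illustrations, not the paper's values):

```python
def hybrid_oxygen_volume(v_obm, v_dnn, alpha=0.5):
    # Step 2: integrate the two step-1 predictions; alpha is a
    # hypothetical weight (the paper's integration scheme may differ).
    return alpha * v_obm + (1 - alpha) * v_dnn

def blowing_time(volume_m3, supply_m3_per_min):
    # Step 3: blowing time = oxygen consumption volume / supply intensity
    return volume_m3 / supply_m3_per_min

# Hypothetical heat: OBM predicts 9800 m³, DNN predicts 10000 m³,
# oxygen supply intensity 600 m³/min
v = hybrid_oxygen_volume(9800.0, 10000.0, alpha=0.6)
print(blowing_time(v, 600.0))  # predicted blowing time in minutes
```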
Funding: supported by the National Natural Science Foundation of China (62073303, 61673356), the Hubei Provincial Natural Science Foundation of China (2015CFA010), and the 111 Project (B17040).
Abstract: This article focuses on dynamic event-triggered mechanism (DETM)-based model predictive control (MPC) for T-S fuzzy systems. A hybrid dynamic-variables-dependent DETM is carefully devised, which includes a multiplicative dynamic variable and an additive dynamic variable. The addressed DETM-based fuzzy MPC issue is described as a "min-max" optimization problem (OP). To facilitate the co-design of the MPC controller and the weighting matrix of the DETM, an auxiliary OP is proposed based on a new Lyapunov function and a new robust positive invariant (RPI) set that contain the membership functions and the hybrid dynamic variables. A dynamic event-triggered fuzzy MPC algorithm is developed accordingly, whose recursive feasibility is analysed by employing the RPI set. With the designed controller, the involved fuzzy system is ensured to be asymptotically stable. Two examples show that the new DETM and the DETM-based MPC algorithm have the advantage of reducing resource consumption while yielding the anticipated performance.
Funding: supported in part by the National Key Research and Development Program of China (2022YFB3305300) and the National Natural Science Foundation of China (62173178).
Abstract: Ethylene glycol (EG) plays a pivotal role as a primary raw material in the polyester industry, and the syngas-to-EG route has become a significant technical route in production. The carbon monoxide (CO) gas-phase catalytic coupling to synthesize dimethyl oxalate (DMO) is a crucial process in the syngas-to-EG route, whereby the composition of the reactor outlet influences the ultimate quality of the EG product and the energy consumption of the subsequent separation process. However, measuring product quality in real time or establishing accurate dynamic mechanism models is challenging. To effectively model the DMO synthesis process, this study proposes a hybrid modeling strategy that integrates process mechanisms and data-driven approaches. The CO gas-phase catalytic coupling mechanism model is developed based on intrinsic kinetics and material balance, while a long short-term memory (LSTM) neural network is employed to predict the macroscopic reaction rate by leveraging temporal relationships derived from archived measurements. The proposed model is trained in a semi-supervised manner to accommodate limited-label data scenarios, leveraging historical data. By integrating these predictions with the mechanism model, the hybrid modeling approach provides reliable and interpretable forecasts of mass fractions. Empirical investigations unequivocally validate the superiority of the proposed hybrid modeling approach over conventional data-driven models (DDMs) and other hybrid modeling techniques.
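The mechanism-plus-data-driven pattern can be illustrated with a much smaller stand-in: a first-principles model whose systematic error is learned from plant data. The LSTM is replaced here by an ordinary least-squares residual correction purely for illustration; the mechanism function and the data are hypothetical, not the paper's kinetics:

```python
def fit_linear(xs, ys):
    # Ordinary least squares for y = a*x + b (closed form)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def mechanism(x):
    # Stand-in first-principles model (hypothetical)
    return 2.0 * x

# Plant measurements deviate from the mechanism; learn the residual
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.5, 4.5, 6.5, 8.5]                         # true behaviour: 2x + 0.5
res = [y - mechanism(x) for x, y in zip(xs, ys)]  # systematic offset
a, b = fit_linear(xs, res)
hybrid = lambda x: mechanism(x) + a * x + b       # mechanism + data-driven part
print(hybrid(5.0))
```

The hybrid keeps the mechanism model's interpretability while letting the data-driven part absorb what the mechanism misses, which is the same division of labour the paper assigns to the kinetic model and the LSTM.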
Acknowledgements: The authors thank the anonymous reviewers and journal editors for their assistance in enhancing the paper's logical organisation and content quality.
Abstract: Forecasting river flow is crucial for optimal planning, management, and sustainable use of freshwater resources. Many machine learning (ML) approaches have been enhanced to improve streamflow prediction. Hybrid techniques have been viewed as a viable method for enhancing the accuracy of univariate streamflow estimation compared with standalone approaches, and current researchers have also emphasised using hybrid models to improve forecast accuracy. Accordingly, this paper conducts an updated literature review of applications of hybrid models in estimating streamflow over the last five years, summarising data preprocessing, univariate machine learning modelling strategies, advantages and disadvantages of standalone ML techniques, hybrid models, and performance metrics. This study focuses on two types of hybrid models: parameter-optimisation-based hybrid models (OBH) and the hybridisation of parameter-optimisation-based and preprocessing-based hybrid models (HOPH). Overall, this research supports the idea that meta-heuristic approaches precisely improve ML techniques. It is also one of the first efforts to comprehensively examine the efficiency of various meta-heuristic approaches (classified into four primary classes) hybridised with ML techniques. This study revealed that previous research applied swarm, evolutionary, physics, and hybrid metaheuristics in 77%, 61%, 12%, and 12% of cases, respectively. Finally, there is still room for improving OBH and HOPH models by examining different data preprocessing techniques and metaheuristic algorithms.
Abstract: The Indian Himalayan region frequently experiences climate change-induced landslides. Thus, landslide susceptibility assessment assumes greater significance for lessening the impact of a landslide hazard. This paper attempts to assess landslide susceptibility in the Shimla district of the northwest Indian Himalayan region. It examined the effectiveness of random forest (RF), multilayer perceptron (MLP), sequential minimal optimization regression (SMOreg) and bagging ensemble (B-RF, B-SMOreg, B-MLP) models. A landslide inventory map comprising 1052 locations of past landslide occurrences was classified into training (70%) and testing (30%) datasets. The site-specific influencing factors were selected by employing a multicollinearity test. The relationship between past landslide occurrences and influencing factors was established using the frequency ratio method. The effectiveness of the machine learning models was verified through performance assessors. The landslide susceptibility maps were validated by the area under the receiver operating characteristic curve (ROC-AUC), accuracy, precision, recall and F1-score. The key performance metrics and map validation demonstrated that the B-RF model (correlation coefficient: 0.988, mean absolute error: 0.010, root mean square error: 0.058, relative absolute error: 2.964, ROC-AUC: 0.947, accuracy: 0.778, precision: 0.819, recall: 0.917 and F1-score: 0.865) outperformed the single classifiers and the other bagging ensemble models for landslide susceptibility. The results show that the largest area was found under the very high susceptibility zone (33.87%), followed by the low (27.30%), high (20.68%) and moderate (18.16%) susceptibility zones. The factors, namely average annual rainfall, slope, lithology, soil texture and earthquake magnitude, have been identified as the influencing factors for very high landslide susceptibility. Soil texture, lineament density and elevation have been attributed to high and moderate susceptibility. Thus, the study calls for devising suitable landslide mitigation measures in the study area. Structural measures, an immediate response system, community participation and coordination among stakeholders may help lessen the detrimental impact of landslides. The findings from this study could aid decision-makers in mitigating future catastrophes and devising suitable strategies in other geographical regions with similar geological characteristics.
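The frequency ratio method used above relates the share of landslides falling in a factor class to the share of total area occupied by that class. A minimal sketch with hypothetical counts (only the 1052-landslide inventory size is taken from the abstract):

```python
def frequency_ratio(landslides_in_class, total_landslides,
                    area_of_class, total_area):
    """FR > 1 means landslides are over-represented in the class,
    i.e. the class is positively associated with landsliding."""
    landslide_share = landslides_in_class / total_landslides
    area_share = area_of_class / total_area
    return landslide_share / area_share

# Hypothetical slope class: 300 of 1052 landslides on 15% of the area
print(frequency_ratio(300, 1052, 1500.0, 10000.0))  # clearly above 1
```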
Abstract: Purpose: This paper provides a comprehensive overview of the different forms of radiological imaging, the diagnostic capabilities they offer, and recent advances in the field. Materials and Methods: The paper reviews conventional radiography, digital radiography, panoramic radiography, computed tomography, and cone-beam computed tomography. Additionally, recent advances in radiological imaging are discussed, such as imaging diagnosis and modern computer-aided diagnosis systems. Results: The paper details the differences between the imaging techniques, the benefits of each, and the current advances in the field that aid in the diagnosis of medical conditions. Conclusion: Radiological imaging is an extremely important tool in modern medicine for assisting in medical diagnosis. This work provides an overview of the imaging techniques used, the recent advances made, and their potential applications.
Funding: supported by the National Natural Science Foundation of China (NSFC, Grant No. 42275061), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB40000000), the Laoshan Laboratory (Grant No. LSKJ202202404), the NSFC (Grant No. 42030410), and the Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology.
Abstract: A previously developed hybrid coupled model (HCM) is composed of an intermediate tropical Pacific Ocean model and a global atmospheric general circulation model (AGCM), denoted as HCMAGCM. In this study, different El Niño flavors, namely the Eastern-Pacific (EP) and Central-Pacific (CP) types, and the associated global atmospheric teleconnections are examined in a 1000-yr control simulation of the HCMAGCM. The HCMAGCM indicates profoundly different characteristics among EP and CP El Niño events in terms of related oceanic and atmospheric variables in the tropical Pacific, including the amplitude and spatial patterns of sea surface temperature (SST), zonal wind stress, and precipitation anomalies. An SST budget analysis indicates that the thermocline feedback and the zonal advective feedback dominantly contribute to the growth of EP and CP El Niño events, respectively. Corresponding to the shifts in tropical rainfall and deep convection during EP and CP El Niño events, the model also reproduces the differences in the extratropical atmospheric responses during the boreal winter. In particular, the EP El Niño tends to be dominant in exciting a poleward wave train pattern to the Northern Hemisphere, while the CP El Niño tends to preferably produce a wave train similar to the Pacific North American (PNA) pattern. As a result, different climatic impacts exist in North American regions, with a warm-north and cold-south pattern during an EP El Niño and a warm-northeast and cold-southwest pattern during a CP El Niño, respectively. This modeling result highlights the importance of internal natural processes within the tropical Pacific as they relate to the genesis of ENSO diversity, because active ocean–atmosphere coupling is allowed only in the tropical Pacific within the framework of the HCMAGCM.
Abstract: Effort estimation plays a crucial role in software development projects, aiding in resource allocation, project planning, and risk management. Traditional estimation techniques often struggle to provide accurate estimates due to the complex nature of software projects. In recent years, machine learning approaches have shown promise in improving the accuracy of effort estimation models. This study proposes a hybrid model that combines Long Short-Term Memory (LSTM) and Random Forest (RF) algorithms to enhance software effort estimation. The proposed hybrid model takes advantage of the strengths of both algorithms. To evaluate its performance, an extensive set of software development projects is used as the experimental dataset. The experimental results demonstrate that the proposed hybrid model outperforms traditional estimation techniques in terms of accuracy and reliability. The integration of LSTM and RF enables the model to efficiently capture temporal dependencies and non-linear interactions in the software development data. The hybrid model enhances estimation accuracy, enabling project managers and stakeholders to make more precise predictions of the effort needed for upcoming software projects.
Funding: supported by the Fundamental Research Funds for the Central Universities (No. 3122020072) and the Multi-investment Project of Tianjin Applied Basic Research (No. 23JCQNJC00250).
Abstract: A hybrid identification model based on multilayer artificial neural networks (ANNs) and the particle swarm optimization (PSO) algorithm is developed to improve the efficiency of simultaneously identifying the thermal conductivity and effective absorption coefficient of semitransparent materials. For the direct model, the spherical harmonic method and the finite volume method are used to solve the coupled conduction-radiation heat transfer problem in an absorbing, emitting, and non-scattering 2D axisymmetric gray medium in the context of the laser flash method. For the identification part, the temperature field and the incident radiation field at different positions are first chosen as observables. Then, a traditional identification model based on the PSO algorithm is established. Finally, multilayer ANNs are built to fit and replace the direct model in the traditional identification model to speed up the identification process. The results show that, compared with the traditional identification model, the time cost of the hybrid identification model is reduced by about 1000 times. Moreover, the hybrid identification model maintains a high level of accuracy even with measurement errors.
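The PSO component of such an identification loop can be illustrated with a minimal one-dimensional optimiser; the objective below is a toy squared misfit standing in for the conduction-radiation direct model, and all parameters are hypothetical:

```python
import random

def pso(f, lo, hi, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal 1-D particle swarm optimiser (illustrative only, not the
    paper's ANN-accelerated identification code)."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]   # positions
    vs = [0.0] * n_particles                                 # velocities
    pbest = xs[:]                                            # personal bests
    pval = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pval[i])
    gbest, gval = pbest[g], pval[g]                          # global best
    for _ in range(iters):
        for i in range(n_particles):
            vs[i] = (w * vs[i]
                     + c1 * rng.random() * (pbest[i] - xs[i])
                     + c2 * rng.random() * (gbest - xs[i]))
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))          # stay in bounds
            fx = f(xs[i])
            if fx < pval[i]:
                pbest[i], pval[i] = xs[i], fx
                if fx < gval:
                    gbest, gval = xs[i], fx
    return gbest, gval

# "Identify" a single material parameter k by minimising a squared misfit
best_k, err = pso(lambda k: (k - 3.2) ** 2, 0.0, 10.0)
print(best_k, err)
```

In the paper's setting, each call to `f` would be a forward solve of the direct model; the ANN surrogate speeds things up by making that call cheap.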
Abstract: In response to escalating challenges in energy conservation and emission reduction, this study delves into the complexities of heat transfer in two-phase flows and adjustments to combustion processes within coal-fired boilers. Utilizing a fusion of hybrid modeling and automation technologies, we develop soft measurement models for key combustion parameters, such as the net calorific value of coal, flue gas oxygen content, and fly ash carbon content, within the Distributed Control System (DCS). Validated with performance test data, these models exhibit controlled root mean square error (RMSE) and maximum absolute error (MAXE) values, both within the range of 0.203. Integrated into their respective automatic control systems, these models optimize two-phase flow heat transfer, fine-tune combustion conditions, and mitigate incomplete combustion. Furthermore, this paper conducts an in-depth exploration of the generation mechanism of nitrogen oxides (NOx) and low-oxygen emission reduction technology in coal-fired boilers, demonstrating a substantial reduction in furnace exit NOx generation by 30% to 40% and a decrease in power supply coal consumption of 1.62 g/(kW·h). The research outcomes highlight the model's rapid responsiveness, enabling prompt reflection of transient variations in various economic indicator parameters. This provides a more effective means for real-time monitoring of crucial variables in coal-fired boilers and facilitates timely combustion adjustments, underscoring notable achievements in boiler combustion. The research not only provides valuable and practical insights into the intricacies of two-phase flow heat transfer and heat exchange but also establishes a pioneering methodology for tackling industry challenges.
Abstract: Spatial heterogeneity refers to the variation or differences in characteristics or features across different locations or areas in space. Spatial data refers to information that explicitly or indirectly belongs to a particular geographic region or location, also known as geo-spatial data or geographic information. Focusing on spatial heterogeneity, we present a hybrid machine learning model combining two competitive algorithms: the Random Forest regressor and a CNN. The model is fine-tuned using cross-validation for hyper-parameter adjustment and performance evaluation, ensuring robustness and generalization. Our approach integrates global Moran's I for examining global autocorrelation and local Moran's I for assessing local spatial autocorrelation in the residuals. To validate our approach, we implemented the hybrid model on a real-world dataset and compared its performance with that of traditional machine learning models. Results indicate superior performance, with an R-squared of 0.90, outperforming the RF (0.84) and the CNN (0.74). This study contributes to a detailed understanding of spatial variations in data, considering the geographical information (longitude and latitude) present in the dataset. Our results, also assessed using the root mean squared error (RMSE), indicated that the hybrid model yielded lower errors, showing a deviation of 53.65% from the RF model and 63.24% from the CNN model. Additionally, the global Moran's I index was observed to be 0.10. This study underscores that the hybrid model was able to correctly predict house prices both in clusters and in dispersed areas.
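The cross-validation step used for tuning and evaluation can be sketched as a plain index splitter; in practice a library routine such as scikit-learn's KFold would normally be used, but the logic is just this:

```python
def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation,
    distributing any remainder across the first folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(kfold_indices(10, 5))
print(len(splits), len(splits[0][1]))  # 5 folds, each holding out 2 points
```

Each fold trains the hybrid on the remaining indices and scores it on the held-out ones; averaging the fold scores gives the stable estimate the abstract refers to.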
Abstract: The paper presents our contribution to the full 3D finite element modelling of a hybrid stepping motor using COMSOL Multiphysics software. This type of four-phase motor has a permanent magnet interposed between two identical and coaxial half-stators. The calculation of the field with or without current in the windings (respectively, with or without the permanent magnet) is done using a mixed formulation with strong coupling. In addition, the local high saturation of the ferromagnetic material and the radial and axial components of the magnetic flux are taken into account. The results obtained make it possible to clearly observe, as a function of the intensity of the bus current or the remanent induction, the saturation zones, the lines, the orientations, and the magnetic flux densities. 3D finite element modelling provides more accurate numerical data on the magnetic field through multiphysics analysis. This analysis considers the actual operating conditions and leads to the design of an optimized machine structure, with or without current in the windings and/or the permanent magnet.
Funding: supported by the No. 4 National Project in 2022 of the Ministry of Emergency Response (2022YJBG04) and the International Clean Energy Talent Program (201904100014).
Abstract: Photovoltaic (PV) power generation is characterized by randomness and intermittency due to weather changes. Consequently, large-scale PV power connections to the grid can threaten the stable operation of the power system. An effective way to resolve this problem is to accurately predict PV power. In this study, an innovative short-term hybrid prediction model of PV power (HKSL) is established. The model combines K-means++, an optimal similar-day approach, and a long short-term memory (LSTM) network, utilizing historical power data and meteorological factors. The model searches for the best similar day based on the results of classifying weather types; the data of the similar day are then input into the LSTM network to predict PV power. The validity of the hybrid model is verified on datasets from a PV power station in Shandong Province, China. Four evaluation indices, mean absolute error, root mean square error (RMSE), normalized RMSE, and mean absolute deviation, are employed to assess the performance of the HKSL model. The RMSE of the proposed model, compared with those of Elman, LSTM, HSE (a hybrid model combining the similar-day approach and Elman), HSL (a hybrid model combining the similar-day approach and LSTM), and HKSE (a hybrid model combining K-means++, the similar-day approach, and Elman), decreases by 66.73%, 70.22%, 65.59%, 70.51%, and 18.40%, respectively. This proves the reliability and excellent performance of the proposed hybrid model in predicting power.
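The optimal similar-day step can be sketched as a nearest-neighbour search over weather feature vectors. The features (temperature, humidity, irradiance) and dates below are hypothetical, and the paper additionally restricts the search to days of the same weather type:

```python
import math

def most_similar_day(history, today):
    """Return the historical day whose feature vector is closest
    (Euclidean distance) to today's features."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(history, key=lambda day: dist(day["features"], today))

history = [
    {"date": "2023-06-01", "features": [28.0, 0.45, 620.0]},  # temp, humidity, irradiance
    {"date": "2023-06-14", "features": [31.0, 0.30, 810.0]},
    {"date": "2023-07-02", "features": [25.0, 0.80, 310.0]},
]
print(most_similar_day(history, [30.5, 0.33, 790.0])["date"])
```

The selected day's power curve then serves as an input to the LSTM alongside the forecast meteorology. In practice the features should be normalized first so that irradiance does not dominate the distance.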
Funding: supported by the State Grid Science and Technology Project (No. 52999821N004).
Abstract: This study proposes a combined hybrid energy storage system (HESS) and transmission grid (TG) model, and a corresponding time series operation simulation (TSOS) model is established to relieve the peak-shaving pressure of power systems under the integration of renewable energy. First, a linear model for the optimal operation of the HESS is established, which considers the different power-efficiency characteristics of the pumped storage system, the electrochemical storage system, and a new type of liquid compressed air energy storage. Second, a TSOS simulation model for peak shaving is built to maximize the power entering the grid from the wind farms and the HESS. Based on the proposed model, this study considers the transmission capacity of a TG: by adding the power-flow constraints of the TG, a TSOS-based HESS and TG combination model for peak shaving is established. Finally, the improved IEEE-39 and IEEE-118 bus systems were considered as examples to verify the effectiveness and feasibility of the proposed model.
Funding: supported by the Natural Science Foundation of China (Grant Nos. 42088101 and 42205149). Zhongwang WEI was supported by the Natural Science Foundation of China (Grant No. 42075158), Wei SHANGGUAN by the Natural Science Foundation of China (Grant No. 41975122), and Yonggen ZHANG by the National Natural Science Foundation of Tianjin (Grant No. 20JCQNJC01660).
Abstract: Accurate soil moisture (SM) prediction is critical for understanding hydrological processes. Physics-based (PB) models exhibit large uncertainties in SM predictions arising from uncertain parameterizations and insufficient representation of land-surface processes. In addition to PB models, deep learning (DL) models have recently been widely used in SM prediction. However, few pure DL models have notably high success rates because they lack physical information. Thus, we developed hybrid models that effectively integrate the outputs of PB models into DL models to improve SM predictions. To this end, we first developed a hybrid model based on the attention mechanism to take advantage of PB models at each forecast time scale (the attention model). We further built an ensemble model that combines the advantages of different hybrid schemes (the ensemble model). We utilized SM forecasts from the Global Forecast System to enhance the convolutional long short-term memory (ConvLSTM) model for 1–16-day SM predictions. The performance of the proposed hybrid models was investigated and compared with two existing hybrid models. The results showed that the attention model could leverage the benefits of PB models and achieved the best predictability of drought events among the different hybrid models. Moreover, the ensemble model performed best among all hybrid models at all forecast time scales and under different soil conditions. Notably, the ensemble model outperformed the pure DL model over 79.5% of in situ stations for 16-day predictions. These findings suggest that our proposed hybrid models can adequately exploit the benefits of PB model outputs to aid DL models in making SM predictions.
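The attention-based combination of PB outputs can be sketched as a soft-max weighting of candidate forecasts. In the paper the scores are learned jointly with the ConvLSTM; here they are fixed hypothetical numbers purely to show the mechanics:

```python
import math

def attention_weights(scores):
    """Soft-max of relevance scores -> non-negative weights summing to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attended_forecast(pb_forecasts, scores):
    # Weighted combination of physics-based SM forecasts; the scores
    # would be produced by the learned attention layer in the paper.
    w = attention_weights(scores)
    return sum(wi * f for wi, f in zip(w, pb_forecasts))

# Three hypothetical PB soil-moisture forecasts (m³/m³) and their scores
print(attended_forecast([0.21, 0.25, 0.23], [2.0, 0.5, 1.0]))
```

Because the weights are a convex combination, the attended forecast always stays within the range of the PB candidates, which is one reason such schemes are robust across forecast time scales.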
Funding: King Abdulaziz University, Deanship of Scientific Research, Grant Number KEP-PHD-20-611-42.
Abstract: Recently, researchers have shown increasing interest in combining more than one programming model in systems running on high performance computing (HPC) systems to achieve exascale performance by applying parallelism at multiple levels. Combining different programming paradigms, such as the Message Passing Interface (MPI), Open Multi-Processing (OpenMP), and Open Accelerators (OpenACC), can increase computation speed and improve performance. During the integration of multiple models, the probability of runtime errors increases, making their detection difficult, especially in the absence of testing techniques that can detect these errors. Numerous studies have been conducted to identify such errors, but no technique exists for detecting errors in three-level programming models. Despite the increasing research on integrating the three programming models MPI, OpenMP, and OpenACC, no testing technology has been developed to detect the runtime errors, such as deadlocks and race conditions, that can arise from this integration. Therefore, this paper begins with a definition and explanation of the runtime errors resulting from integrating the three programming models that compilers cannot detect. For the first time, this paper presents a classification of the operational errors that can result from the integration of the three models. This paper also proposes a parallel hybrid testing technique for detecting runtime errors in systems built in the C++ programming language that use the triple programming models MPI, OpenMP, and OpenACC. This hybrid technology combines static and dynamic techniques, given that some errors can be detected statically, whereas others can only be detected dynamically. The hybrid technique can detect more errors because it combines two distinct technologies. The proposed static technique detects a wide range of error types in less time, whereas the portion of potential errors that may or may not occur depending on the operating environment is left to the dynamic technique, which completes the validation.
Funding: supported by the Deanship of Scientific Research at Northern Border University through Research Group No. RG-NBU-2022-1724.
Abstract: In an era marked by escalating cybersecurity threats, our study addresses the challenge of malware variant detection, a significant concern for a multitude of sectors including petroleum and mining organizations. This paper presents an innovative Application Programming Interface (API)-based hybrid model designed to enhance the detection performance for malware variants. The model integrates eXtreme Gradient Boosting (XGBoost) and an Artificial Neural Network (ANN) classifier, offering a potent response to the sophisticated evasion and obfuscation techniques frequently deployed by malware authors. The model's design capitalizes on the benefits of both static and dynamic analysis to extract API-based features, providing a holistic and comprehensive view of malware behavior. From these features, we construct two XGBoost predictors, each of which contributes a valuable perspective on the malicious activities under scrutiny. The outputs of these predictors, interpreted as malicious scores, are then fed into an ANN-based classifier, which processes this data to derive a final decision. The strength of the proposed model lies in its capacity to leverage behavioral and signature-based features and, most importantly, in its ability to extract and analyze the hidden relations between these two types of features. The efficacy of our proposed API-based hybrid model is evident in its performance metrics. It outperformed other models in our tests, achieving an impressive accuracy of 95% and an F-measure of 93%. This significantly improved the detection performance for malware variants, underscoring the value and potential of our approach in the challenging field of cybersecurity.
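The stacking idea, two predictors' malicious scores fed to a final classifier, can be sketched with a fixed logistic combiner standing in for the trained ANN. The weights, bias, and scores below are hypothetical, not taken from the paper:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def stacked_decision(static_score, dynamic_score,
                     w_static=2.0, w_dynamic=2.0, bias=-2.0):
    """Combine the two predictors' malicious scores into one verdict.
    The fixed weights stand in for the trained ANN combiner and are
    hypothetical."""
    p = sigmoid(w_static * static_score + w_dynamic * dynamic_score + bias)
    return p, p >= 0.5   # (malicious probability, final verdict)

print(stacked_decision(0.9, 0.8))   # both predictors suspicious -> malicious
print(stacked_decision(0.1, 0.05))  # both predictors benign -> clean
```

A real ANN combiner would also learn interactions between the two scores (e.g. flagging samples where the static and dynamic views disagree), which a fixed linear combination cannot capture.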