The complexity of the sand-casting process, combined with the interactions between process parameters, makes casting quality difficult to control and results in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency. It comprises a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. Collected data covering four types of defects and the corresponding process parameters were used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of the various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. When applied in the factory, the prediction model greatly improved the efficiency of defect detection. Results show that the scrap rate decreased from 10.16% to 6.68%.
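The two ideas in the abstract above, Gini-based importance and Monte Carlo search with a classifier standing in for the physical experiment, can be sketched roughly as follows. This is a minimal illustration only: `predict_defect_probability`, the temperature range, and every number are hypothetical placeholders, not the authors' model or data.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting on values <= threshold.
    Summed over the splits of a forest, this ranks feature importance."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

def predict_defect_probability(pour_temp):
    """Hypothetical stand-in for a trained classifier (assumption):
    predicted defect risk is lowest near 1400 degrees."""
    return min(1.0, abs(pour_temp - 1400.0) / 100.0)

def monte_carlo_search(n_samples, low, high, seed=0):
    """Sample candidate temperatures and keep the lowest predicted risk."""
    rng = random.Random(seed)
    best_temp, best_risk = None, float("inf")
    for _ in range(n_samples):
        temp = rng.uniform(low, high)
        risk = predict_defect_probability(temp)
        if risk < best_risk:
            best_temp, best_risk = temp, risk
    return best_temp, best_risk
```

In a real setting the stand-in function would be replaced by the fitted model's predicted probability, and the sampled vector would cover all process parameters rather than a single temperature.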
In the era of advanced machine learning techniques, the development of accurate predictive models for complex medical conditions, such as thyroid cancer, has shown remarkable progress. Accurate predictive models for thyroid cancer enhance early detection, improve resource allocation, and reduce overtreatment. However, the widespread adoption of these models in clinical practice demands interpretability and transparency along with predictive performance. This paper proposes a novel association-rule based feature-integrated machine learning model which shows better classification and prediction accuracy than present state-of-the-art models. Our study also focuses on the application of SHapley Additive exPlanations (SHAP) values as a powerful tool for explaining thyroid cancer prediction models. In the proposed method, the association-rule based feature integration framework identifies frequently occurring attribute combinations in the dataset. The original dataset is used to train machine learning models, and SHAP values are then generated from these models. In the next phase, the dataset is integrated with the dominant feature sets identified through association-rule based analysis. This new integrated dataset is used to re-train the machine learning models. The new SHAP values generated from these models help validate the contributions of feature sets in predicting malignancy. Conventional machine learning models lack interpretability, which can hinder their integration into clinical decision-making systems. In this study, SHAP values are introduced along with association-rule based feature integration as a comprehensive framework for understanding the contributions of feature sets in modelling the predictions. The study discusses the importance of reliable predictive models for early diagnosis of thyroid cancer, and a validation framework for explainability. The proposed model shows an accuracy of 93.48%. Performance metrics such as precision, recall, F1-score, and the area under the receiver operating characteristic (AUROC) are also higher than those of the baseline models. The results of the proposed model help identify the dominant feature sets that impact thyroid cancer classification and prediction. The features {calcification} and {shape} consistently emerged as the top-ranked features associated with thyroid malignancy, in both the association-rule based interestingness metric values and the SHAP methods. The paper highlights the potential of rule-based integrated models with SHAP in bridging the gap between machine learning predictions and the interpretability of these predictions that is required for real-world medical applications.
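To make the notion of a SHAP value more concrete, the sketch below computes exact Shapley values for a tiny toy model by averaging each feature's marginal contribution over all feature orderings. The value function `toy_value`, its scores, and the extra feature name `margin` are hypothetical and chosen only for illustration; production SHAP tooling uses approximations rather than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution of each
    feature over every ordering of the feature list."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: t / len(orderings) for f, t in totals.items()}

def toy_value(coalition):
    """Hypothetical model output when only `coalition` is known."""
    score = 0.0
    if "calcification" in coalition:
        score += 0.4
    if "shape" in coalition:
        score += 0.3
    if "calcification" in coalition and "shape" in coalition:
        score += 0.2  # interaction term, split equally between the two
    return score
```

The attributions sum to the model output for the full feature set, which is the property that lets such values be read as per-feature contributions to a prediction.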
Based on groundwater level monitoring data from the Shuping landslide in the Three Gorges Reservoir area, and on the response relationship between influential factors such as rainfall and reservoir level and the change in groundwater level, the influential factors of groundwater level were selected. A classification and regression tree (CART) model was then constructed from this subset of factors and used to predict the groundwater level. In verification, the predicted results for the test sample were consistent with the measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For comparison, a support vector machine (SVM) model constructed using the same set of factors gave a mean absolute error of 1.53 m and a relative error of 6.11%. This indicates that the CART model not only has better fitting and generalization ability, but also has strong advantages in analyzing the dynamic characteristics of landslide groundwater and in screening important variables. It is an effective method for predicting groundwater level in landslides.
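The two error measures quoted in this abstract can be computed as in the sketch below. The abstract does not define its relative error precisely, so the mean of per-sample relative errors in percent is an assumption here, and the sample water levels in the usage note are invented.

```python
def mean_absolute_error(measured, predicted):
    """Average of |measured - predicted|, in the unit of the data."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def mean_relative_error(measured, predicted):
    """Average of |measured - predicted| / |measured|, in percent.
    Assumes no measured value is zero."""
    ratios = [abs(m - p) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

For example, with hypothetical measured levels `[150.0, 160.0]` and predictions `[150.3, 159.6]`, the first function returns about 0.35 (metres) and the second about 0.225 (percent).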
Reversible data hiding in encrypted images (RDH-EI) technology is widely used in cloud storage for image privacy protection. To improve the embedding capacity of the RDH-EI algorithm and the security of the encrypted images, we propose a reversible data hiding algorithm for encrypted images based on prediction and adaptive classification scrambling. First, the prediction error image is obtained by a novel prediction method before encryption. Then, the image pixel values are divided into two categories by a threshold range, which is selected adaptively according to the image content. Multiple high-significance bits of pixels within the threshold range are used for embedding data, and pixel values outside the threshold range remain unchanged. The adaptively selected optimal threshold ensures the maximum embedding capacity of the algorithm. Moreover, the security of the encrypted images is improved by combining XOR encryption with classification scrambling encryption, since the embedded data is independent of the pixel position. Experimental results demonstrate that the proposed method has a higher embedding capacity than current state-of-the-art methods for images with different texture complexity.
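Two building blocks named in this abstract, XOR encryption and embedding in the most significant bits, can be illustrated as follows. This is a simplified sketch and not the proposed algorithm: the prediction step that allows the original high bits to be recovered, the adaptive threshold, and the scrambling are all omitted, and Python's `random` module is used as a keystream only for illustration (it is not cryptographically secure).

```python
import random

def keystream(seed, n):
    """Pseudo-random bytes derived from a seed (illustration only)."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(n)]

def xor_pixels(pixels, seed):
    """XOR each 8-bit pixel with the keystream; applying it twice
    with the same seed restores the original pixels."""
    return [p ^ k for p, k in zip(pixels, keystream(seed, len(pixels)))]

def embed_msb(pixel, bits, k):
    """Replace the k most significant bits of an 8-bit pixel with
    the payload `bits`, an integer in [0, 2**k)."""
    low_mask = (1 << (8 - k)) - 1
    return (bits << (8 - k)) | (pixel & low_mask)

def extract_msb(pixel, k):
    """Read back the k most significant bits of an 8-bit pixel."""
    return pixel >> (8 - k)
```

Because XOR is its own inverse, decryption reuses `xor_pixels` with the same seed; reversibility of the embedding itself depends on the prediction scheme that this sketch leaves out.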
Many business applications rely on their historical data to predict their business future. Marketing products is one of the core processes of a business. Customer needs provide useful information that helps to market the appropriate products at the appropriate time. Moreover, services have recently come to be considered products. The development of education and health services depends on historical data. Furthermore, reducing problems and crimes on online social media networks requires a significant source of information. Data analysts need an efficient classification algorithm to predict the future of such businesses. However, dealing with a huge quantity of data requires a great deal of processing time. Data mining involves many useful techniques that are used to predict statistical data in a variety of business applications. Classification is one of the most widely used techniques, with a variety of algorithms. In this paper, various classification algorithms are reviewed in terms of accuracy in different areas of data mining applications. A comprehensive analysis is made after a close reading of 20 papers in the literature. This paper aims to help data analysts choose the most suitable classification algorithm for different business applications, including business in general, online social media networks, agriculture, health, and education. Results show that FFBPN is the most accurate algorithm in the business domain. The Random Forest algorithm is the most accurate in classifying online social network (OSN) activities. The Naïve Bayes algorithm is the most accurate for classifying agriculture datasets. OneR is the most accurate algorithm for classifying instances within the health domain. The C4.5 Decision Tree algorithm is the most accurate for classifying students' records to predict degree completion time.
The Farmers Property Mortgage Policy is a strategic financial policy in western China, a relatively underdeveloped region. Many contradictions and conflicts exist between the strong demand for loans by farmers and the strict risk control of the financial institutions. Rural finance corporations should use scientific analysis and investigation of potential households for an overall evaluation of their customers, including historical credit rating, present family situation, and other related information. In this paper, three different data mining methods were applied to specifically collected household data. The objective was to study which factors are most important in determining loan demand for households and, at the same time, to classify and predict the likelihood of loan demand for potential customers. The three methods produced similar outputs: income level, land area, the way of loan, and the understanding of the policy were the four main factors that decided the probability of a specific farmer applying for a credit loan. The results also revealed differences among the three methods in classifying and predicting the loan anticipation for the testing households. The artificial neural network model had the highest accuracy, 91.4, which is better than the other two methods.
This research concentrates on modelling an efficient thyroid prediction approach, which is considered a baseline for significant problems faced by the women community. The major research problem is the lack of an automated model to attain earlier prediction. Some existing models fail to give good prediction accuracy. Here, a novel clinical decision support system is framed to make the proper decision during a time of complexity. The proposed framework follows multiple stages, which play a substantial role in thyroid prediction. These steps include i) data acquisition, ii) outlier prediction, and iii) a multi-stage weight-based ensemble learning process (MS-WEL). The weighted analysis of the base classifier and other classifier models helps bridge the gap encountered in a single classifier model. Various classifiers are merged to handle the issues identified in others, with the intention of enhancing the prediction rate. The proposed model provides superior outcomes and a good-quality prediction rate. The simulation is done in the MATLAB 2020a environment and establishes a better trade-off than various existing approaches. The model gives a prediction accuracy of 97.28%, which is higher than that of the other models.
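The general idea of combining classifiers by weight can be sketched as below. This is plain weighted voting under assumed weights, not the MS-WEL procedure itself, whose stages and weighting scheme are not described in the abstract; the labels and weights in the usage note are hypothetical.

```python
def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.
    predictions: one class label per classifier.
    weights: one non-negative weight per classifier, same order."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)
```

For instance, `weighted_vote(["hyper", "normal", "normal"], [0.9, 0.3, 0.4])` returns `"hyper"`, because one strongly weighted classifier outweighs two weaker ones; weights are often set from each classifier's validation accuracy.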
Financial crisis prediction (FCP) has received significant attention in the financial sector for decision-making. Proper forecasting of the number of firms likely to fail is important for determining the growth index and strength of a nation's economy. Numerous approaches have been developed for the design of accurate FCP processes. However, classifier efficacy and predictive accuracy remain inadequate for real-time applications. In addition, several established techniques perform well on specific datasets but are not adaptable to different datasets. Thus, there is a need for an effective prediction technique that delivers optimum classifier performance and adapts to various datasets. This paper presents a novel multi-verse optimization (MVO) based feature selection (FS) with an optimal variational auto encoder (OVAE) model for FCP. The proposed multi-verse optimization based feature selection with optimal variational auto encoder (MVOFS-OVAE) model mainly aims to forecast financial crises. To achieve this, the MVOFS-OVAE model first pre-processes the financial data using min-max normalization. In addition, the MVOFS-OVAE model designs a feature subset selection process using the MVOFS approach. Next, the variational auto encoder (VAE) model is applied to categorize the financial data as financial crisis or non-financial crisis. Finally, the differential evolution (DE) algorithm is utilized for parameter tuning of the VAE model. A series of simulations on a benchmark dataset showed that the MVOFS-OVAE approach outperforms recent state-of-the-art approaches.
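The min-max normalization used as the pre-processing step rescales each feature to the range [0, 1]. A minimal sketch follows; mapping a constant column to zeros is a design choice assumed here, not something stated in the abstract.

```python
def min_max_normalize(column):
    """Rescale values to [0, 1] using (x - min) / (max - min).
    A constant column is mapped to all zeros to avoid division by zero."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]
```

When the normalization is part of a train/test workflow, the minimum and maximum should be taken from the training data and reused on the test data, so that no information leaks from the test set.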
This research aims to develop a methodology for applying the geostatistical method to generate a groutability classification for granular soils. To ensure the precision of the suggested technique, a total of 103 data samples were used. Predicting the groutability of granular soils has always been difficult because many soil characteristics are involved. As a result, a new two-dimensional graph, the groutability classification of granular soil (GCS) chart, was developed. The GCS chart was established from data analysis of the grain size of the soil and cement-based grouts (N1 and N2), the relative density (Dr) and fines content (FC) of the soil, the water/cement ratio of the grout mixture (w/c), and the grouting pressure (P), all of which have a direct impact on the groutability of soil media. The geostatistical method was used to develop and compile the GCS graph based on these parameters, using the coefficient S, a scoring coefficient for the set of parameters P, w/c, Dr, and FC. The validation process was carried out hierarchically, with an additional set of 30 data samples. The proposed method has a prediction accuracy of roughly 96.7%, which shows that it is a helpful tool. The proposed approach can be easily implemented in practical engineering situations because it has a comparable syntax to commonly used formulae. It should be noted that the proposed formula was only tested using the data samples collected, and the applicability of the procedure to other situations requires further examination.
The image emotion classification task aims to use a model to automatically predict the emotional response of people when they see an image. Studies have shown that certain local regions are more likely to inspire an emotional response than the whole image. However, existing methods perform poorly in predicting the details of emotional regions and are prone to overfitting during training due to the small size of the datasets. Therefore, this study proposes an image emotion classification network based on multilayer attentional interaction and adaptive feature aggregation. To predict emotional regions more accurately, this study designs a multilayer attentional interaction module. The module calculates spatial attention maps for higher-layer semantic features and fusion features through a multilayer shuffle attention module. Through layer-by-layer up-sampling and gating operations, the higher-layer features guide the learning of the lower-layer features, eventually achieving emotional region prediction at the optimal scale. To complement the important information lost by layer-by-layer fusion, this study not only adds an intra-layer fusion to the multilayer attention interaction module but also designs an adaptive feature aggregation module. The module uses global average pooling to compress spatial information and connect channel information from all layers. Then, the module adaptively generates a set of aggregation weights through two fully connected layers to augment the original features of each layer. Eventually, the semantics and details of the different layers are aggregated through gating operations and residual connections to complement the lost information. To reduce overfitting on small datasets, the network is pre-trained on the FI dataset, and further weight fine-tuning is performed on the small dataset. The experimental results on the FI, Twitter I and Emotion ROI (Region of Interest) datasets show that the proposed network exceeds existing image emotion classification methods, with accuracies of 90.27%, 84.66% and 84.96%.
Machine Learning (ML)-based prediction and classification systems employ data and learning algorithms to forecast target values. However, improving predictive accuracy is a crucial step for informed decision-making. In the healthcare domain, data are available in the form of genetic profiles and clinical characteristics to build prediction models for complex tasks like cancer detection or diagnosis. Among ML algorithms, Artificial Neural Networks (ANNs) are considered the most suitable framework for many classification tasks. The network weights and the activation functions are the two crucial elements in the learning process of an ANN. These weights affect the prediction ability and the convergence efficiency of the network. In traditional settings, ANNs assign random weights to the inputs. This research aims to develop a learning system for reliable cancer prediction by initializing more realistic weights computed in a supervised setting instead of random weights. The proposed learning system uses hybrid and traditional machine learning techniques such as Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Random Forest (RF), k-Nearest Neighbour (kNN), and ANN to achieve better accuracy in colon and breast cancer classification. The system computes confusion matrix-based metrics for the traditional and proposed frameworks. The proposed framework attains the highest accuracy, 89.24 percent on the colon cancer dataset and 72.20 percent on the breast cancer dataset, outperforming the other models. The results show that the proposed learning system has higher predictive accuracy than conventional classifiers for each dataset, overcoming previous research limitations. Moreover, the proposed framework can be used to predict and classify cancer patients accurately. Consequently, this will facilitate the effective management of cancer patients.
Rockburst is defined as a phenomenon of immediate dynamic instability under excavation unloading conditions in deep or high-geostress areas. Inadequate knowledge and a lack of characterizing information prevent engineers and experts from achieving appropriate predictions of rockburst behaviour. In this study, a data set of 220 rockburst instances was collected for rockburst classification via the geostatistical method. An update of the 2D graph, the tunnel rockburst classification (TRC) chart, was introduced based on the analysis of three indicators, namely, elastic energy index (W_et), tangential stress in rock mass (σ_0), and uniaxial compressive strength (σ_c). The distribution and correlation of the data were drawn on a 2D plot, and the boundaries of rockburst were distinguished according to the points interpolated by the kriging method. The validation phase was performed hierarchically using an additional set of 28 case histories obtained from several projects around the world. The results showed that the TRC chart, with an average error percentage of 3.6% in the prediction of rockburst, performed effectively in comparison to the existing heuristic systems. Despite the preliminary character of the prediction, the described chart may be a helpful tool in the first steps of design and construction.
In this paper, manual surface observation data from Urumqi Airport for November 2015 to February 2017 were used, together with the European fine-grid (0.25 × 0.25) initial field (20:00) and the forecast field within 24 hours. Data from November 2015 to February 2016 served as research samples (948 times in total), and data from November 2016 to February 2017 served as test samples (922 times in total). Statistical methods were used to establish the scoring standards, and each relevant element was scored. After scoring, the range of each score level was delineated, and visibility was forecast according to these ranges. The conclusions are as follows: 1) The European fine-grid forecast products that correspond well with visibility at the airfield are the temperature inversion between 850 hPa and 2 m height, the 850 hPa relative humidity, and the 850 hPa wind field over the airfield. 2) Through statistical analysis of the scores, a score below 400 is defined as level 4 and a score above 1000 as level 1; the difference between them is significant, and the forecast indication is strong. Level 2 and level 3 are more evenly distributed, with no concentrated score ranges. 3) The above indicators were tested on the test samples. The forecast accuracy of level 1 is 61.2% and that of level 4 is 97.2%, so level 1 and level 4 are expected to give better forecast results, which is of practical application value.
The advantages and disadvantages of the genetic algorithm (GA) and the BP algorithm are introduced. A neural network based on the GA-BP algorithm, which combines the advantages of BP and GA, is proposed and applied to the prediction of protein secondary structure. Training and prediction with the neural network are carried out separately for 4 structure classifications of protein in order to obtain a higher prediction rate. The highest prediction rate is 75.65% and the average prediction rate is 65.04%.
Potential natural vegetation (PNV) is a valuable reference for ecosystem renovation and has garnered increasing attention worldwide. However, there is limited knowledge on the spatio-temporal distributions, transitional processes, and underlying mechanisms of global natural vegetation, particularly in the case of ongoing climate warming. In this study, we visualize the spatio-temporal pattern and inter-transition procedure of global PNV, analyse the shifting distances and directions of global PNV under the influence of climatic disturbance, and explore the mechanisms of global PNV in response to temperature and precipitation fluctuations. To achieve this, we utilize meteorological data, mainly temperature and precipitation, from six phases: the Last Inter-Glacial (LIG), the Last Glacial Maximum (LGM), the Mid Holocene (MH), the Present Day (PD), 2030 (2021–2040) and 2090 (2081–2100), and employ a widely accepted comprehensive and sequential classification system (CSCS) for global PNV classification. We find that the spatial patterns of five PNV groups (forest, shrubland, savanna, grassland and tundra) generally align with their respective ecotopes, although their distributions have shifted due to fluctuating temperature and precipitation. Notably, we observe an unexpected transition between tundra and savanna despite their geographical distance. The shifts in distance and direction of the five PNV groups are mainly driven by temperature and precipitation, although there is heterogeneity among these shifts for each group. Indeed, the heterogeneity observed among different global PNV groups suggests that they may possess varying capacities to adjust to and withstand the impacts of a changing climate. The spatio-temporal distributions, mutual transitions and shift tendencies of global PNV and its underlying mechanism in the face of a changing climate, as revealed in this study, can significantly contribute to the development of strategies for mitigating warming and promoting re-vegetation in degraded regions worldwide.
The large amount of mobile data from a growing number of high-speed train (HST) users has brought intelligent HST communications into the era of big data, and artificial intelligence (AI) based HST channel modeling has become a trend. This paper provides an AI-based channel characteristic prediction and scenario classification model for millimeter wave (mmWave) HST communications. First, the ray tracing method, verified by measurement data, is applied to reconstruct four representative HST scenarios. By setting the positions of the transmitter (Tx) and receiver (Rx) and other parameters, multi-scenario wireless channel big data is acquired. Then, based on the obtained channel database, a radial basis function neural network (RBF-NN) and a back propagation neural network (BP-NN) are trained for channel characteristic prediction and scenario classification. Finally, the channel characteristic prediction and scenario classification capabilities of the networks are evaluated by calculating the root mean square error (RMSE). The results show that RBF-NN generally achieves better performance than BP-NN and is more applicable to the prediction of HST scenarios.
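The evaluation metric named in this abstract, the root mean square error, can be computed as in the short sketch below. The function name is illustrative and the numbers in the usage note are invented, not taken from the paper.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` is the square root of 4/3. Because errors are squared before averaging, a single large miss raises RMSE more than several small ones, which is worth keeping in mind when comparing two networks on this metric.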
Skeletal bone age assessment (BAA) is widely used in development prediction and auxiliary analysis of medical issues. X-ray images of the hand are used to estimate bone age, and the ossification centers of the epiphysis and carpal bones are important regions. Typical skeletal BAA approaches extract these regions to predict bone age; however, few of them attain suitable efficacy or accuracy. Automatic BAA techniques using deep learning (DL) methods have achieved leading performance compared with manual and typical approaches. Therefore, this study introduces an intelligent skeletal bone age assessment and classification with the use of metaheuristic with deep learning (ISBAAC-MDL) model. The presented ISBAAC-MDL technique focuses mainly on the bone age prediction and classification process. To attain this, the ISBAAC-MDL model uses a Mask Region-based Convolutional Neural Network (Mask-RCNN) with MobileNet as the baseline model to extract features. Next, the whale optimization algorithm (WOA) is implemented for hyperparameter tuning of the MobileNet method. Finally, Deep Feed-Forward Module (DFFM) based age prediction and Radial Basis Function Neural Network (RBFNN) based stage classification are applied. The ISBAAC-MDL model is evaluated experimentally on a benchmark dataset, and the outcomes are assessed over distinct factors. The experimental outcomes show that the ISBAAC-MDL model performs better than recent approaches, with a maximum accuracy of 0.9920.
As ITU-R Recommendations are widely implemented in countries all over the world, their role and status are increasingly prominent in the field of radio engineering. ITU and the ITU-R Study Groups are summarized. Furthermore, the operating mode of the third study group and its input documents are interpreted in detail. Lastly, from the perspectives of both wireless system design and electromagnetic compatibility analysis, all 79 P-series Recommendations are analyzed and classified, and the main contents of each Recommendation are summarized. This work promotes the wider use of the P-series Recommendations in China.
Funding: financially supported by the National Key Research and Development Program of China (2022YFB3706800, 2020YFB1710100) and the National Natural Science Foundation of China (51821001, 52090042, 52074183).
Abstract: In the era of advanced machine learning techniques, the development of accurate predictive models for complex medical conditions, such as thyroid cancer, has shown remarkable progress. Accurate predictive models for thyroid cancer enhance early detection, improve resource allocation, and reduce overtreatment. However, the widespread adoption of these models in clinical practice demands interpretability and transparency in addition to predictive performance. This paper proposes a novel association-rule based feature-integrated machine learning model that shows better classification and prediction accuracy than current state-of-the-art models. The study also focuses on the application of SHapley Additive exPlanations (SHAP) values as a tool for explaining thyroid cancer prediction models. In the proposed method, the association-rule based feature integration framework identifies frequently occurring attribute combinations in the dataset. The original dataset is used to train machine learning models, and SHAP values are generated from these models. In the next phase, the dataset is integrated with the dominant feature sets identified through association-rule based analysis. This new integrated dataset is used to re-train the machine learning models. The new SHAP values generated from these models help validate the contributions of the feature sets in predicting malignancy. Conventional machine learning models lack interpretability, which can hinder their integration into clinical decision-making systems. In this study, SHAP values are introduced along with association-rule based feature integration as a comprehensive framework for understanding the contributions of feature sets in modelling the predictions. The study discusses the importance of reliable predictive models for early diagnosis of thyroid cancer, and a validation framework for explainability. The proposed model shows an accuracy of 93.48%. Performance metrics such as precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUROC) are also higher than those of the baseline models. The results of the proposed model help identify the dominant feature sets that affect thyroid cancer classification and prediction. The features {calcification} and {shape} consistently emerged as the top-ranked features associated with thyroid malignancy, in both the association-rule based interestingness metric values and the SHAP methods. The paper highlights the potential of rule-based integrated models with SHAP to bridge the gap between machine learning predictions and the interpretability of those predictions that real-world medical applications require.
Funding: Supported by the China Earthquake Administration, Institute of Seismology Foundation (IS201526246).
Abstract: Using groundwater level monitoring data from the Shuping landslide in the Three Gorges Reservoir area, the factors influencing groundwater level were selected based on the response relationship between the change in groundwater level and candidate factors such as rainfall and reservoir level. A classification and regression tree (CART) model was then constructed from this subset of factors and used to predict the groundwater level. In verification, the predictions for the test sample were consistent with the measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. By comparison, a support vector machine (SVM) model constructed using the same set of factors gave a mean absolute error of 1.53 m and a relative error of 6.11%. These results indicate that the CART model not only has better fitting and generalization ability, but also has strong advantages in analyzing the dynamic characteristics of landslide groundwater and in screening important variables. It is an effective method for predicting groundwater level in landslides.
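The two error measures quoted in this abstract can be computed as sketched below. This is an illustration under stated assumptions: the abstract does not define its relative error, so the mean of |measured - predicted| / |measured| expressed in percent is assumed here, and the water levels are made-up values, not data from the study.

```python
def mean_absolute_error(measured, predicted):
    """Average of |measured - predicted|, in the units of the data."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mean_relative_error_percent(measured, predicted):
    """Average of |measured - predicted| / |measured|, in percent (assumed definition)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical groundwater levels in metres (not from the paper).
measured = [150.0, 152.0, 155.0, 160.0]
predicted = [150.5, 151.5, 155.0, 161.0]

mae = mean_absolute_error(measured, predicted)
mre = mean_relative_error_percent(measured, predicted)
```

Reporting both measures is useful because the absolute error carries the physical unit (metres), while the relative error allows comparison across sites with different water levels.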
Funding: Supported by the National Natural Science Foundation of China (61872303, U1936113), the Science and Technology Innovation Talents Program of Sichuan Science and Technology Department (2018RZ0143), and the Key Project of Sichuan Science and Technology Innovation Pioneering Miaozi Project (19MZGC0163).
Abstract: Reversible data hiding in encrypted images (RDH-EI) technology is widely used in cloud storage for image privacy protection. To improve the embedding capacity of the RDH-EI algorithm and the security of the encrypted images, we propose a reversible data hiding algorithm for encrypted images based on prediction and adaptive classification scrambling. First, the prediction error image is obtained by a novel prediction method before encryption. Then, the image pixel values are divided into two categories by a threshold range, which is selected adaptively according to the image content. Multiple high-significant bits of pixels within the threshold range are used for embedding data, and pixel values outside the threshold range remain unchanged. The adaptively selected optimal threshold ensures the maximum embedding capacity of the algorithm. Moreover, the security of the encrypted images can be improved by combining XOR encryption with classification scrambling encryption, since the embedded data is independent of the pixel position. Experimental results demonstrate that the proposed method has a higher embedding capacity than current state-of-the-art methods for images with different texture complexity.
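The XOR encryption step named in this abstract is reversible because applying the same key stream twice restores the original values. The sketch below illustrates only that property; it is not the authors' algorithm, the function name and seed are hypothetical, and Python's `random` module is used purely for illustration since it is not a cryptographically secure generator.

```python
import random


def xor_encrypt(pixels, seed):
    """XOR each 8-bit pixel with a pseudo-random key byte derived from seed.

    Calling the function again with the same seed decrypts the data,
    because (p ^ k) ^ k == p for any byte values p and k.
    """
    rng = random.Random(seed)  # illustration only, not cryptographically secure
    return [p ^ rng.randrange(256) for p in pixels]


# Hypothetical 8-bit grayscale pixel values.
pixels = [0, 17, 128, 200, 255]
cipher = xor_encrypt(pixels, seed=2024)
restored = xor_encrypt(cipher, seed=2024)
```

Because XOR acts on each pixel independently, it hides pixel values but not positions, which is one reason schemes of this kind pair it with a scrambling step.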
Abstract: Many business applications rely on their historical data to predict their business future. The product marketing process is one of the core business processes. Customer needs provide useful information that helps to market the appropriate products at the appropriate time. Moreover, services have recently come to be considered products. The development of education and health services depends on historical data. Furthermore, reducing problems and crimes on online social media networks requires a significant source of information. Data analysts need an efficient classification algorithm to predict the future of such businesses. However, processing a huge quantity of data takes a long time. Data mining involves many useful techniques that are used to predict statistical data in a variety of business applications. Classification is one of the most widely used techniques, with a variety of algorithms. In this paper, various classification algorithms are reviewed in terms of accuracy in different areas of data mining applications. A comprehensive analysis is made after a dedicated reading of 20 papers in the literature. This paper aims to help data analysts choose the most suitable classification algorithm for different business applications, including business in general, online social media networks, agriculture, health, and education. Results show that FFBPN is the most accurate algorithm in the business domain. The Random Forest algorithm is the most accurate in classifying online social network (OSN) activities. The Naïve Bayes algorithm is the most accurate for classifying agriculture datasets. OneR is the most accurate algorithm for classifying instances within the health domain. The C4.5 Decision Tree algorithm is the most accurate for classifying students' records to predict degree completion time.
Abstract: The Farmers Property Mortgage Policy is a strategic financial policy in western China, a relatively underdeveloped region. Many contradictions and conflicts arise between farmers' strong demand for loans and the strict risk control of financial institutions. Rural finance corporations should use scientific analysis and investigation of potential households to evaluate customers overall, covering historical credit rating, present family situation, and other related information. Three different data mining methods were applied in this paper to specifically collected household data. The objective was to study which factors are most important in determining household loan demand and, at the same time, to classify and predict the likelihood of loan demand among potential customers. The three methods yielded similar outputs: income level, land area, the way of loan, and the understanding of the policy were the four main factors that decided the probability of a specific farmer applying for a credit loan. The results also revealed differences among the three methods in classifying and predicting loan demand for the test households. The artificial neural network model had the highest accuracy, 91.4, outperforming the other two methods.
Abstract: This research focuses on modeling an efficient thyroid prediction approach, which is considered a baseline for significant problems faced by women. The major research problem is the lack of an automated model for early prediction, and some existing models fail to give good prediction accuracy. Here, a novel clinical decision support system is designed to support proper decisions in complex cases. The proposed framework follows multiple stages, each of which plays a substantial role in thyroid prediction: i) data acquisition, ii) outlier prediction, and iii) a multi-stage weight-based ensemble learning process (MS-WEL). The weighted analysis of the base classifier and the other classifier models helps bridge the gap encountered with a single classifier model. Various classifiers are merged so that each handles the issues identified in the others, with the aim of enhancing the prediction rate. The simulation is carried out in the MATLAB 2020a environment. The proposed model gives a prediction accuracy of 97.28%, higher than that of other models, and establishes a better trade-off than various existing approaches.
Abstract: Financial crisis prediction (FCP) has received significant attention in the financial sector for decision-making. Proper forecasting of the number of firms likely to fail is important for determining the growth index and strength of a nation's economy. Numerous approaches have been developed for designing accurate FCP processes. However, classifier efficacy and predictive accuracy remain inadequate for real-time applications. In addition, several established techniques perform well on specific datasets but do not adapt to different datasets. Thus, there is a need for an effective prediction technique that offers optimal classifier performance and adapts to various datasets. This paper presents a novel multi-verse optimization (MVO) based feature selection (FS) with an optimal variational autoencoder (OVAE) model for FCP. The proposed MVOFS-OVAE model mainly aims to forecast financial crises. To achieve this, the MVOFS-OVAE model first pre-processes the financial data using min-max normalization. It then performs feature subset selection using the MVOFS approach. Next, the variational autoencoder (VAE) model is applied to categorize the financial data as financial crisis or non-financial crisis. Finally, the differential evolution (DE) algorithm is used to tune the parameters of the VAE model. A series of simulations on the benchmark dataset showed that the MVOFS-OVAE approach outperforms recent state-of-the-art approaches.
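The min-max normalization used as the pre-processing step in this abstract rescales each feature to the range [0, 1] via (x - min) / (max - min). A minimal sketch follows; the function name, the handling of constant features, and the sample values are assumptions made for illustration and are not taken from the paper.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant feature carries no information,
        # so it is mapped to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values of one financial ratio across four firms.
ratios = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_normalize(ratios)
```

Scaling each feature separately in this way prevents features with large numeric ranges from dominating the later feature selection and classification stages.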
Abstract: This research aims to develop a methodology for applying the geostatistical method to generate a groutability classification for granular soils. To ensure the precision of the suggested technique, a total of 103 data samples were used. Predicting the groutability of granular soils has always been difficult because of the many soil characteristics involved. As a result, a new two-dimensional graph, the groutability classification of granular soil (GCS) chart, was developed. The GCS chart was established by analyzing data on the grain size of the soil and of cement-based grouts (N1 and N2), the relative density (Dr) and fines content (FC) of the soil, the water/cement ratio of the grout mixture (w/c), and the grouting pressure (P), all of which have a direct impact on the groutability of soil media. The geostatistical method was used to develop and compile the GCS graph from these parameters with the use of coefficient S, a scoring coefficient for the parameter set P, w/c, Dr, and FC. The validation process was carried out hierarchically, with an additional set of 30 data samples. The proposed method has a prediction accuracy of roughly 96.7%, which demonstrates its usefulness as a tool. The proposed approach can be easily implemented in practical engineering situations because its form is comparable to commonly used formulae. It should be noted that the proposed formula was tested only on the data samples collected, and the applicability of the procedure to other situations requires further examination.
Funding: This study was supported in part by the National Natural Science Foundation of China under Grant 62272236 and in part by the Natural Science Foundation of Jiangsu Province under Grants BK20201136 and BK20191401.
Abstract: The image emotion classification task aims to use a model to automatically predict the emotional response of people when they see an image. Studies have shown that certain local regions are more likely to inspire an emotional response than the whole image. However, existing methods perform poorly in predicting the details of emotional regions and are prone to overfitting during training due to the small size of the datasets. Therefore, this study proposes an image emotion classification network based on multilayer attentional interaction and adaptive feature aggregation. To predict emotional regions more accurately, this study designs a multilayer attentional interaction module. The module calculates spatial attention maps for higher-layer semantic features and fusion features through a multilayer shuffle attention module. Through layer-by-layer up-sampling and gating operations, the higher-layer features guide the learning of the lower-layer features, eventually achieving emotional region prediction at the optimal scale. To complement the important information lost by layer-by-layer fusion, this study not only adds an intra-layer fusion to the multilayer attentional interaction module but also designs an adaptive feature aggregation module. The module uses global average pooling to compress spatial information and connects channel information from all layers. Then, the module adaptively generates a set of aggregation weights through two fully connected layers to augment the original features of each layer. Finally, the semantics and details of the different layers are aggregated through gating operations and residual connections to complement the lost information. To reduce overfitting on small datasets, the network is pre-trained on the FI dataset, and the weights are then fine-tuned on the small dataset. The experimental results on the FI, Twitter I and Emotion ROI (Region of Interest) datasets show that the proposed network outperforms existing image emotion classification methods, with accuracies of 90.27%, 84.66% and 84.96%, respectively.
Abstract: Machine Learning (ML)-based prediction and classification systems employ data and learning algorithms to forecast target values. Improving predictive accuracy is a crucial step for informed decision-making. In the healthcare domain, data are available in the form of genetic profiles and clinical characteristics to build prediction models for complex tasks such as cancer detection or diagnosis. Among ML algorithms, Artificial Neural Networks (ANNs) are considered the most suitable framework for many classification tasks. The network weights and the activation functions are the two crucial elements in the learning process of an ANN. The weights affect the prediction ability and the convergence efficiency of the network. In traditional settings, ANNs assign random weights to the inputs. This research aims to develop a learning system for reliable cancer prediction by initializing the network with more realistic weights computed in a supervised setting instead of random weights. The proposed learning system uses hybrid and traditional machine learning techniques such as Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Random Forest (RF), k-Nearest Neighbour (kNN), and ANN to achieve better accuracy in colon and breast cancer classification. The system computes confusion matrix-based metrics for the traditional and proposed frameworks. The proposed framework attains the highest accuracy of 89.24 percent on the colon cancer dataset and 72.20 percent on the breast cancer dataset, outperforming the other models. The results show that the proposed learning system has higher predictive accuracy than conventional classifiers on each dataset, overcoming previous research limitations. Moreover, the proposed framework can be used to predict and classify cancer patients accurately, which will facilitate the effective management of cancer patients.
Abstract: Rockburst is defined as a phenomenon of immediate dynamic instability under excavation unloading conditions in deep or high-geostress areas. Inadequate knowledge and a lack of characterizing information prevent engineers and experts from achieving appropriate predictions of rockburst behaviour. In this study, a data set of 220 rockburst instances was collected for rockburst classification via the geostatistical method. An updated 2D graph, the tunnel rockburst classification (TRC) chart, was introduced based on the analysis of three indicators, namely, the elastic energy index (Wet), the tangential stress in the rock mass (σ_0), and the uniaxial compressive strength (σ_c). The distribution and correlation of the data were drawn on a 2D plot, and the rockburst boundaries were delineated from the points interpolated by the kriging method. The validation phase was then performed hierarchically using an additional set of 28 case histories obtained from several projects around the world. The results showed that the TRC chart, with an average error of 3.6% in the prediction of rockburst, performed effectively in comparison with the existing heuristic systems. Despite the preliminary character of the prediction, the described chart may be a helpful tool in the first steps of design and construction.
Abstract: This paper uses time-lapse manual surface observation data from Urumqi Airport from November 2015 to February 2017, together with the European fine-grid (0.25 × 0.25) initial field (20:00) and the forecast fields within 24 hours. Data from November 2015 to February 2016 were used as research samples (948 in total), and data from November 2016 to February 2017 as test samples (922 in total). Statistical methods were used to establish scoring standards, and each relevant element was scored. The score ranges for each level were then delineated, and visibility was forecast according to these ranges. The conclusions are as follows: 1) The European fine-grid forecast products that correspond well with visibility at the airfield are the temperature inversion between 850 hPa and 2 m height, the 850 hPa relative humidity, and the 850 hPa wind field over the airfield. 2) Statistical analysis of the scores defines a score below 400 as level 4 and a score above 1000 as level 1; the difference between these levels is significant and their forecast indication is strong. Level 2 and level 3 scores are more evenly distributed, with no concentrated ranges. 3) The indicators were tested on the test samples. The forecast accuracy is 61.2% for level 1 and 97.2% for level 4, so levels 1 and 4 can be expected to give better forecast results, which is of practical value.
Abstract: The advantages and disadvantages of the genetic algorithm (GA) and the BP algorithm are introduced. A neural network based on a GA-BP algorithm, which combines the advantages of BP and GA, is proposed and applied to the prediction of protein secondary structure. Training and prediction with the neural network are carried out separately for each of 4 protein structure classes in order to obtain a higher prediction rate: the highest prediction rate is 75.65% and the average prediction rate is 65.04%.
Funding: Funded by the National Natural Science Foundation of China (grants No. 30960264, 31160475 and 42071258), the Open Research Fund of TPESER (grant No. TPESER202208), the Special Fund for Basic Scientific Research of Central Colleges, Chang'an University, China (grant No. 300102353501), the Natural Science Foundation of Gansu Province, China (grant No. 22JR5RA857), and the Higher Education Novel Foundation of Gansu Province, China (grant No. 2021B-130).
Abstract: Potential natural vegetation (PNV) is a valuable reference for ecosystem restoration and has garnered increasing attention worldwide. However, there is limited knowledge of the spatio-temporal distributions, transitional processes, and underlying mechanisms of global natural vegetation, particularly under ongoing climate warming. In this study, we visualize the spatio-temporal pattern and inter-transition process of global PNV, analyse the shifting distances and directions of global PNV under the influence of climatic disturbance, and explore the mechanisms by which global PNV responds to temperature and precipitation fluctuations. To achieve this, we use meteorological data, mainly temperature and precipitation, from six phases: the Last Inter-Glacial (LIG), the Last Glacial Maximum (LGM), the Mid Holocene (MH), the Present Day (PD), 2030 (2021–2040) and 2090 (2081–2100), and employ a widely accepted comprehensive and sequential classification system (CSCS) for global PNV classification. We find that the spatial patterns of five PNV groups (forest, shrubland, savanna, grassland and tundra) generally align with their respective ecotopes, although their distributions have shifted due to fluctuating temperature and precipitation. Notably, we observe an unexpected transition between tundra and savanna despite their geographical distance. The shifts in distance and direction of the five PNV groups are mainly driven by temperature and precipitation, although these shifts are heterogeneous across groups. This heterogeneity suggests that the groups may differ in their capacity to adjust to and withstand the impacts of a changing climate. The spatio-temporal distributions, mutual transitions and shift tendencies of global PNV, and their underlying mechanisms in the face of a changing climate, as revealed in this study, can contribute significantly to the development of strategies for mitigating warming and promoting re-vegetation in degraded regions worldwide.
Funding: Supported by the National Key R&D Program of China under Grant 2021YFB1407001, the National Natural Science Foundation of China (NSFC) under Grants 62001269 and 61960206006, the State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University (under Grant RCS2022K009), the Future Plan Program for Young Scholars of Shandong University, and the EU H2020 RISE TESTBED2 project under Grant 872172.
Abstract: The large amount of mobile data generated by a growing number of high-speed train (HST) users has brought intelligent HST communications into the era of big data, and artificial intelligence (AI) based HST channel modeling has become a trend. This paper provides an AI-based channel characteristic prediction and scenario classification model for millimeter wave (mmWave) HST communications. First, the ray tracing method, verified by measurement data, is applied to reconstruct four representative HST scenarios. By setting the positions of the transmitter (Tx) and receiver (Rx) and other parameters, multi-scenario wireless channel big data is acquired. Then, based on the obtained channel database, a radial basis function neural network (RBF-NN) and a back propagation neural network (BP-NN) are trained for channel characteristic prediction and scenario classification. Finally, the channel characteristic prediction and scenario classification capabilities of the networks are evaluated by calculating the root mean square error (RMSE). The results show that the RBF-NN generally achieves better performance than the BP-NN and is more applicable to the prediction of HST scenarios.
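The root mean square error (RMSE) used for evaluation in this abstract is the square root of the mean squared difference between reference and predicted values. The sketch below shows the computation only; the function name and the sample values are hypothetical and do not come from the paper's channel database.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values of one channel characteristic (arbitrary units).
reference = [100.0, 105.0, 110.0, 120.0]
predicted = [103.0, 101.0, 110.0, 120.0]
error = rmse(reference, predicted)
```

Because the differences are squared before averaging, RMSE penalizes a few large prediction errors more heavily than many small ones, which makes it a strict measure when comparing two networks on the same data.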
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R151), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4310373DSR17).
Abstract: Skeletal bone age assessment (BAA) is widely used in growth prediction and in the auxiliary analysis of medical conditions. X-ray images of the hand are used to estimate bone age, with the ossification centers of the epiphyses and carpal bones being the important regions. Typical skeletal BAA approaches extract these regions to predict bone age; however, few of them attain suitable efficacy or accuracy. Automatic BAA techniques based on deep learning (DL) have reached leading performance compared with manual and typical approaches. Therefore, this study introduces an intelligent skeletal bone age assessment and classification with metaheuristic and deep learning (ISBAAC-MDL) model. The presented ISBAAC-MDL technique focuses mainly on bone age prediction and stage classification. To this end, the ISBAAC-MDL model uses a Mask Region-based Convolutional Neural Network (Mask-RCNN) with MobileNet as the baseline model to extract features. Next, the whale optimization algorithm (WOA) is used for hyperparameter tuning of the MobileNet method. Finally, Deep Feed-Forward Module (DFFM) based age prediction and Radial Basis Function Neural Network (RBFNN) based stage classification are applied. The ISBAAC-MDL model is evaluated experimentally on a benchmark dataset, and the outcomes are assessed over distinct factors. The experimental outcomes show that the ISBAAC-MDL model outperforms recent approaches, with a maximum accuracy of 0.9920.
Abstract: As ITU-R Recommendations are widely implemented by countries all over the world, their role and status are increasingly prominent in the field of radio engineering. ITU and the ITU-R Study Groups are summarized. Furthermore, the operating mode of the third study group and its input documents are interpreted in detail. Lastly, from the perspectives of both wireless system design and electromagnetic compatibility analysis, all 79 P-series Recommendations are analyzed and classified, and the main contents of each Recommendation are summarized. This work promotes the wider use of P-series Recommendations in China.