Widely used deep neural networks currently fall short of optimal performance in purchase intention prediction because of constraints on data volume and hyperparameter selection. To address this issue, this paper builds on the deep forest algorithm and integrates evolutionary ensemble learning methods to propose a novel Deep Adaptive Evolutionary Ensemble (DAEE) model. The model introduces model diversity into the cascade layer, allowing it to adaptively adjust its structure to accommodate complex and evolving purchasing behavior patterns. The paper also optimizes how feature vectors, enhancement vectors, and prediction results are obtained within the deep forest algorithm to improve predictive accuracy. Results show that the improved deep forest model is more robust and achieves a 5.02% higher AUC than the baseline model. Furthermore, it trains 6 times faster than deep models, and its accuracy is 0.9% higher than that of other improved models.
This article introduces a new medical Internet of Things (IoT) framework for intelligent fall detection for senior people, based on our proposed deep forest model. The cascade multi-layer structure of the deep forest classifier generates new features at each level with minimal hyperparameters compared to deep neural networks. Moreover, the optimal number of deep forest layers is estimated automatically using an early stopping criterion on the validation accuracy of each generated layer. The proposed classifier was tested and evaluated on the public SmartFall dataset, which is acquired from a three-axis accelerometer in a smartwatch. It includes 92781 training samples and 91025 testing samples with two labeled classes, namely non-fall and fall. Our deep forest classifier achieved superior performance, with a best accuracy of 98.0%, compared to three machine learning models (K-nearest neighbors, decision trees, and traditional random forest) and two deep learning models (dense neural networks and convolutional neural networks). Once security and privacy aspects are addressed in future work, the proposed medical IoT framework for fall detection of elderly people is suitable for real-time healthcare deployment.
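The early stopping idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `choose_num_layers` and the `patience` and `min_gain` parameters are assumptions, and the accuracy values in the demo are made up.

```python
from typing import Sequence


def choose_num_layers(val_accuracy: Sequence[float],
                      patience: int = 2,
                      min_gain: float = 0.0) -> int:
    """Return how many cascade layers to keep.

    val_accuracy[i] is the validation accuracy measured after layer i + 1
    has been added. Growth stops once `patience` consecutive layers fail to
    beat the best accuracy so far by more than `min_gain` (both assumed knobs).
    """
    if not val_accuracy:
        raise ValueError("need at least one layer's validation accuracy")
    best_acc = val_accuracy[0]
    best_layers = 1
    stale = 0  # consecutive layers without improvement
    for depth, acc in enumerate(val_accuracy[1:], start=2):
        if acc > best_acc + min_gain:
            best_acc, best_layers, stale = acc, depth, 0
        else:
            stale += 1
            if stale >= patience:
                break  # stop growing the cascade
    return best_layers


if __name__ == "__main__":
    # Hypothetical per-layer validation accuracies, not results from the paper.
    history = [0.950, 0.968, 0.975, 0.974, 0.975, 0.990]
    print(choose_num_layers(history, patience=2))
```

In a real cascade the accuracies would be computed layer by layer during training, so layers after the stopping point would never be built; the list form here only keeps the sketch self-contained.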
In bearing fault diagnosis, classical deep learning models suffer from too many parameters and high computing cost, and they are not effective when data are scarce. Deep forest, proposed in recent years, has fewer hyperparameters and an adaptive model depth. Weighted deep forest (WDF) further improves deep forest by assigning weights to decision trees based on the accuracy of each tree. This paper proposes a weighted deep forest model-based bearing fault diagnosis method (WDBM). WDBM not only inherits the advantages of WDF, including strong robustness, good generalization, fewer parameters, and faster convergence, but also achieves effective, high-precision, low-cost diagnosis with small samples. To verify the performance of WDBM, experiments are carried out on the Case Western Reserve University (CWRU) bearing data set. Experimental results demonstrate that WDBM achieves comparable recognition accuracy with less computational overhead and faster convergence.
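One simple way to weight trees by accuracy, as the WDF description suggests, is sketched below. The abstract does not give the exact weighting formula, so normalizing accuracies to sum to one is an assumption, and the names `accuracy_weights` and `weighted_class_vector` are hypothetical.

```python
from typing import List, Sequence


def accuracy_weights(tree_accuracies: Sequence[float]) -> List[float]:
    """Turn per-tree accuracies into weights that sum to 1 (assumed scheme)."""
    total = sum(tree_accuracies)
    if total <= 0:
        raise ValueError("at least one tree must have positive accuracy")
    return [acc / total for acc in tree_accuracies]


def weighted_class_vector(tree_probs: Sequence[Sequence[float]],
                          tree_accuracies: Sequence[float]) -> List[float]:
    """Combine each tree's class-probability vector into one forest output.

    tree_probs[t][c] is tree t's estimated probability of class c.
    More accurate trees contribute more to the combined vector.
    """
    if len(tree_probs) != len(tree_accuracies):
        raise ValueError("need one accuracy per tree")
    weights = accuracy_weights(tree_accuracies)
    n_classes = len(tree_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, tree_probs))
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    # Two hypothetical trees that disagree; the more accurate one dominates.
    probs = [[1.0, 0.0], [0.0, 1.0]]
    print(weighted_class_vector(probs, tree_accuracies=[0.75, 0.25]))
```

An unweighted forest would average the two trees equally; the weighted version shifts the output toward the tree that performed better on held-out data.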
With the continuous development of machine learning and the increasing complexity of financial data analysis, machine learning models are increasingly used to tackle prominent and difficult problems in the financial industry. To improve stock trend prediction and address problems in time series data processing, this paper combines a fuzzy membership function with stock-related technical indicators, analysing the S&P 500 index to obtain nominal data that broadly reflect the constituent stocks as the time series changes. Meanwhile, because setting and tuning hyperparameters in current machine learning algorithms relies too heavily on empirical knowledge, the paper uses deep forest models to train on the stock data separately. The experimental results show that (1) the extreme random forest and the multi-grained cascade forest are both more accurate than the gated recurrent unit (GRU) model when the non-fuzzy index-adjusted dataset is used as input features; (2) the accuracy of both forest models improves when the fuzzy index-adjusted dataset is used as input features; (3) with fuzzy index-adjusted features, the accuracy of the extreme random forest improves by 18.89% over the non-fuzzy index-adjusted dataset; and (4) with fuzzy index-adjusted features, the average accuracy of the multi-grained cascade forest increases by 5.67%.
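Turning a numeric technical indicator into nominal data with a fuzzy membership function might look like the sketch below. The abstract does not specify the membership shapes or breakpoints, so the triangular functions, the 0 to 100 indicator scale, and the labels `low`, `medium`, and `high` are all assumptions made for illustration.

```python
from typing import Dict


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 at `left` and `right`, 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed fuzzy sets for an indicator scaled to 0..100 (an RSI-like value).
# The feet extend past the range so that 0 and 100 get full membership.
FUZZY_SETS = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}


def fuzzify(value: float) -> Dict[str, float]:
    """Degree of membership of `value` in each fuzzy set."""
    return {name: triangular(value, *abc) for name, abc in FUZZY_SETS.items()}


def nominal_label(value: float) -> str:
    """Pick the fuzzy set with the highest membership as the nominal feature."""
    memberships = fuzzify(value)
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    for reading in (20.0, 30.0, 80.0):  # hypothetical indicator readings
        print(reading, fuzzify(reading), nominal_label(reading))
```

The resulting labels could then be fed to a forest model as categorical features in place of the raw indicator values.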
The dark web is a shadow area hidden in the depths of the Internet that is difficult to access through common search engines. Because of its anonymity, the dark web has gradually become a hotbed for a variety of cyber-crimes. Although some research based on machine learning or deep learning has proven effective for analyzing dark web traffic in recent years, pain points remain, such as low accuracy, insufficient real-time performance, and limited application scenarios. To address the difficulties faced by existing automated dark web traffic analysis methods, a novel method named Dark-Forest is proposed to analyze the behavior of dark web traffic. First, a particle swarm optimization algorithm filters redundant features from dark web traffic data, which shortens the model's training and inference time to meet the real-time requirements of dark web detection. Then, the selected traffic features are analyzed and classified using the DeepForest model as the backbone classifier. Comparison experiments with current mainstream methods show that Dark-Forest combines the advantages of statistical machine learning and deep learning and achieves an accuracy of 87.84%. The method not only outperforms baseline methods such as Random Forest, MLP, CNN, and the original DeepForest in both large-scale and small-scale learning tasks, but can also detect normal, tunnel, and anonymous network traffic, which may close the gap between different network traffic analysis tasks. It therefore has wider application scenarios and higher practical value.
The paper proposes a new deep structure model, called Densely Connected Cascade Forest-Weighted K Nearest Neighbors (DCCF-WKNNs), for corrosion data modelling and corrosion knowledge mining. First, we collect 409 outdoor atmospheric corrosion samples of low-alloy steels as the experimental dataset. Then, we describe the proposed methods, including random forests-K nearest neighbors (RF-WKNNs) and DCCF-WKNNs. Finally, we use the collected dataset to verify the performance of the proposed method. The results show that, compared with commonly used and advanced machine-learning algorithms such as artificial neural network (ANN), support vector regression (SVR), random forests (RF), and cascade forests (cForest), the proposed method obtains the best prediction results. In addition, the method can predict corrosion rates as any single environmental variable varies, such as pH, temperature, relative humidity, SO2, rainfall, or Cl-. In this way, the threshold of each variable, beyond which the corrosion rate may change substantially, can be obtained.
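To address the problems of parallel deep forest algorithms in big data environments, namely excessive irrelevant and redundant features, unbalanced multi-granularity scanning, insufficient classification performance, and low parallelization efficiency, a parallel deep forest algorithm based on mutual information and mixed weighting (PDF-MIMW) is proposed. First, in the feature dimensionality reduction stage, a feature extraction strategy based on mutual information (FE-MI) is proposed, which filters the original features using measures of feature importance, interaction, and redundancy to eliminate excessive irrelevant and redundant features. Next, in the multi-granularity scanning stage, an improved multi-granularity scanning strategy based on padding (IMGS-P) is proposed, which pads the reduced features and randomly samples the subsequences produced by window scanning to keep multi-granularity scanning balanced. Then, in the cascade forest construction stage, a sub-forest construction strategy based on mixed weighting (SFC-MW) is proposed, which builds weighted sub-forests in parallel with the Spark framework to improve classification performance. Finally, in the class vector merging stage, a load balancing strategy based on a hybrid particle swarm optimization algorithm (LB-HPSO) is proposed, which optimizes the load distribution across task nodes in the Spark framework, reduces the waiting time during class vector merging, and improves parallelization efficiency. Experiments show that PDF-MIMW achieves better classification results and higher training efficiency in big data environments.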
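The mutual-information part of the feature filtering step can be illustrated with a small sketch. It covers only relevance between a discrete feature and the class label; the interaction and redundancy measures that FE-MI also uses are not modeled, and the helper names `mutual_information` and `rank_features` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(columns: Dict[str, Sequence[Hashable]],
                  labels: Sequence[Hashable],
                  top_k: int) -> List[str]:
    """Keep the `top_k` feature names with the highest MI against the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy data: feature "a" determines the label, feature "b" is independent.
    labels = [0, 0, 1, 1]
    columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
    print(rank_features(columns, labels, top_k=1))
```

Continuous features would need to be discretized first, and a distributed version would compute the counts per partition before merging them.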
Forest fires are a significant threat to the environment, causing ecological damage, economic losses, and posing a threat to human life. Hence, timely detection and prevention of forest fires are critical to minimizing their impact. In this paper, we review the current state-of-the-art methods in forest fire detection and prevention using predictions based on weather conditions and predictions based on forest fire history. In particular, we discuss different Machine Learning (ML) models that have been used for forest fire detection. Further, we present the challenges faced when implementing the ML-based forest fire detection and prevention systems, such as data availability, model prediction errors and processing speed. Finally, we discuss how recent advances in Deep Learning (DL) can be utilized to improve the performance of current fire detection systems.
Along with the development of 5G networks and Internet of Things technologies, there has been an explosion in personalized healthcare systems. Introducing 5G and Artificial Intelligence (AI) into a diabetes management architecture can increase the efficiency of existing systems and allow complications of diabetes to be handled more effectively. In this article, we propose a 5G-based Artificial Intelligence Diabetes Management architecture (AIDM), which can help physicians and patients manage both acute and chronic complications. The AIDM contains five layers: the sensing layer, the transmission layer, the storage layer, the computing layer, and the application layer. We build a test bed for the transmission and application layers. Specifically, in the transmission layer we apply a delay-aware RA optimization based on a double-queue model to improve access efficiency in smart hospital wards. In the application layer, we build a prediction model using a deep forest algorithm. Results on real-world data show that AIDM can enhance the efficiency of diabetes management and improve the screening rate of diabetes.
Cancer has become a cause of concern in recent years. Cancer genomics is currently a key research direction in genetic biology and biomedicine. This paper uses machine learning methods to analyze genes from five types of cancer (breast, kidney, colon, lung, and prostate), with the goal of building a robust classification model that identifies each type of cancer early, thereby reducing mortality.
Funding (DAEE purchase intention prediction study): Supported by the Ningxia Key R&D Program (Key) Project (2023BDE02001); the Ningxia Key R&D Program (Talent Introduction Special) Project (2022YCZX0013); the North Minzu University 2022 School-Level Research Platform "Digital Agriculture Empowering Ningxia Rural Revitalization Innovation Team" (Project Number: 2022PT_S10); the Yinchuan City School-Enterprise Joint Innovation Project (2022XQZD009); and the "Innovation Team for Imaging and Intelligent Information Processing" of the National Ethnic Affairs Commission.
Funding (medical IoT fall detection study): The Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, funded this research work through Project Number IFP2021-043.
Funding (bearing fault diagnosis study): This work is supported by the National Key R&D Program of China (Nos. 2021YFB2700500, 2021YFB2700503). Tao Wang received the grant; the sponsor's website is https://service.most.gov.cn/.
Funding (stock trend prediction study): Fundamental Research Foundation for Universities of Heilongjiang Province, Grant/Award Number: LGYC2018JQ003.
Funding (Dark-Forest dark web traffic study): Funded by the Henan Provincial Key R&D and Promotion Special Project (Science and Technology Tackling) (212102210165); the National Social Science Foundation Key Project (20AZD114); the Henan Provincial Higher Education Key Research Project Program (20B520008); and the Public Security Behavior Scientific Research and Technological Innovation Project of the Chinese People's Public Security University (2020SYS08).
Funding (DCCF-WKNNs corrosion modelling study): Financially supported by the National Key R&D Program of China (No. 2017YFB0702100) and the National Natural Science Foundation of China (No. 51871024).
Funding (5G-based diabetes management study): Supported by grants from the Industry Prospecting and Common Key Technology Key Projects of the Jiangsu Province Science and Technology Department (Grant no. BE2020721); the Special Guidance Funds for Service Industry of the Jiangsu Province Development and Reform Commission (Grant no. (2019)1089); the Big Data Industry Development Pilot Demonstration Project of the Ministry of Industry and Information Technology of China (Grant nos. (2019)243, (2020)84); the Industrial and Information Industry Transformation and Upgrading Guiding Fund of the Jiangsu Economy and Information Technology Commission (Grant no. (2018)0419); the Research Project of Jiangsu Province Sciences (Grant no. 2019-2020ZZWKT15); the fund of the Jiangsu Engineering Research Center of the Jiangsu Province Development and Reform Commission (Grant no. (2020)1460); and the fund of the Jiangsu Digital Future Integration Innovation Center (Grant no. (2018)498).