Funding: The researchers would like to thank the Deanship of Scientific Research, Qassim University, for funding the publication of this project.
Abstract: In the machine learning (ML) paradigm, data augmentation serves as a regularization approach for creating ML models. Increasing the diversity of training samples improves generalization, which enhances the prediction performance of classifiers when tested on unseen examples. Deep learning (DL) models have many parameters and frequently overfit. Data therefore plays a major role in avoiding overfitting and in supporting the latest improvements in DL. Nevertheless, reliable data collection is a major limiting factor. This problem is frequently addressed by combining data augmentation, transfer learning, dropout, and batch normalization. In this paper, we introduce the application of data augmentation in the field of image classification using Random Multi-model Deep Learning (RMDL), which combines multiple DL models to yield random models for classification. We present a methodology for using Generative Adversarial Networks (GANs) to generate images for data augmentation. Through experiments, we find that samples generated by GANs, when fed into RMDL, improve both accuracy and model efficiency. Experiments on both the MNIST and CIFAR-10 datasets show that the proposed approach decreases the error rate across different random models.
Funding: Supported in part by the National Natural Science Foundation of China (61772493, 91646114), the Chongqing research program of technology innovation and application (cstc2017rgzn-zdyfX0020), and in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences.
Abstract: Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may need many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis so that the resulting model represents an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF achieves significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desirable for industrial applications demanding highly efficient models.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2019YFE0127700).
Abstract: Most forest fires in the Margalla Hills are related to human activities, so socioeconomic factors are essential for assessing their likelihood of occurrence. This study considers both environmental (altitude, precipitation, forest type, terrain and humidity index) and socioeconomic (population density, distance from roads and urban areas) factors to analyze how human behavior affects the risk of forest fires. Maximum entropy (Maxent) modelling and random forest (RF) machine learning methods were used to predict the probability and spatial diffusion patterns of forest fires in the Margalla Hills. The receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) were used to compare the models. We studied the fire history from 1990 to 2019 to establish the relationship between the probability of forest fire and environmental and socioeconomic changes. Using Maxent, the AUC fire probability values for the 1999s, 2009s, and 2019s were 0.532, 0.569, and 0.518, respectively; using RF, they were 0.782, 0.825, and 0.789, respectively. Fires were mainly distributed in urban areas, and their probability of occurrence was related to accessibility and human behaviour/activity. AUC values for validation were greater in the random forest models than in the Maxent models. Our results can be used to establish preventive measures to reduce the risk of forest fires by considering socioeconomic and environmental conditions.
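The AUC comparison described in this abstract can be illustrated with a minimal sketch. It assumes the rank-based definition of AUC (the probability that a randomly chosen fire sample receives a higher score than a randomly chosen non-fire sample). The labels and scores are hypothetical toy values, not data from the study, and `roc_auc`, `model_a`, and `model_b` are illustrative names only.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fire observed, 0 = no fire.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # scores from a stronger model
model_b = [0.6, 0.4, 0.5, 0.5, 0.6, 0.3]  # scores from a weaker model

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(f"AUC model A: {auc_a:.3f}, AUC model B: {auc_b:.3f}")
```

An AUC near 0.5 means the scores rank fire and non-fire samples little better than chance, which is how the Maxent values quoted above can be read against the RF values.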
Funding: The authors received funding from the Smart Challenge Fund (SR0218I100) and GPPS Grant VOT H404, from the Ministry of Science, Technology and Innovation Malaysia, and the Research Management Centre (RMC) of Universiti Tun Hussein Onn Malaysia (UTHM).
Abstract: Applying fault diagnosis techniques to twisted pair copper cable helps improve the stability and reliability of internet access in the Digital Subscriber Line (DSL) Access Network System. Network performance depends on the occurrence of cable faults along the copper cable. Currently, most telecommunication providers monitor network performance degradation and then troubleshoot the presence of a fault using commercial test gear on-site, a task that could instead be handled using data analytics and machine learning algorithms. This paper presents a fault diagnosis method for twisted pair cable fault detection based on knowledge-based and data-driven machine learning methods. The DSL Access Network is emulated in the laboratory to accommodate VDSL2 technology with various types of cable fault at cable distances between 100 m and 1200 m. Firstly, the line operation parameters and loop line testing parameters are collected and analyzed. Secondly, feature transformation, a knowledge-based method, is used to pre-process the fault data. Then, random forests (RFs), a data-driven method, are adopted to train the fault diagnosis classifier and regression algorithm with the processed fault data. Finally, the proposed fault diagnosis method is used to detect and locate cable faults in the DSL Access Network System. The results show that cable fault detection has an accuracy of more than 97%, and that the minimum absolute error in cable fault localization is less than 11%. The proposed algorithm may help telecommunication service providers initiate automated cable fault identification and troubleshooting in the DSL Access Network System.
Funding: Supported by the National Basic Research Program (973) of China (No. 2004CB720703).
Abstract: Cryogenic ground support equipment (CGSE) is an important part of the well-known particle physics experiment AMS-02. This paper presents a design method that optimizes the PID parameters of the CGSE control system via the particle swarm optimization (PSO) algorithm. Firstly, an improved version of the original PSO, cooperative random learning particle swarm optimization (CRPSO), is put forward to enhance the performance of the conventional PSO. Secondly, the way of finding the PID coefficients with this algorithm is studied. Finally, experimental results and practical work demonstrate that the CRPSO-PID controller achieves good performance.
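As a rough illustration of tuning PID gains with a swarm optimizer, the sketch below uses the conventional PSO update, not the CRPSO variant proposed in the paper, whose details are not given in the abstract. The plant is a hypothetical first-order system unrelated to the CGSE. The cost function (integral of absolute error for a unit step), the gain bounds, and all parameter values are assumptions chosen for the example.

```python
import random


def step_cost(gains, dt=0.01, steps=500, tau=1.0):
    """Integral of absolute error for a unit step on a toy first-order plant.

    Plant (assumed): tau * dy/dt = -y + u, integrated with forward Euler.
    """
    kp, ki, kd = gains
    y, integral, prev_err, cost = 0.0, 0.0, 1.0, 0.0
    for _ in range(steps):
        err = 1.0 - y
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * deriv
        y += dt * (-y + u) / tau
        prev_err = err
        if abs(y) > 1e6:  # treat a diverging response as infinitely bad
            return float("inf")
        cost += abs(err) * dt
    return cost


def pso(cost, bounds, n_particles=15, n_iters=40, seed=0, w=0.7, c1=1.5, c2=1.5):
    """Conventional global-best PSO with positions clamped to the bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


# Assumed search ranges for (Kp, Ki, Kd).
BOUNDS = [(0.0, 20.0), (0.0, 20.0), (0.0, 0.5)]
best_gains, best_cost = pso(step_cost, BOUNDS)
print("best gains:", [round(g, 3) for g in best_gains], "cost:", round(best_cost, 4))
```

The fixed seed keeps the run reproducible. In practice the cost function would be replaced by a simulation or measurement of the actual control loop.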
Funding: Mainly supported by the National Natural Science Foundation of China (Nos. 61125201, 61303070, and U1435219).
Abstract: Instance-specific algorithm selection technologies have been successfully used in many research fields, such as constraint satisfaction and planning. Researchers have increasingly tried to model the potential relations between different candidate algorithms for algorithm selection. In this study, we propose an instance-specific algorithm selection method based on multi-output learning, which can manage these relations more directly. Three kinds of multi-output learning methods are used to predict the performances of the candidate algorithms: (1) multi-output regressor stacking; (2) multi-output extremely randomized trees; and (3) hybrid single-output and multi-output trees. The experimental results obtained using 11 SAT datasets and 5 MaxSAT datasets indicate that our proposed methods can outperform state-of-the-art algorithm selection methods.
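The general idea of selecting an algorithm from a jointly predicted performance vector can be sketched as follows. This is only an illustration. It swaps in a simple multi-output k-nearest-neighbour regressor rather than any of the three methods named in the abstract, and the instance features, runtimes, and solver names are hypothetical.

```python
def predict_performance(train_X, train_Y, x, k=2):
    """Multi-output k-NN: average the whole performance vectors of the
    k training instances closest to x (squared Euclidean distance)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    nearest = order[:k]
    n_outputs = len(train_Y[0])
    return [sum(train_Y[i][j] for i in nearest) / len(nearest)
            for j in range(n_outputs)]


def select_algorithm(train_X, train_Y, x, names, k=2):
    """Pick the candidate with the lowest predicted runtime for instance x."""
    pred = predict_performance(train_X, train_Y, x, k)
    best = min(range(len(names)), key=lambda j: pred[j])
    return names[best], pred


# Hypothetical training data: two features per instance and the
# runtime (lower is better) of three candidate solvers on that instance.
train_X = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_Y = [[1.0, 5.0, 9.0], [2.0, 6.0, 8.0], [9.0, 4.0, 1.0], [8.0, 5.0, 2.0]]
names = ["solver_a", "solver_b", "solver_c"]

choice_1, pred_1 = select_algorithm(train_X, train_Y, [0.15, 0.15], names)
choice_2, pred_2 = select_algorithm(train_X, train_Y, [0.85, 0.85], names)
print(choice_1, pred_1)
print(choice_2, pred_2)
```

Because all outputs are predicted from the same neighbours, the relation between the candidates' runtimes on similar instances is preserved in the prediction, which is the property that multi-output learners exploit.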