Abstract: Machine learning models may outperform traditional statistical regression algorithms for predicting clinical outcomes. Proper validation when building such models and tuning their underlying algorithms is necessary to avoid over-fitting and poor generalizability, to which smaller datasets are more prone. To educate readers interested in artificial intelligence and in model building based on machine-learning algorithms, we outline important details of cross-validation techniques that can enhance the performance and generalizability of such models.
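As a rough illustration of the cross-validation idea mentioned in this abstract, the sketch below splits sample indices into k folds and averages a score over the k train/validation splits. The function names and the `fit_and_score` callback are hypothetical and not taken from the paper; any real model would be plugged in through that callback.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, fit_and_score, seed=0):
    """Average a score over k train/validation splits.

    fit_and_score(train_idx, val_idx) is a caller-supplied (hypothetical)
    callback that trains on train_idx and returns a score on val_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except fold i is used for training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(fit_and_score(train_idx, val_idx))
    return sum(scores) / k
```

Each sample is used for validation exactly once, so the averaged score is less sensitive to a single lucky or unlucky split than a one-off hold-out estimate, which matters most for the small datasets the abstract mentions.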
Funding: Supported by the cooperation between BIT and Ericsson; partially supported by the National Natural Science Foundation of China under Grant No. 62071039.
Abstract: This paper proposes a personalized head-related transfer function (HRTF) prediction method based on LightGBM using anthropometric data. Considering the overfitting problems of current training-based prediction methods, we use LightGBM and a specific network structure to prevent overfitting and enhance prediction performance. By decomposing and combining the data to be predicted, we set up 90 LightGBM models to separately predict the 90 instants of the HRTF in the log domain. At the same time, 10-fold cross-validation is used to score the accuracy of each model. For models with scores below 80 points, Bayesian optimization is used to adjust the model hyperparameters to obtain a better model structure. The results obtained by LightGBM are evaluated with spectral distortion (SD), which shows the fitting error between the prediction and the original data. The mean SD values of the two ears on the whole test set are 2.32 dB and 2.28 dB, respectively. Compared with the non-linear regression method and the latest method, the SD value of the LightGBM-based method decreases by 83.8% and 48.5%, respectively, in relative terms.
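The abstract does not state its SD formula. A commonly used definition is the root-mean-square of the log-magnitude ratio, in dB, between the reference and predicted responses over the evaluated points. The sketch below assumes that definition; the function and argument names are illustrative only.

```python
import math


def spectral_distortion(reference_mag, predicted_mag):
    """RMS log-magnitude error in dB between two magnitude responses.

    Assumes the common definition
        SD = sqrt(mean((20 * log10(|H| / |H_hat|)) ** 2)),
    which may differ in detail from the one used in the paper.
    Both inputs are sequences of strictly positive linear magnitudes.
    """
    if not reference_mag or len(reference_mag) != len(predicted_mag):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_db_errors = [
        (20.0 * math.log10(ref / pred)) ** 2
        for ref, pred in zip(reference_mag, predicted_mag)
    ]
    return math.sqrt(sum(squared_db_errors) / len(squared_db_errors))
```

Under this definition a perfect prediction gives 0 dB, and a prediction that is off by a factor of 10 in magnitude at every point gives 20 dB.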
Funding: Supported by the National Natural Science Foundation of China (61703131, 61703129, 61701148, 61703128).
Abstract: The basic idea of multi-class classification is decomposition: a multi-class classification task is split into several binary classification tasks. To improve the accuracy of multi-class classification when samples are insufficient, this paper proposes a multi-class classification method combining K-means and multi-task relationship learning (MTRL). The method first uses the One-vs.-Rest split to decompose the multi-class classification task into binary classification tasks. K-means is then used to down-sample the dataset of each task, which can prevent over-fitting of the model while reducing training costs. Finally, the sampled datasets are fed to MTRL, and multiple binary classifiers are trained together. With the help of MTRL, the method can exploit the inter-task associations during training and thereby improve the classification accuracy of each binary classifier. The effectiveness of the proposed approach is demonstrated by experimental results on the Iris, Wine, Multiple Features, Wireless Indoor Localization, and Avila datasets.
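The One-vs.-Rest split described in this abstract can be sketched as follows. This is only the decomposition and recombination step, under the usual convention that the predicted class is the one whose binary scorer is most confident. The K-means down-sampling and the MTRL training are not shown, and the helper names are hypothetical.

```python
def one_vs_rest_tasks(labels):
    """Build one binary label vector per class: 1 for the class, 0 for the rest."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def predict_from_scores(scores_by_class):
    """Combine per-class binary scores into multi-class predictions.

    scores_by_class maps each class to a list of scores (one per sample)
    produced by that class's binary classifier; the class with the
    highest score wins for each sample.
    """
    classes = list(scores_by_class)
    n_samples = len(scores_by_class[classes[0]])
    return [
        max(classes, key=lambda c: scores_by_class[c][i])
        for i in range(n_samples)
    ]
```

With three classes this yields three binary tasks over the same samples, which is what makes joint training of the binary classifiers possible in the first place.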
Funding: Financially supported by the Iran National Science Foundation (INSF) through research project No. 96004000; support from the GIS research group (Ton Duc Thang University) via the research project "GIS-based applications for solving real-world problems" is also acknowledged.
Abstract: Flash floods are responsible for loss of life and considerable property damage in many countries. Flood susceptibility maps contribute to flood risk reduction in areas that are prone to this hazard if appropriately used by land-use planners and emergency managers. The main objective of this study is to prepare an accurate flood susceptibility map for the Haraz watershed in Iran using a novel modeling approach (DBPGA) based on a Deep Belief Network (DBN) with the Back Propagation (BP) algorithm optimized by the Genetic Algorithm (GA). For this task, a database comprising ten conditioning factors and 194 flood locations was created using the One-R Attribute Evaluation (ORAE) technique. Various well-known machine learning and optimization algorithms were used as benchmarks to compare the prediction accuracy of the proposed model. Statistical metrics including sensitivity, specificity, accuracy, root mean square error (RMSE), and area under the receiver operating characteristic curve (AUC) were used to assess the validity of the proposed model. The results show that the proposed model has the highest goodness-of-fit (AUC = 0.989) and prediction accuracy (AUC = 0.985), and that on the validation dataset it outperforms benchmark models including LR (0.885), LMT (0.934), BLR (0.936), ADT (0.976), NBT (0.974), REPTree (0.811), ANFIS-BAT (0.944), ANFIS-CA (0.921), ANFIS-IWO (0.939), ANFIS-ICA (0.947), and ANFIS-FA (0.917). We conclude that the DBPGA model is an excellent alternative tool for predicting flash flood susceptibility in other regions prone to flash floods.
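For readers unfamiliar with the validation metrics listed in this abstract, the sketch below computes them for a binary flood / non-flood labelling using their standard textbook definitions. The function names are illustrative, and the paper's own tooling is not described in the abstract.

```python
import math


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for 0/1 labels.

    Assumes both classes occur in y_true; otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }


def rmse(y_true, scores):
    """Root mean square error between 0/1 labels and predicted scores."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(y_true, scores)) / len(y_true))


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the values reported above.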
Funding: Partially supported by NKRDP under Grant No. 2018YFA0704705 and the National Natural Science Foundation of China under Grant No. 12288201.
Abstract: In this paper, the L_(2,∞) normalization of the weight matrices is used to enhance the robustness and accuracy of deep neural networks (DNNs) with ReLU activation functions. It is shown that the L_(2,∞) normalization leads to large dihedral angles between adjacent faces of the DNN function graph and hence to smoother DNN functions, which reduces over-fitting of the DNN. A global measure is proposed for the robustness of a classification DNN, namely the average radius of the maximal robust spheres centered at the training samples. A lower bound for the robustness measure in terms of the L_(2,∞) norm is given. Finally, an upper bound for the Rademacher complexity of DNNs with L_(2,∞) normalization is given. An algorithm is given to train DNNs with the L_(2,∞) normalization, and numerical experiments show that the L_(2,∞) normalization is effective in improving robustness and accuracy.
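To make the norm concrete: the sketch below assumes the row-wise convention, in which the L_(2,∞) norm of a weight matrix is the largest Euclidean norm among its rows. It also shows one simple way to enforce a bound c by rescaling offending rows. Both the convention and the rescaling step are assumptions for illustration; the abstract does not describe the paper's training algorithm.

```python
import math


def l2_inf_norm(weight):
    """Largest Euclidean row norm of a matrix given as a list of rows.

    Assumes the row-wise convention for the L_(2,inf) norm.
    """
    return max(math.sqrt(sum(w * w for w in row)) for row in weight)


def rescale_rows(weight, c):
    """Return a copy of weight whose L_(2,inf) norm is at most c.

    Rows with Euclidean norm above c are scaled down to norm c;
    all other rows are left unchanged. This is an illustrative
    projection step, not the algorithm from the paper.
    """
    result = []
    for row in weight:
        norm = math.sqrt(sum(w * w for w in row))
        scale = c / norm if norm > c else 1.0
        result.append([w * scale for w in row])
    return result
```

For example, with rows (3, 4) and (0.6, 0.8) the norm is 5, and rescaling with c = 1 shrinks only the first row, leaving the second essentially untouched.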