Growing competition, economic recession, and financial crises have increased business failures, prompting researchers to develop new approaches that yield more accurate and reliable predictions. The classification and regression tree (CART) is one of the modeling techniques developed for this purpose. This study explains the CART method and tests its power to predict financial failure. CART is applied to data on industrial companies traded on the Istanbul Stock Exchange (ISE) between 1997 and 2007. The results show that CART has high predictive power for financial failure one, two, and three years prior to failure, and that profitability ratios are the most important ratios in predicting failure.
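As a rough illustration of how a CART classifier chooses a split, the sketch below searches one numeric feature for the threshold that minimizes the weighted Gini impurity of the two child nodes. The feature name `roa` (a profitability ratio), the `failed` labels, and all values are hypothetical and are not taken from the study; Gini impurity is the usual CART criterion for classification, but the abstract does not state which criterion the authors used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one feature.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    If no split improves on the unsplit node, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: return on assets and a failure flag (1 = failed).
roa = [-0.12, -0.05, -0.01, 0.03, 0.08, 0.15]
failed = [1, 1, 1, 0, 0, 0]
threshold, impurity = best_split(roa, failed)
print(threshold, impurity)
```

A full CART implementation would repeat this search over every feature, recurse into the child nodes, and then prune the tree; only the split search is shown here.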
Based on groundwater level monitoring data from the Shuping landslide in the Three Gorges Reservoir area, the factors influencing groundwater level were selected according to the response relationship between candidate factors, such as rainfall and reservoir level, and changes in groundwater level. A classification and regression tree (CART) model was then constructed from this subset of factors and used to predict the groundwater level. In verification, the predictions for the test sample were consistent with the measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For comparison, a support vector machine (SVM) model constructed with the same set of factors gave a mean absolute error of 1.53 m and a relative error of 6.11%. These results indicate that the CART model not only fits and generalizes better, but also has strong advantages in analyzing the dynamic characteristics of landslide groundwater and in screening important variables. It is an effective method for predicting groundwater level in landslides.
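The two error measures reported above can be computed as in the following minimal sketch. The abstract does not give its exact definition of relative error, so the mean of |measured - predicted| / |measured| is assumed here; the function names and the water-level values (in meters) are hypothetical.

```python
def mean_absolute_error(measured, predicted):
    """Average of |measured - predicted|, in the units of the data."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mean_relative_error(measured, predicted):
    """Average of |measured - predicted| / |measured| (assumed definition).

    Measured values must be non-zero.
    """
    return sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical groundwater levels in meters, not data from the study.
measured = [160.0, 165.0, 170.0, 175.0]
predicted = [160.5, 164.0, 170.5, 175.0]

mae = mean_absolute_error(measured, predicted)
mre = mean_relative_error(measured, predicted)
print(f"MAE = {mae:.2f} m, relative error = {mre:.2%}")
```

Computing both measures for each model on the same test sample gives a like-for-like comparison of the kind the abstract makes between CART and SVM.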
This paper presents a supervised learning algorithm for retinal vascular segmentation based on the classification and regression tree (CART) algorithm and improved adaptive boosting (AdaBoost). Local binary pattern (LBP) texture features and local features are extracted by extracting, inverting, dilating, and enhancing the green component of retinal images to construct a 17-dimensional feature vector. A dataset is constructed from the feature vectors and the labels manually marked by experts. The features are used to generate CART binary trees, which serve as the AdaBoost weak classifiers, and AdaBoost is improved by adding re-judgment functions to form a strong classifier. The proposed algorithm is evaluated on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset. The experimental results show that the proposed algorithm achieves higher segmentation accuracy for blood vessels and that the result contains essentially complete vessel detail. Moreover, the segmented vessel tree has good connectivity and largely reflects the distribution of the blood vessels. Compared with the traditional AdaBoost classification algorithm and a support vector machine (SVM) based classification algorithm, the proposed algorithm has a higher average accuracy and reliability index, and its results are similar to those of state-of-the-art segmentation algorithms.
Understanding the underlying structure of phylogenetic trees is important because it informs the choice of methods for phylogenetic inference. The methods used for a structured population differ from those needed when a population is not structured. In this paper, we compared two supervised machine learning techniques, artificial neural networks (ANN) and logistic regression, for predicting the underlying structure of phylogenetic trees. We tuned the parameters of both models to identify optimal models, then performed 10-fold cross-validation on the optimal logistic regression and ANN models. We also applied an unsupervised technique, clustering, to determine the number of clusters that could be identified from simulated phylogenetic trees. The trees came from both structured and non-structured populations. Clustering and classification were carried out using tree statistics such as the Colless, Sackin, and cophenetic indices, among others. The 10-fold cross-validation showed that logistic regression and ANN had comparable results, with both models achieving average accuracy above 0.75. Most of the clustering indices gave 2 or 3 as the optimal number of clusters.
The sub-pixel impervious surface percentage (SPIS) is the fraction of impervious surface area in one pixel, and it is an important indicator of urbanization. Using remote sensing data, the spatial distribution of SPIS values over large areas can be extracted, and these data are significant for studies of urban climate, environment, and hydrology. To develop a stable, multi-temporal SPIS estimation method suitable for typical temperate semi-arid climate zones with distinct seasons, an optimal model for estimating SPIS values within Beijing Municipality was built based on the classification and regression tree (CART) algorithm. First, models with different input variables for SPIS estimation were built by integrating multi-source remote sensing data with other auxiliary data. The optimal model was selected by analyzing and comparing the assessed accuracy of these models. Subsequently, multi-temporal SPIS mapping was carried out based on the optimal model. The results are as follows: 1) Multi-seasonal images and nighttime light (NTL) data are the optimal input variables for SPIS estimation within Beijing Municipality, where the intra-annual variability in vegetation is distinct. The multi-seasonal images effectively detect the different spectral characteristics of cultivated land caused by differences in farming practice and vegetation phenology. NTL data effectively reduce the misestimation caused by the spectral similarity between bare land and impervious surfaces. In testing, the SPIS modeling correlation coefficient (r) is approximately 0.86, the average error (AE) is approximately 12.8%, and the relative error (RE) is approximately 0.39. 2) The SPIS results were divided into areas with high-density impervious cover (70%–100%), medium-density impervious cover (40%–70%), low-density impervious cover (10%–40%), and natural cover (0%–10%). The SPIS model performed better in estimating values for high-density urban areas than for the other categories. 3) Multi-temporal SPIS mapping (1991–2016) was conducted based on the optimized SPIS results for 2005. In testing, AE ranges from 12.7% to 15.2%, RE ranges from 0.39 to 0.46, and r ranges from 0.81 to 0.86. This demonstrates that the proposed approach for estimating sub-pixel impervious surface by integrating the CART algorithm and multi-source remote sensing data is feasible and suitable for multi-temporal SPIS mapping of areas with distinct intra-annual variability in vegetation.
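The four cover categories in result 2) can be expressed as a simple lookup, sketched below. The abstract lists ranges that share endpoints (for example 70% appears in both the medium and high ranges), so assigning each shared boundary to the higher-density class is an assumption made here; the function name `spis_class` is hypothetical.

```python
def spis_class(spis_percent):
    """Map an SPIS value in percent (0-100) to the cover category named in the abstract.

    Assumption: a value on a shared boundary (10, 40, 70) goes to the denser class.
    """
    if not 0.0 <= spis_percent <= 100.0:
        raise ValueError("SPIS must be between 0 and 100 percent")
    if spis_percent >= 70.0:
        return "high-density impervious cover"
    if spis_percent >= 40.0:
        return "medium-density impervious cover"
    if spis_percent >= 10.0:
        return "low-density impervious cover"
    return "natural cover"


# Hypothetical pixel values in percent.
for value in (5.0, 25.0, 55.0, 85.0):
    print(value, "->", spis_class(value))
```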
Obstructive Sleep Apnea (OSA) is a respiratory syndrome caused by insufficient airflow through the respiratory tract or by respiratory arrest during sleep, and sometimes by reduced oxygen saturation. The aim of this paper is to analyze a person's respiratory signal to detect Normal Breathing Activity and Sleep Apnea (SA) activity. In the proposed method, time-domain and frequency-domain features are extracted from the respiration signal obtained from a PPG device. These features are fed to a Classification and Regression Tree (CART)-Particle Swarm Optimization (PSO) classifier, which classifies the signal as either a normal breathing signal or a sleep apnea signal. The proposed method is validated by measuring performance metrics such as sensitivity, specificity, accuracy, and F1 score, applying the time-domain and frequency-domain features separately. In addition, the performance of the CART-PSO (CPSO) classification algorithm is evaluated by comparing its measures with those of existing classification algorithms, and the effect of the PSO algorithm in the classifier is validated by varying the PSO parameters.
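The four performance metrics named above follow from the counts in a binary confusion matrix, as in this minimal sketch. Treating sleep apnea as the positive class is an assumption, and the counts are hypothetical rather than results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, and F1 score from confusion-matrix counts.

    Assumes every denominator is non-zero (each class occurs and is predicted at least once).
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts with sleep apnea as the positive class.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```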
Researchers in bioinformatics, biostatistics, and related fields seek biomarkers for many purposes, including risk assessment, disease diagnosis, and prognosis, all of which can be formulated as patient classification problems. This paper introduces a new method for biomarker data analysis that uses tree regression to improve a logistic classification model. The numerical results show that the linear logistic model can be significantly improved by a tree regression on the residuals. Although this research discusses the classification of binary responses, the idea extends readily to the classification of multinomial responses.
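Distributed photovoltaic (PV) generation is strongly affected by weather, so estimating the distributed PV hosting capacity of 110 kV power supply areas is of great significance for regional power supply. This paper therefore proposes a model based on the classification and regression tree (CART) for estimating the distributed PV hosting capacity of 110 kV power supply areas. Using data such as the output power of distributed generation, the regional share of distributed generation in electricity production, and the local increase in line losses due to distributed generation, the model is built with a CART decision tree, and an improved whale optimization algorithm is used to solve for the estimates. Experimental tests show that the model estimates distributed PV hosting capacity with high accuracy, can effectively estimate the hosting capacity of different test areas in different seasons, and has high application value.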
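To improve the accuracy of vegetation classification, the influence of training samples, terrain, and other factors must be considered when extracting vegetation information from hyperspectral images. Taking Changbai Mountain as the study area, a decision tree model based on the CART (Classification And Regression Tree) algorithm is built to classify vegetation in hyperspectral images. Because of the influence of mixed pixels, pure pixels extracted with the PPI (Pixel Purity Index) are used as training samples, and classification feature variables such as vegetation indices, texture, and terrain are extracted. A CART decision tree is built from these variables to classify the vegetation, and the results are compared with those of maximum likelihood classification. The results show that the CART decision tree method can effectively combine spectral, texture, and terrain features and achieves good classification performance.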
Funding (Shuping landslide groundwater level study): supported by the China Earthquake Administration, Institute of Seismology Foundation (IS201526246).
Funding (retinal vascular segmentation study): National Natural Science Foundation of China (No. 61163010).
Funding (SPIS estimation study): under the auspices of the National Natural Science Foundation of China (No. 41671339).