The main objective of this research is to determine the capacity of land cover classification combining spectral and textural features of Landsat TM imagery with ancillary geographical data in wetlands of the Sanjiang Plain, Heilongjiang Province, China. Semi-variograms and Z-test values were calculated to assess the separability of grey-level co-occurrence texture measures and to maximize the difference between land cover types. The degree of spatial autocorrelation showed that window sizes of 3×3 pixels and 11×11 pixels were most appropriate for Landsat TM image texture calculations. The texture analysis showed that co-occurrence entropy, dissimilarity, and variance texture measures derived from the Landsat TM spectral bands and vegetation indices provided the most significant statistical differentiation between land cover types. Subsequently, a Classification and Regression Tree (CART) algorithm was applied to three different combinations of predictors: 1) TM imagery alone (TM-only); 2) TM imagery plus image texture (TM+TXT model); and 3) all predictors including TM imagery, image texture and additional ancillary GIS information (TM+TXT+GIS model). Compared with traditional Maximum Likelihood Classification (MLC) supervised classification, the three classification-tree predictive models reduced the overall error rate significantly. Image texture measures and ancillary geographical variables suppressed speckle noise effectively and markedly reduced the classification error rate for marsh. For the classification-tree model making use of all available predictors, the omission error rate was 12.90% and the commission error rate was 10.99% for marsh.
The developed method is portable, relatively easy to implement and should be applicable in other settings and over larger extents.
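The workflow above (co-occurrence texture in a moving window, then a CART classifier on spectral + textural features) can be sketched as follows. The window size, grey-level count, and the two synthetic "land covers" are illustrative assumptions, not the paper's data.

```python
# Sketch: GLCM entropy texture in a moving window, fed with the window mean
# (a spectral stand-in) into a CART classifier.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def glcm_entropy(window, levels=8):
    """Entropy of the horizontal grey-level co-occurrence matrix of a 2-D window."""
    q = np.floor(window / (window.max() + 1e-9) * (levels - 1)).astype(int)
    glcm = np.zeros((levels, levels))
    for i in range(q.shape[0]):
        for j in range(q.shape[1] - 1):
            glcm[q[i, j], q[i, j + 1]] += 1   # count horizontal neighbour pairs
    p = glcm / max(glcm.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
water = rng.normal(50, 2, (40, 40))    # smooth, dark cover
marsh = rng.normal(120, 30, (40, 40))  # bright, speckled cover

def features(img, w=3):                # 3x3 windows, as in the abstract
    feats = []
    for i in range(0, img.shape[0] - w, w):
        for j in range(0, img.shape[1] - w, w):
            win = img[i:i + w, j:j + w]
            feats.append([win.mean(), glcm_entropy(win)])
    return feats

X = features(water) + features(marsh)
y = [0] * (len(X) // 2) + [1] * (len(X) - len(X) // 2)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.score(X, y))
```

On real imagery the texture band would be computed per pixel rather than per non-overlapping block; the block version keeps the sketch short.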
To deal with the low efficiency that arises when the conventional fuzzy class-association method repeatedly scans the classifier to classify new texts, a new approach based on the FCR-tree (fuzzy classification rules tree) for text categorization is proposed. The compactness of the FCR-tree saves significant space in storing a large set of rules when there are many repeated words in the rules. In comparison with classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction of an FCR-tree and its structure are different from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words not appearing in this text, thus reducing the number of traversed rules. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
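The key pruning idea, skipping any sub-tree led by a word absent from the text, can be sketched with a plain rule trie. The rule format and toy rules below are illustrative assumptions; the paper's fuzzy-set membership step is omitted for brevity.

```python
# Minimal rule trie: rules share word prefixes; classification prunes
# branches headed by words not present in the text.
class RuleTrie:
    def __init__(self):
        self.children = {}
        self.label = None              # class label if a rule ends here

    def insert(self, words, label):
        node = self
        for w in sorted(words):        # canonical order keeps shared prefixes merged
            node = node.children.setdefault(w, RuleTrie())
        node.label = label

    def classify(self, text_words):
        """Depth-first search, skipping sub-trees led by absent words."""
        matches = []
        def walk(node):
            if node.label is not None:
                matches.append(node.label)
            for w, child in node.children.items():
                if w in text_words:    # prune: absent word => skip whole branch
                    walk(child)
        walk(self)
        return matches

trie = RuleTrie()
trie.insert({"goal", "match"}, "sports")
trie.insert({"election", "vote"}, "politics")
trie.insert({"match", "vote"}, "politics")
print(trie.classify({"goal", "match", "referee"}))  # → ['sports']
```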
The diversity of tree species and the complexity of land use in cities create challenging issues for tree species classification. The combination of deep learning methods and RGB optical images obtained by unmanned aerial vehicles (UAVs) provides a new research direction for urban tree species classification. We propose an RGB optical image dataset with 10 urban tree species, termed TCC10, as a benchmark for tree canopy classification (TCC). The TCC10 dataset contains two types of data: tree canopy images with simple backgrounds and those with complex backgrounds. The objective was to examine the possibility of using deep learning methods (AlexNet, VGG-16, and ResNet-50) for individual tree species classification. The results of convolutional neural networks (CNNs) were compared with those of K-nearest neighbor (KNN) and a BP neural network. Our results demonstrate that: (1) ResNet-50 achieved an overall accuracy (OA) of 92.6% and a kappa coefficient of 0.91 for tree species classification on TCC10 and outperformed AlexNet and VGG-16. (2) The classification accuracy of KNN and the BP neural network was less than 70%, while the accuracy of the CNNs was considerably higher. (3) The classification accuracy for tree canopy images with complex backgrounds was lower than that for images with simple backgrounds. For the deciduous tree species in TCC10, the classification accuracy of ResNet-50 was higher in summer than in autumn. Deep learning is therefore effective for urban tree species classification using RGB optical images.
Based on groundwater level monitoring data for the Shuping landslide in the Three Gorges Reservoir area, and on the response relationship between influential factors such as rainfall and reservoir level and the change of groundwater level, the influential factors of groundwater level were selected. A classification and regression tree (CART) model was then constructed from this subset and used to predict the groundwater level. On verification, the predictions for the test sample were consistent with the actually measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For comparison, a support vector machine (SVM) model constructed using the same set of factors gave a mean absolute error of 1.53 m and a relative error of 6.11%. This indicates that the CART model not only has better fitting and generalization ability, but also strong advantages in the analysis of landslide groundwater dynamics and the screening of important variables. It is an effective method for predicting groundwater levels in landslides.
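The CART regression step and the two error measures quoted above can be sketched as follows. The synthetic rainfall/reservoir response is an assumed stand-in for the Shuping monitoring data.

```python
# Sketch: CART regression of groundwater level on rainfall and reservoir level,
# evaluated with mean absolute error and mean relative error.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
rain = rng.uniform(0, 100, 300)          # daily rainfall, mm (synthetic)
reservoir = rng.uniform(145, 175, 300)   # reservoir level, m (synthetic)
gwl = 0.02 * rain + 0.8 * reservoir + rng.normal(0, 0.3, 300)  # assumed response

X = np.column_stack([rain, reservoir])
model = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X[:200], gwl[:200])
pred = model.predict(X[200:])
mae = mean_absolute_error(gwl[200:], pred)
rel = np.mean(np.abs(pred - gwl[200:]) / gwl[200:])
print(round(mae, 2), round(rel * 100, 2))
```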
Although airborne hyperspectral data with detailed spatial and spectral information have demonstrated significant potential for tree species classification, they have not been widely used over large areas. A comprehensive process based on multi-flightline airborne hyperspectral data is lacking for large forested areas influenced by both bidirectional reflectance distribution function (BRDF) effects and cloud shadow contamination. In this study, hyperspectral data were collected over the Mengjiagang Forest Farm in Northeast China in the summer of 2017 using the Chinese Academy of Forestry's LiDAR, CCD, and hyperspectral systems (CAF-LiCHy). After BRDF correction and cloud shadow detection, a tree species classification workflow was developed for sunlit and cloud-shaded forest areas with input features of minimum-noise-fraction reduced bands, spectral vegetation indices, and texture information. Results indicate that BRDF-corrected sunlit hyperspectral data can provide stable, high classification accuracy given representative training data. Cloud-shaded pixels also have good spectral separability for species classification. The red-edge spectral information and ratio-based spectral indices with high importance scores are recommended as input features for species classification under varying light conditions. According to classification accuracies assessed against field survey data at multiple spatial scales, species classification over an extensive forest area using airborne hyperspectral data under various illuminations can be carried out successfully using the proposed radiometric consistency process and feature selection strategy.
Increased competition, economic recession and financial crises have increased business failure, and in response researchers have attempted to develop new approaches that can yield more correct and more reliable results. The classification and regression tree (CART) is one of the modeling techniques developed for this purpose. In this study, the classification and regression trees method is explained and its power for predicting financial failure is tested. CART is applied to data on industrial companies traded on the Istanbul Stock Exchange (ISE) between 1997 and 2007. The study finds that CART has high power for predicting financial failure one, two and three years prior to failure, with profitability ratios being the most important ratios in the prediction of failure.
The trend toward designing an intelligent distribution system based on students' individual differences and needs has taken precedence over the traditional dormitory distribution system, which neglects students' personality traits, causes dormitory disputes, and affects students' quality of life and academic performance. This paper collects freshmen's data on personal preferences and conducts a classification comparison. It uses a decision tree classification algorithm based on the information gain principle as the core algorithm of dormitory allocation, determines description rules for students' personal preferences and decision tree classification preferences, and completes the conceptual design of the database of entity relations and data dictionaries. The design meets students' personality-based classification requirements for the dormitory and lays the foundation for an intelligent dormitory allocation system.
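The information gain principle named above can be shown directly. The toy preference attribute and roommate groups are assumed examples, not the paper's survey data.

```python
# Information gain = entropy of the labels minus the expected entropy
# after splitting on an attribute (the quantity a decision tree maximizes).
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(attribute, labels):
    """Entropy reduction from splitting `labels` by `attribute` values."""
    n = len(labels)
    split = {}
    for a, y in zip(attribute, labels):
        split.setdefault(a, []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in split.values())
    return entropy(labels) - remainder

# Toy data: sleep habit vs. compatible roommate group.
sleep = ["early", "early", "late", "late", "late", "early"]
group = ["A", "A", "B", "B", "B", "A"]
print(information_gain(sleep, group))  # perfectly informative split → 1.0
```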
A machine-learning approach was developed for automated building of knowledge bases for soil resources mapping, using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than using the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge base built by the machine-learning method was of good quality for mapping the distribution of soil classes over the study area.
[Objective] This study aimed to improve the accuracy of remote sensing classification for the Dongting Lake wetland. [Method] Based on TM data and ground GIS information for Dongting Lake, a decision tree classification method was established through an expert classification knowledge base. The images of the Dongting Lake wetland were classified into water area, mudflat, protection forest beach, Carem spp. beach, Phragmites beach, Carex beach and other water body according to the decision tree layers. [Result] The accuracy of the decision tree classification reached 80.29%, much higher than the traditional method, and the total Kappa coefficient was 0.8839, indicating that the accuracy of this method can fulfill the requirements of actual practice. In addition, the knowledge-based image classification results could resolve some classification mistakes. [Conclusion] Compared with the traditional method, rule-based decision tree classification can classify the images using multiple conditions, which reduces data processing time and improves classification accuracy.
With the increasing interest in e-commerce shopping, customer reviews have become one of the most important elements that determine customer satisfaction with products. This demonstrates the importance of working with text mining. This study is based on the Women's Clothing E-Commerce Reviews database, which consists of reviews written by real customers. The aim of this paper is to conduct a text mining approach on a set of customer reviews. Each review was classified as either positive or negative by employing a classification method. Four tree-based methods were applied to solve the classification problem: Classification Tree, Random Forest, Gradient Boosting and XGBoost. The dataset was split into training and test sets. The results indicate that the Random Forest method overfits, XGBoost overfits if the number of trees is too high, the Classification Tree is good at detecting negative reviews but bad at detecting positive reviews, and Gradient Boosting shows stable values and quality measures above 77% for the test dataset. A consensus between the applied methods is noted for important classification terms.
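The tree-based comparison pipeline can be sketched as below, with scikit-learn models standing in for three of the four methods (XGBoost is a separate library and is omitted here). The tiny corpus and labels are invented stand-ins for the Women's Clothing E-Commerce Reviews data.

```python
# Sketch: bag-of-words features + three tree-based classifiers on review text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

reviews = ["love this dress", "poor quality fabric", "fits great love it",
           "terrible poor stitching", "great color love", "poor fit terrible"] * 10
labels = [1, 0, 1, 0, 1, 0] * 10   # 1 = positive, 0 = negative

X = CountVectorizer().fit_transform(reviews).toarray()
scores = {}
for name, clf in [("tree", DecisionTreeClassifier(random_state=0)),
                  ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
                  ("boosting", GradientBoostingClassifier(random_state=0))]:
    clf.fit(X, labels)
    scores[name] = clf.score(X, labels)   # training accuracy on the toy corpus
print(scores)
```

A real run would evaluate on a held-out test split, as the study does, rather than on the training data.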
This paper presents a supervised learning algorithm for retinal vascular segmentation based on the classification and regression tree (CART) algorithm and an improved adaptive boosting (AdaBoost). Local binary pattern (LBP) texture features and local features are extracted by extracting, reversing, dilating and enhancing the green components of retinal images to construct a 17-dimensional feature vector. A dataset is constructed from the feature vectors and data manually marked by experts. The features are used to generate CART binary trees, where each CART binary tree serves as an AdaBoost weak classifier, and AdaBoost is improved by adding re-judgment functions to form a strong classifier. The proposed algorithm is evaluated on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset. The experimental results show that the proposed algorithm has higher segmentation accuracy for blood vessels, and the result contains essentially complete blood vessel details. Moreover, the segmented blood vessel tree has good connectivity, reflecting the distribution of the blood vessels. Compared with the traditional AdaBoost classification algorithm and a support vector machine (SVM) based classification algorithm, the proposed algorithm has higher average accuracy and reliability index, comparable to the segmentation results of state-of-the-art algorithms.
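The CART-as-weak-learner setup can be sketched with scikit-learn's AdaBoost, whose default weak learner is a depth-1 CART stump. The 17-feature dataset is synthetic, and the paper's re-judgment functions are omitted.

```python
# Sketch: AdaBoost boosting shallow CART trees on a synthetic 17-D problem
# (stand-in for the paper's 17-dimensional retinal feature vectors).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=400, n_features=17, random_state=0)
# scikit-learn's default weak learner here is DecisionTreeClassifier(max_depth=1),
# i.e. a CART stump, matching the CART-as-weak-classifier idea.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(X[:300], y[:300])
acc = ada.score(X[300:], y[300:])
print(round(acc, 2))
```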
Antarctic sea ice is an important part of the Earth's atmospheric system, and satellite remote sensing is an important technology for observing it. Whether Chinese Haiyang-2B (HY-2B) satellite altimeter data can be used to estimate sea ice freeboard and provide alternative Antarctic sea ice thickness information with high precision over a long time series, as other radar altimetry satellites do, needs further investigation. This paper proposes an algorithm to discriminate leads and then retrieve sea ice freeboard and thickness from HY-2B radar altimeter data. We first collected the Moderate-resolution Imaging Spectroradiometer ice surface temperature (IST) product from the National Aeronautics and Space Administration to extract leads in Antarctic waters and verified their accuracy using Sentinel-1 Synthetic Aperture Radar images. Second, a surface classification decision tree was generated for HY-2B satellite altimeter measurements of Antarctic waters to extract leads and calculate local sea surface heights. We then estimated the Antarctic sea ice freeboard and thickness based on the local sea surface heights and the static equilibrium equation. Finally, the retrieved HY-2B Antarctic sea ice thickness was compared with the CryoSat-2 sea ice thickness and the Antarctic Sea Ice Processes and Climate (ASPeCt) ship-based observations of sea ice thickness. The results indicate that the classification decision tree constructed for HY-2B satellite altimeter measurements is reasonable, and the root mean square error of the obtained sea ice thickness relative to the ship measurements is 0.62 m. The proposed sea ice thickness algorithm for the HY-2B radar satellite fills a gap in this application domain for the HY-series satellites and complements existing Antarctic sea ice thickness products; it can provide long-time-series and large-scale sea ice thickness data that contribute to research on global climate change.
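The static equilibrium step converts freeboard to thickness via hydrostatic balance. A minimal sketch, with typical density values that are assumptions here (the paper's actual densities are not given in the abstract):

```python
# Hydrostatic equilibrium: rho_i*T + rho_s*h_s = rho_w*(T - F),
# solved for total ice thickness T given ice freeboard F and snow depth h_s.
def ice_thickness(freeboard, snow_depth,
                  rho_w=1024.0,   # sea water density, kg/m^3 (assumed typical value)
                  rho_i=915.0,    # sea ice density, kg/m^3 (assumed typical value)
                  rho_s=300.0):   # snow density, kg/m^3 (assumed typical value)
    return (rho_w * freeboard + rho_s * snow_depth) / (rho_w - rho_i)

# 0.15 m ice freeboard with 0.10 m of snow:
print(round(ice_thickness(0.15, 0.10), 2))  # → 1.68
```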
We built a classification tree (CT) model to identify the climatic factors controlling the distribution of cold temperate coniferous forests (CTCFs) in Yunnan province and to predict their potential habitats under current and future climates, using seven climate change scenarios projected over the years 2070-2099. The accurate CT model for CTCFs showed that minimum temperature of the coldest month (TMW) was the overwhelmingly dominant factor among the six climate variables. Areas with TMW < -4.05℃ were suitable habitats for CTCFs, and areas with TMW > -1.35℃ were non-habitats, where temperate conifer and broad-leaved mixed forests (TCBLFs) are distributed at lower elevations bordering the CTCFs. The dominant species of Abies, Picea, and Larix in the CTCFs are more tolerant of winter coldness than Tsuga and the broad-leaved trees, including deciduous broad-leaved Acer and Betula and evergreen broad-leaved Cyclobalanopsis and Lithocarpus, in TCBLFs. Winter coldness may limit the cool-side distributions of TCBLFs in areas between -1.35℃ and -4.05℃, while the warm-side distributions of CTCFs may be controlled by competition with TCBLF species. Under future climate scenarios, the vulnerable area, where current potential (suitable + marginal) habitats (80,749 km²) shift to non-habitats, was predicted to amount to 55.91% (45,053 km²) of the current area. Inferring from the current vegetation distribution pattern, TCBLFs will replace declining CTCFs. The vulnerable areas predicted by the models are important for determining priorities in ecosystem conservation.
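The fitted tree effectively reduces to two TMW thresholds, which can be written out as a rule. The threshold values come from the abstract; the class names are paraphrased.

```python
# The CT model's habitat rule for cold temperate coniferous forests (CTCF),
# expressed as the two TMW cut-points reported in the abstract.
def ctcf_habitat(tmw_celsius):
    if tmw_celsius < -4.05:
        return "suitable"        # cold enough for CTCF
    if tmw_celsius < -1.35:
        return "marginal"        # transition zone shared with TCBLF species
    return "non-habitat"         # TCBLFs dominate above -1.35 degrees C

print([ctcf_habitat(t) for t in (-6.0, -2.5, 0.0)])
# → ['suitable', 'marginal', 'non-habitat']
```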
In enterprise operations, maintaining manual rules for enterprise processes can be expensive, time-consuming, and dependent on specialized domain knowledge. Recently, rule generation has been automated in enterprises, particularly through machine learning, to streamline routine tasks. Typically, these machine models are black boxes in which the reasons for decisions are not transparent, and end users need to verify the model's proposals as part of user acceptance testing in order to trust them. In such scenarios, rules excel over machine learning models because end users can verify the rules and have more trust. In many scenarios the truth label changes frequently, making it difficult for a machine learning model to learn until a considerable amount of data has accumulated, whereas rules can be adapted directly. This paper presents a novel framework for generating human-understandable rules using the Classification and Regression Tree (CART) decision tree method, which ensures both optimization and user trust in automated decision-making processes. The framework generates comprehensible rules in if-then form that predict a class even in domains where noise is present. The proposed system transforms enterprise operations by automating the production of human-readable rules from structured data, resulting in increased efficiency and transparency. Removing the need for manual rule construction saves time and money while guaranteeing that users can readily check and trust the system's automatic judgments. The framework's performance metrics, 99.85% accuracy and 96.30% precision, further support its efficiency in translating complex data into comprehensible rules, ultimately empowering users and enhancing organizational decision-making processes.
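Extracting human-readable if-then rules from a fitted CART can be sketched with scikit-learn's `export_text`. The toy approval data and feature names below are assumed examples, not the paper's enterprise data.

```python
# Fit a small CART and print its decision paths as readable if-then rules.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[20, 0], [45, 1], [30, 1], [60, 0], [25, 1], [50, 1]]
y = ["reject", "approve", "approve", "reject", "approve", "approve"]
feature_names = ["age", "verified"]   # hypothetical enterprise attributes

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
rules = export_text(tree, feature_names=feature_names)
print(rules)   # e.g. "|--- verified <= 0.50" ... "|--- class: reject"
```

Each root-to-leaf path in the printout is one verifiable rule, which is the transparency argument made above.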
The accurate prediction of poverty is critical to poverty reduction efforts, and high-resolution remote sensing (HRRS) data have shown great promise for facilitating such prediction. Accordingly, the present study used HRRS data with 1 m resolution and data from 238 households to evaluate the utility and optimal scale of HRRS data for predicting household poverty in a grassland region of Inner Mongolia, China. The prediction of household poverty was improved by using remote sensing indicators at multiple scales instead of a single scale, and a model combining indicators from four scales (building land, household, neighborhood, and regional) provided the most accurate prediction, with testing and training accuracies of 48.57% and 70.83%, respectively. Furthermore, building area was the most efficient indicator of household poverty. Compared to conducting household surveys, the analysis of HRRS data is a cheaper and more time-efficient method for predicting household poverty; in this case study, it reduced study time and cost by about 75% and 90%, respectively. This study provides the first evaluation of HRRS data for the prediction of household poverty in pastoral areas and thus provides technical support for the identification of poverty in pastoral areas around the world.
Here, we demonstrate the application of the Decision Tree Classification (DTC) method for lithological mapping from multi-spectral satellite imagery. The area of investigation is Lake Magadi in the East African Rift Valley in Kenya. The work involves the collection of rock and soil samples in the field, their analysis using reflectance and emittance spectroscopy, and the processing and interpretation of Advanced Spaceborne Thermal Emission and Reflection Radiometer data through the DTC method. The DTC method is strictly non-parametric, flexible and simple, requiring no assumptions about the distributions of the input data, and has been used successfully in a wide range of classification problems. It successfully mapped the chert and trachyte series rocks, including clay minerals and evaporites of the area, with high overall accuracy (86%). The high classification accuracies of the developed decision tree suggest its ability to adapt to the noise and nonlinear relations often observed in surface materials in spaceborne spectral image data without making assumptions about the distribution of the input data. Moreover, the present work found the DTC method useful for accurately mapping lithological variations in vast rugged terrain, which is inherently subject to different sources of noise even after considerable radiance and atmospheric correction.
Asphaltenes have always been an attractive subject for researchers. However, the application of this fraction in the geochemical field has been studied only in a limited way. In other words, despite many studies on asphaltene structure, the application of asphaltene structures in organic geochemistry has not so far been assessed. Oil-oil correlation is a well-known concept in geochemical studies and plays a vital role in basin modeling and the reconstruction of the burial history of basin sediments, as well as accurate characterization of the relevant petroleum system. This study proposes the X-ray diffraction (XRD) technique as a novel method for oil-oil correlation and investigates its reliability and accuracy for different crude oils. To this end, 13 crude oil samples from the Iranian sector of the Persian Gulf region, which had previously been correlated into four distinct genetic groups by traditional geochemical tools such as biomarker ratios and isotope values, were selected and their asphaltene fractions analyzed by two prevalent methods, XRD and Fourier-transform infrared spectroscopy (FTIR). For the oil-oil correlation assessment, various cross-plots as well as principal component analysis (PCA) were employed, based on the structural parameters of the studied asphaltenes. The results indicate that asphaltene structural parameters can also be used for oil-oil correlation, their results being completely in accord with the previous classifications. The average distance between saturated portions (d_r) and distance between two aromatic layers (d_m) of the asphaltene molecules in the studied oil samples are 4.69 Å and 3.54 Å, respectively. Furthermore, the average diameter of the aromatic sheets (L_a), height of the clusters (L_c), number of carbons per aromatic unit (C_au), number of aromatic rings per layer (R_a), number of sheets in the cluster (M_e) and aromaticity (f_a) of these asphaltene samples are 10.09 Å, 34.04 Å, 17.42, 3.78, 10.61 and 0.26, respectively. The XRD parameters indicate that plots of d_r vs. d_m, d_r vs. M_e, d_r vs. f_a, d_m vs. L_c, L_c vs. L_a, and f_a vs. L_a perform well for distinguishing genetic groups. A comparison between the XRD and FTIR results indicated that the XRD method is more accurate for this purpose. In addition, decision tree classification, one of the most efficacious machine learning approaches, was employed for the geochemical groups of this study for the first time. This tree, constructed using the XRD data, can distinguish the genetic groups accurately and can also determine the characteristics of each geochemical group. In conclusion, obtaining structural parameters of asphaltene by the XRD technique is a novel, precise and inexpensive method that can be deployed as a new approach for oil-oil correlation. The findings of this study can help in the prompt determination of genetic groups as a screening method and can also be useful for assessing oil samples affected by secondary processes.
BACKGROUND: Liver disease refers to any pathology that can harm or destroy the liver or prevent it from functioning normally. The global community has recently witnessed an increase in the mortality rate due to liver disease. This could be attributed to many factors, including human habits, awareness issues, poor healthcare, and late detection. To curb the growing threat from liver disease, early detection is critical to reducing risks and improving treatment outcomes. Emerging technologies such as machine learning, as shown in this study, can be deployed to assist in enhancing prediction and treatment.
AIM: To present a more efficient system for timely prediction of liver disease using a hybrid eXtreme Gradient Boosting model with hyperparameter tuning, with a view to assisting in early detection and diagnosis and reducing the risks and mortality associated with the disease.
METHODS: The dataset used in this study consisted of 416 people with liver problems and 167 without such a history. The data were collected from the state of Andhra Pradesh, India, through https://www.kaggle.com/datasets/uciml/indian-liver-patient-records. The population was divided into two sets depending on the disease state of the patient, recorded in the binary attribute "is_patient".
RESULTS: The chi-square automated interaction detection and the classification and regression tree models achieved accuracies of 71.36% and 73.24%, respectively, much better than the conventional method. The proposed solution would assist patients and physicians in tackling liver disease, ensuring that cases are detected early to prevent progression to cirrhosis (scarring) and to enhance patient survival. The study showed the potential of machine learning in health care, especially for disease prediction and monitoring.
CONCLUSION: This study contributes to the knowledge of machine learning applications in health and to efforts toward combating liver disease. However, relevant authorities must invest more in machine learning research and other health technologies to maximize their potential.
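The boosting-with-hyperparameter-tuning idea can be sketched with scikit-learn's `GradientBoostingClassifier` and `GridSearchCV` standing in for XGBoost (the study's actual model). The features, labels, and parameter grid below are synthetic assumptions.

```python
# Sketch: grid-searched gradient boosting on a synthetic binary "is_patient" task.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))                  # stand-ins for clinical measurements
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic disease label

grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [2, 3]},  # illustrative grid
    cv=3)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 2))
```

With XGBoost installed, `XGBClassifier` could be dropped into the same `GridSearchCV` call unchanged.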
The contribution of this paper is a comparison of three popular machine learning methods for software fault prediction: classification tree, neural network and case-based reasoning. First, three different classifiers are built based on these three approaches. Second, the three classifiers use the same product metrics as predictor variables to identify fault-prone components. Third, the prediction results are compared on two aspects: how good the prediction capabilities of the models are, and how well the models support understanding of the process represented by the data.
The sub-pixel impervious surface percentage (SPIS) is the fraction of impervious surface area in one pixel, and it is an important indicator of urbanization. Using remote sensing data, the spatial distribution of SPIS values over large areas can be extracted, and these data are significant for studies of urban climate, environment and hydrology. To develop a stable, multi-temporal SPIS estimation method suitable for typical temperate semi-arid climate zones with distinct seasons, an optimal model for estimating SPIS values within Beijing Municipality was built based on the classification and regression tree (CART) algorithm. First, models with different input variables for SPIS estimation were built by integrating multi-source remote sensing data with other auxiliary data, and the optimal model was selected through analysis and comparison of the assessed accuracy of these models. Subsequently, multi-temporal SPIS mapping was carried out based on the optimal model. The results are as follows: 1) Multi-seasonal images and nighttime light (NTL) data are the optimal input variables for SPIS estimation within Beijing Municipality, where the intra-annual variability in vegetation is distinct. The different spectral characteristics of cultivated land caused by differing farming practices and vegetation phenology can be detected effectively by the multi-seasonal images, and NTL data can effectively reduce misestimation caused by the spectral similarity between bare land and impervious surfaces. After testing, the SPIS modeling correlation coefficient (r) is approximately 0.86, the average error (AE) is approximately 12.8%, and the relative error (RE) is approximately 0.39. 2) The SPIS results were divided into areas with high-density impervious cover (70%–100%), medium-density impervious cover (40%–70%), low-density impervious cover (10%–40%) and natural cover (0%–10%). The SPIS model performed better in estimating values for high-density urban areas than for the other categories. 3) Multi-temporal SPIS mapping (1991–2016) was conducted based on the optimized SPIS results for 2005. After testing, AE ranges from 12.7% to 15.2%, RE ranges from 0.39 to 0.46, and r ranges from 0.81 to 0.86. This demonstrates that the proposed approach for estimating sub-pixel impervious surface by integrating the CART algorithm and multi-source remote sensing data is feasible and suitable for multi-temporal SPIS mapping of areas with distinct intra-annual variability in vegetation.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 40871188) and the National Key Technologies R&D Program of China (No. 2006BAD23B03)
Abstract: The main objective of this research is to determine the capacity of land cover classification combining spectral and textural features of Landsat TM imagery with ancillary geographical data in wetlands of the Sanjiang Plain, Heilongjiang Province, China. Semi-variograms and Z-test values were calculated to assess the separability of grey-level co-occurrence texture measures and to maximize the difference between land cover types. The degree of spatial autocorrelation showed that window sizes of 3×3 pixels and 11×11 pixels were most appropriate for Landsat TM image texture calculations. The texture analysis showed that the co-occurrence entropy, dissimilarity, and variance texture measures, derived from the Landsat TM spectral bands and vegetation indices, provided the most significant statistical differentiation between land cover types. Subsequently, a Classification and Regression Tree (CART) algorithm was applied to three different combinations of predictors: 1) TM imagery alone (TM-only); 2) TM imagery plus image texture (TM+TXT model); and 3) all predictors, including TM imagery, image texture and additional ancillary GIS information (TM+TXT+GIS model). Compared with traditional Maximum Likelihood Classification (MLC) supervised classification, the three classification-tree predictive models reduced the overall error rate significantly. Image texture measures and ancillary geographical variables suppressed speckle noise effectively and markedly reduced the classification error rate for marsh. For the classification-tree model using all available predictors, the omission error rate for marsh was 12.90% and the commission error rate was 10.99%. The developed method is portable, relatively easy to implement and should be applicable in other settings and over larger extents.
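The grey-level co-occurrence texture measures named in this abstract (entropy, dissimilarity, variance) reduce to a few lines of NumPy. The sketch below is illustrative only, not the paper's implementation: the quantisation to a few grey levels, the single pixel offset, and the whole-array computation are simplifying assumptions (the study computes such measures per moving window of 3×3 or 11×11 pixels).

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    """Symmetric, normalised grey-level co-occurrence matrix for one offset."""
    P = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                P[img[y, x], img[y2, x2]] += 1
                P[img[y2, x2], img[y, x]] += 1  # symmetric counting
    return P / P.sum()

def texture_measures(P):
    """Entropy (natural log), dissimilarity and variance of a GLCM."""
    i, j = np.indices(P.shape)
    nz = P[P > 0]
    entropy = -(nz * np.log(nz)).sum()
    dissimilarity = (P * np.abs(i - j)).sum()
    mu = (P * i).sum()
    variance = (P * (i - mu) ** 2).sum()
    return entropy, dissimilarity, variance
```

A uniform window gives zero for all three measures, while a checkerboard maximises dissimilarity, which is the separability behaviour the paper exploits.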
Funding: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), and the Research Plan of the Education Office of Hebei Province (No. 2004406)
Abstract: To deal with the low efficiency of the conventional fuzzy class-association method, which repeatedly scans the classifier when classifying new texts, a new approach based on the FCR-tree (fuzzy classification rules tree) for text categorization is proposed. The compactness of the FCR-tree saves significant space when storing a large rule set in which many words are repeated across rules. In contrast to ordinary classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of those words in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is unnecessary to search the paths of sub-trees led by words not appearing in that text, thus reducing the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
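The two ideas the abstract relies on, shared rule prefixes stored once and pruning of sub-trees led by absent words, can be illustrated with a plain prefix tree. This is a simplified crisp sketch (the hypothetical `RuleTree` class below omits the fuzzy sets and word frequencies that distinguish the actual FCR-tree):

```python
class RuleTree:
    """Prefix tree over sorted rule terms; shared prefixes are stored once."""
    def __init__(self):
        self.root = {}

    def insert(self, words, label):
        node = self.root
        for w in sorted(words):          # canonical order yields shared prefixes
            node = node.setdefault(w, {})
        node["_label"] = label

    def classify(self, text_words):
        """Depth-first search that skips branches whose word is absent."""
        hits, stack, ws = [], [self.root], set(text_words)
        while stack:
            node = stack.pop()
            if "_label" in node:
                hits.append(node["_label"])
            for w, child in node.items():
                if w != "_label" and w in ws:
                    stack.append(child)
        return hits
```

Because a branch is entered only when its word occurs in the text, rules sharing an absent prefix are never traversed, which is the efficiency gain the abstract describes.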
Funding: Supported by the Joint Fund of the Natural Science Foundation of Zhejiang-Qingshanhu Science and Technology City (Grant No. LQY18C160002), the National Natural Science Foundation of China (Grant No. U1809208), the Zhejiang Science and Technology Key R&D Program (Grant No. 2018C02013), and the Natural Science Foundation of Zhejiang Province (Grant No. LQ20F020005).
Abstract: The diversity of tree species and the complexity of land use in cities create challenging issues for tree species classification. The combination of deep learning methods and RGB optical images obtained by unmanned aerial vehicles (UAVs) provides a new research direction for urban tree species classification. We propose an RGB optical image dataset with 10 urban tree species, termed TCC10, which is a benchmark for tree canopy classification (TCC). The TCC10 dataset contains two types of data: tree canopy images with simple backgrounds and those with complex backgrounds. The objective was to examine the possibility of using deep learning methods (AlexNet, VGG-16, and ResNet-50) for individual tree species classification. The results of convolutional neural networks (CNNs) were compared with those of a K-nearest neighbor (KNN) classifier and a BP neural network. Our results demonstrated: (1) ResNet-50 achieved an overall accuracy (OA) of 92.6% and a kappa coefficient of 0.91 for tree species classification on TCC10 and outperformed AlexNet and VGG-16. (2) The classification accuracy of KNN and the BP neural network was less than 70%, while the accuracy of the CNNs was considerably higher. (3) The classification accuracy for tree canopy images with complex backgrounds was lower than that for images with simple backgrounds. For the deciduous tree species in TCC10, the classification accuracy of ResNet-50 was higher in summer than in autumn. Therefore, deep learning is effective for urban tree species classification using RGB optical images.
Funding: Supported by the China Earthquake Administration, Institute of Seismology Foundation (IS201526246)
Abstract: Based on groundwater level monitoring data for the Shuping landslide in the Three Gorges Reservoir area, and on the response relationship between the groundwater level and influential factors such as rainfall and reservoir level, the influential factors of the groundwater level were selected. A classification and regression tree (CART) model was then constructed from this subset and used to predict the groundwater level. On verification, the predictions for the test sample were consistent with the actually measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For comparison, a support vector machine (SVM) model constructed using the same set of factors yielded a mean absolute error of 1.53 m and a relative error of 6.11%. This indicates that the CART model not only has better fitting and generalization ability, but also strong advantages in analyzing the dynamic characteristics of landslide groundwater and screening important variables. It is an effective method for predicting the groundwater level in landslides.
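The core of a CART regression model is an exhaustive search for the split that minimises the squared error around the two leaf means, and the abstract's evaluation metrics are mean absolute error and mean relative error. A minimal sketch of both, on one predictor with hypothetical data (the paper's model applies such a search recursively over several factors such as rainfall and reservoir level):

```python
import numpy as np

def best_split(x, y):
    """CART-style search: threshold on a single predictor minimising the
    summed squared error of the two resulting leaf means."""
    best_t, best_sse = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_t, best_sse = t, sse
    return best_t, best_sse

def mae_and_relative_error(y_true, y_pred):
    """Mean absolute error (same unit as y) and mean relative error (fraction)."""
    abs_err = np.abs(y_true - y_pred)
    return abs_err.mean(), (abs_err / np.abs(y_true)).mean()
```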
基金supported by the National Natural Science Foundation of China (Grant No.42101403)the National Key Researchand Development Program of China (Grant No.2017YFD0600404)。
Abstract: Although airborne hyperspectral data with detailed spatial and spectral information has demonstrated significant potential for tree species classification, it has not been widely used over large areas. A comprehensive process based on multi-flightline airborne hyperspectral data is lacking over large forested areas influenced by both bidirectional reflectance distribution function (BRDF) effects and cloud shadow contamination. In this study, hyperspectral data were collected over the Mengjiagang Forest Farm in Northeast China in the summer of 2017 using the Chinese Academy of Forestry's LiDAR, CCD, and hyperspectral systems (CAF-LiCHy). After BRDF correction and cloud shadow detection, a tree species classification workflow was developed for sunlit and cloud-shaded forest areas with input features of minimum-noise-fraction-reduced bands, spectral vegetation indices, and texture information. Results indicate that BRDF-corrected sunlit hyperspectral data can provide stable, high classification accuracy given representative training data. Cloud-shaded pixels also have good spectral separability for species classification. Red-edge spectral information and ratio-based spectral indices with high importance scores are recommended as input features for species classification under varying light conditions. According to classification accuracies assessed against field survey data at multiple spatial scales, species classification within an extensive forest area using airborne hyperspectral data under various illuminations can be carried out successfully using the proposed radiometric consistency process and feature selection strategy.
Abstract: Increasing competition, economic recession and financial crises have increased business failures, and accordingly researchers have attempted to develop new approaches that yield more correct and more reliable results. The classification and regression tree (CART) is one of the newer modeling techniques developed for this purpose. In this study, the classification and regression tree method is explained and its power for predicting financial failure is tested. CART is applied to data on industrial companies traded on the Istanbul Stock Exchange (ISE) between 1997 and 2007. The results show that CART has a high power for predicting financial failure one, two and three years prior to failure, with profitability ratios being the most important ratios in the prediction of failure.
Abstract: Designing an intelligent dormitory distribution system based on students' individual differences and needs has taken precedence over the traditional dormitory distribution system, which neglects students' personality traits, causes dormitory disputes, and affects students' quality of life and academic performance. This paper collects freshmen's data on personal preferences and conducts a classification comparison. It uses a decision tree classification algorithm based on the information gain principle as the core algorithm of dormitory allocation, determines the description rules of students' personal preferences and decision tree classification preferences, and completes the conceptual design of the database of entity relations and data dictionaries. The system meets students' personality classification requirements for the dormitory and lays the foundation for an intelligent dormitory allocation system.
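The information gain principle named above can be stated in a few lines: the gain of a candidate split is the entropy of the labels minus the size-weighted entropy of the resulting groups. A minimal stdlib sketch of the standard definition (illustrative only, not the paper's allocation system):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a label sequence."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)
```

A decision tree built on this principle greedily picks, at each node, the attribute whose split yields the largest gain.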
Funding: Project supported by the National Natural Science Foundation of China (Nos. 40101014 and 40001008).
Abstract: A machine-learning approach was developed for the automated building of knowledge bases for soil resources mapping, using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than with the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge base built by the machine-learning method was of good quality for mapping the distribution of soil classes over the study area.
Abstract: [Objective] This study aimed to improve the accuracy of remote sensing classification for the Dongting Lake wetland. [Method] Based on TM data and ground GIS information for Dongting Lake, a decision tree classification method was established using an expert classification knowledge base. The images of the Dongting Lake wetland were classified into water area, mudflat, protection forest beach, Carem spp beach, Phragmites beach, Carex beach and other water body according to the decision tree layers. [Result] The accuracy of the decision tree classification reached 80.29%, much higher than that of the traditional method, and the total Kappa coefficient was 0.8839, indicating that the accuracy of this method can fulfill the requirements of actual practice. In addition, the knowledge-based image classification results could resolve some classification mistakes. [Conclusion] Compared with the traditional method, rule-based decision tree classification can classify images using various conditions, which reduces data processing time and improves classification accuracy.
Abstract: With the increasing interest in e-commerce shopping, customer reviews have become one of the most important elements determining customer satisfaction with products. This demonstrates the importance of working with text mining. This study is based on the Women's Clothing E-Commerce Reviews database, which consists of reviews written by real customers. The aim of this paper is to conduct a text mining approach on a set of customer reviews. Each review was classified as either positive or negative by employing a classification method. Four tree-based methods were applied to solve the classification problem: Classification Tree, Random Forest, Gradient Boosting and XGBoost. The dataset was divided into training and test sets. The results indicate that the Random Forest method displays overfitting, XGBoost displays overfitting if the number of trees is too high, the Classification Tree is good at detecting negative reviews but bad at detecting positive reviews, and Gradient Boosting shows stable values and quality measures above 77% for the test dataset. A consensus between the applied methods is noted for the important classification terms.
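Two notions the comparison rests on, per-class quality measures and overfitting diagnosed from the train/test accuracy gap, reduce to simple arithmetic on predictions. A small sketch with hypothetical labels (the 77% figure above refers to the paper's own test-set measures, not this toy; the 0.05 gap threshold is likewise an assumption):

```python
def quality_measures(y_true, y_pred, positive):
    """Accuracy, precision and recall, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def overfits(train_acc, test_acc, tol=0.05):
    """Flag overfitting when training accuracy exceeds test accuracy by more than tol."""
    return train_acc - test_acc > tol
```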
Funding: National Natural Science Foundation of China (No. 61163010)
Abstract: This paper presents a supervised learning algorithm for retinal vascular segmentation based on the classification and regression tree (CART) algorithm and improved adaptive boosting (AdaBoost). Local binary pattern (LBP) texture features and local features are extracted by reversing, dilating and enhancing the green components of retinal images to construct a 17-dimensional feature vector. A dataset is constructed from the feature vectors and data manually marked by experts. The features are used to generate CART binary trees at the nodes, where each CART binary tree serves as an AdaBoost weak classifier, and AdaBoost is improved by adding re-judgment functions to form a strong classifier. The proposed algorithm is evaluated on the Digital Retinal Images for Vessel Extraction (DRIVE) dataset. The experimental results show that the proposed algorithm has higher segmentation accuracy for blood vessels, and the result basically contains complete blood vessel details. Moreover, the segmented blood vessel tree has good connectivity, basically reflecting the distribution of the blood vessels. Compared with the traditional AdaBoost classification algorithm and a support vector machine (SVM) based classification algorithm, the proposed algorithm achieves a higher average accuracy and reliability index, similar to the results of state-of-the-art segmentation algorithms.
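The boosting loop described here, weak learners weighted by their error with sample weights re-concentrated on misclassified samples, can be sketched in NumPy. For brevity the weak learner below is a one-split decision stump rather than the paper's CART tree, and the re-judgment step is omitted; labels are ±1 and the data are hypothetical:

```python
import numpy as np

def stump_predict(X, feat, thresh, polarity):
    """One-split weak learner: -1 on one side of the threshold, +1 on the other."""
    pred = np.ones(len(X))
    pred[polarity * X[:, feat] < polarity * thresh] = -1
    return pred

def fit_stump(X, y, w):
    """Stump minimising the weighted classification error."""
    best = None
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for polarity in (1, -1):
                pred = stump_predict(X, feat, thresh, polarity)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, feat, thresh, polarity)
    return best

def adaboost(X, y, rounds=10):
    """Classic AdaBoost: alpha from the weighted error, weights re-normalised."""
    n = len(y)
    w = np.full(n, 1 / n)
    ensemble = []
    for _ in range(rounds):
        err, feat, thresh, pol = fit_stump(X, y, w)
        err = max(err, 1e-10)                      # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, feat, thresh, pol)
        w *= np.exp(-alpha * y * pred)             # up-weight mistakes
        w /= w.sum()
        ensemble.append((alpha, feat, thresh, pol))
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in ensemble)
    return np.sign(score)
```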
Funding: The National Natural Science Foundation of China under contract No. 42076235.
Abstract: Antarctic sea ice is an important part of the Earth's atmospheric system, and satellite remote sensing is an important technology for observing it. Whether Chinese Haiyang-2B (HY-2B) satellite altimeter data can be used to estimate sea ice freeboard and provide alternative Antarctic sea ice thickness information with high precision over a long time series, as other radar altimetry satellites can, needs further investigation. This paper proposes an algorithm to discriminate leads and then retrieve sea ice freeboard and thickness from HY-2B radar altimeter data. We first collected the Moderate-resolution Imaging Spectroradiometer ice surface temperature (IST) product from the National Aeronautics and Space Administration to extract leads in Antarctic waters and verified their accuracy using Sentinel-1 Synthetic Aperture Radar images. Second, a surface classification decision tree was generated for HY-2B satellite altimeter measurements of Antarctic waters to extract leads and calculate local sea surface heights. We then estimated the Antarctic sea ice freeboard and thickness based on the local sea surface heights and the static equilibrium equation. Finally, the retrieved HY-2B Antarctic sea ice thickness was compared with the CryoSat-2 sea ice thickness and the Antarctic Sea Ice Processes and Climate (ASPeCt) ship-based observations of sea ice thickness. The results indicate that the classification decision tree constructed for HY-2B satellite altimeter measurements is reasonable, and the root mean square error of the obtained sea ice thickness relative to the ship measurements is 0.62 m. The proposed sea ice thickness algorithm for the HY-2B radar satellite fills a gap in this application domain for the HY-series satellites and can complement existing Antarctic sea ice thickness products; it can provide long-time-series and large-scale sea ice thickness data that contribute to research on global climate change.
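The static equilibrium equation that converts a freeboard estimate into a thickness estimate balances the buoyancy of the submerged ice against the weight of the ice plus its snow load: h_ice = (rho_w * F + rho_s * h_s) / (rho_w - rho_i). A sketch using typical literature densities, which are assumptions here, not the values used in the paper:

```python
def sea_ice_thickness(freeboard_m, snow_depth_m,
                      rho_water=1024.0, rho_ice=917.0, rho_snow=320.0):
    """Hydrostatic balance: h_ice = (rho_w * F + rho_s * h_s) / (rho_w - rho_i).
    Densities in kg/m^3 are typical assumed values, not the paper's."""
    return (rho_water * freeboard_m + rho_snow * snow_depth_m) / (rho_water - rho_ice)
```

Because rho_w - rho_i is small, thickness is roughly ten times the ice freeboard, which is why accurate lead detection (and hence sea surface height) matters so much.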
基金supported by the Environment Research and Technology Development Fund (S-14) of the Ministry of the EnvironmentJapan and JSPS KAKENHI Grant Numbers 15H02833
Abstract: We built a classification tree (CT) model to estimate the climatic factors controlling the distribution of cold temperate coniferous forests (CTCFs) in Yunnan province and to predict their potential habitats under current and future climates, using seven climate change scenarios projected over the years 2070-2099. The accurate CT model for CTCFs showed that the minimum temperature of the coldest month (TMW) was the overwhelmingly potent factor among the six climate variables. Areas with TMW < -4.05℃ were suitable habitats for CTCFs, and areas with -1.35℃ < TMW were non-habitats, where temperate conifer and broad-leaved mixed forests (TCBLFs) are distributed at lower elevations, bordering the CTCFs. The dominant species of Abies, Picea, and Larix in the CTCFs are more tolerant of winter coldness than Tsuga and the broad-leaved trees, including deciduous broad-leaved Acer and Betula and evergreen broad-leaved Cyclobalanopsis and Lithocarpus, in TCBLFs. Winter coldness may limit the cool-side distributions of TCBLFs in areas between -1.35℃ and -4.05℃, while the warm-side distributions of CTCFs may be controlled by competition with the species of TCBLFs. Under future climate scenarios, the vulnerable area, where current potential (suitable + marginal) habitats (80,749 km^2) shift to non-habitats, was predicted to amount to 55.91% (45,053 km^2) of the current area. Inferring from the current vegetation distribution pattern, TCBLFs will replace the declining CTCFs. Vulnerable areas predicted by models are important in determining priorities for ecosystem conservation.
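For the dominant variable, the fitted classification tree reduces to two thresholds on TMW. A sketch of that rule using the thresholds reported above (labelling the band in between "marginal" is an inference from the abstract's "suitable + marginal" wording, not an explicit statement):

```python
def ctcf_habitat(tmw_celsius):
    """Habitat class for cold temperate coniferous forest from TMW,
    the minimum temperature of the coldest month (deg C).
    Thresholds are the ones reported in the abstract."""
    if tmw_celsius < -4.05:
        return "suitable"
    if tmw_celsius > -1.35:
        return "non-habitat"
    return "marginal"
```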
Abstract: In enterprise operations, maintaining manual rules for enterprise processes can be expensive, time-consuming, and dependent on specialized domain knowledge. Recently, rule generation has been automated in enterprises, particularly through machine learning, to streamline routine tasks. Typically, these machine models are black boxes whose reasons for decisions are not always transparent, and end users need to verify the model proposals as part of user acceptance testing in order to trust them. In such scenarios, rules excel over machine learning models because end users can verify the rules and have more trust. In many scenarios the truth label changes frequently, making it difficult for a machine learning model to learn until a considerable amount of data has been accumulated, whereas with rules the truth can be adapted directly. This paper presents a novel framework for generating human-understandable rules using the Classification and Regression Tree (CART) decision tree method, which ensures both optimization and user trust in automated decision-making processes. The framework generates comprehensible rules of the form "if condition, then predicted class", even in domains where noise is present. The proposed system transforms enterprise operations by automating the production of human-readable rules from structured data, resulting in increased efficiency and transparency. Removing the need for manual rule construction saves time and money while guaranteeing that users can readily check and trust the automatic judgments of the system. The remarkable performance metrics of the framework, which achieves 99.85% accuracy and 96.30% precision, further support its efficiency in translating complex data into comprehensible rules, ultimately empowering users and enhancing organizational decision-making processes.
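Extracting "if condition then class" rules from a fitted tree is a walk from the root to each leaf, collecting the split conditions along the path. A minimal sketch over a hypothetical nested-tuple tree (the framework itself would fit the tree with CART first; the example tree and feature names are invented):

```python
def tree_to_rules(node, path=()):
    """Flatten a nested (feature, threshold, left, right) tree into
    human-readable if-then rules, one rule per leaf."""
    if not isinstance(node, tuple):                # leaf: predicted class
        cond = " and ".join(path) or "always"
        return [f"if {cond} then predict {node}"]
    feat, thresh, left, right = node
    return (tree_to_rules(left,  path + (f"{feat} <= {thresh}",)) +
            tree_to_rules(right, path + (f"{feat} > {thresh}",)))
```

Each rule is independently checkable by an end user, which is the trust property the abstract emphasises.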
Funding: This study was supported by the Key Science and Technology Program of Inner Mongolia (Grant Nos. ZDZX2018020, 2020GG0007, 2019GG009), the Natural Science Foundation of Inner Mongolia (Grant No. 2020MS03068), the Research Project of the China Institute of Water Resources and Hydropower Research (Grant No. MK2019J02), and the Grassland Talents Program of Inner Mongolia (Grant No. CYYC9013).
Abstract: The accurate prediction of poverty is critical to poverty reduction efforts, and high-resolution remote sensing (HRRS) data have shown great promise for facilitating such prediction. Accordingly, the present study used HRRS data with 1 m resolution and data from 238 households to evaluate the utility and optimal scale of HRRS data for predicting household poverty in a grassland region of Inner Mongolia, China. The prediction of household poverty was improved by using remote sensing indicators at multiple scales rather than a single scale, and a model combining indicators from four scales (building land, household, neighborhood, and regional) provided the most accurate prediction of household poverty, with testing and training accuracies of 48.57% and 70.83%, respectively. Furthermore, building area was the most efficient indicator of household poverty. Compared with conducting household surveys, the analysis of HRRS data is a cheaper and more time-efficient method for predicting household poverty; in this case study, it reduced study time and cost by about 75% and 90%, respectively. This study provides the first evaluation of HRRS data for the prediction of household poverty in pastoral areas and thus provides technical support for the identification of poverty in pastoral areas around the world.
Abstract: Here, we demonstrate the application of the Decision Tree Classification (DTC) method for lithological mapping from multi-spectral satellite imagery. The area of investigation is Lake Magadi in the East African Rift Valley in Kenya. The work involves the collection of rock and soil samples in the field, their analysis using reflectance and emittance spectroscopy, and the processing and interpretation of Advanced Spaceborne Thermal Emission and Reflection Radiometer data with the DTC method. The latter method is strictly non-parametric, flexible and simple, and does not require assumptions regarding the distributions of the input data. It has been used successfully in a wide range of classification problems. The DTC method successfully mapped the chert and trachyte series rocks of the area, including clay minerals and evaporites, with a high overall accuracy (86%). The high classification accuracies of the developed decision tree suggest its ability to adapt to noise and the nonlinear relations often observed in space-borne spectral image data of surface materials, without making assumptions on the distribution of the input data. Moreover, the present work found the DTC method useful for accurately mapping lithological variations in vast rugged terrain, which is inherently subject to different sources of noise even after considerable radiance and atmospheric correction.
Abstract: Asphaltenes have always been an attractive subject for researchers. However, the application of this fraction in the geochemical field has been studied only in a limited way. In other words, despite many studies on asphaltene structure, the application of asphaltene structures in organic geochemistry has so far not been assessed. Oil-oil correlation is a well-known concept in geochemical studies and plays a vital role in basin modeling and the reconstruction of the burial history of basin sediments, as well as in the accurate characterization of the relevant petroleum system. This study aims to propose the X-ray diffraction (XRD) technique as a novel method for oil-oil correlation and to investigate its reliability and accuracy for different crude oils. To this end, 13 crude oil samples from the Iranian sector of the Persian Gulf region, which had previously been correlated into four distinct genetic groups by traditional geochemical tools such as biomarker ratios and isotope values, were selected, and their asphaltene fractions were analyzed by two prevalent methods, XRD and Fourier-transform infrared spectroscopy (FTIR). For the oil-oil correlation assessment, various cross-plots, as well as principal component analysis (PCA), were produced based on the structural parameters of the studied asphaltenes. The results indicate that asphaltene structural parameters can also be used for oil-oil correlation purposes, their results being completely in accord with the previous classifications. The average values of the distance between saturated portions (d_(r)) and the distance between two aromatic layers (d_(m)) of the asphaltene molecules in the studied oil samples are 4.69 Å and 3.54 Å, respectively. Furthermore, the average diameter of the aromatic sheets (L_(a)), the height of the clusters (L_(c)), the number of carbons per aromatic unit (C_(au)), the number of aromatic rings per layer (R_(a)), the number of sheets in the cluster (M_(e)) and the aromaticity (f_(a)) of these asphaltene samples are 10.09 Å, 34.04 Å, 17.42, 3.78, 10.61 and 0.26, respectively. The XRD parameter results indicate that plots of d_(r) vs. d_(m), d_(r) vs. M_(e), d_(r) vs. f_(a), d_(m) vs. L_(c), L_(c) vs. L_(a), and f_(a) vs. L_(a) perform appropriately for distinguishing the genetic groups. A comparison between the XRD and FTIR results indicated that the XRD method is more accurate for this purpose. In addition, decision tree classification, one of the most efficacious approaches in machine learning, was employed for the geochemical groups of this study for the first time. This tree, constructed using the XRD data, can distinguish the genetic groups accurately and can also determine the characteristics of each geochemical group. In conclusion, obtaining asphaltene structural parameters by the XRD technique is a novel, precise and inexpensive method that can be deployed as a new approach for oil-oil correlation. The findings of this study can help in the prompt determination of genetic groups as a screening method and can also be useful for assessing oil samples affected by secondary processes.
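The PCA step used for the correlation assessment can be sketched as an SVD on the standardised parameter matrix (rows: oil samples; columns: structural parameters such as d_(r), d_(m), L_(a), L_(c), f_(a)). The data below are hypothetical, not the paper's measurements:

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD on column-standardised data; returns component scores
    and the fraction of variance each component explains."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ Vt[:n_components].T
    explained = S ** 2 / (S ** 2).sum()
    return scores, explained[:n_components]
```

Samples from the same genetic group should cluster together in the score plot, which is how the cross-plots and PCA are read in the study.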
Abstract: BACKGROUND: Liver disease denotes any pathology that can harm or destroy the liver or prevent it from functioning normally. The global community has recently witnessed an increase in the mortality rate due to liver disease. This could be attributed to many factors, among which are human habits, awareness issues, poor healthcare, and late detection. To curb the growing threat from liver disease, early detection is critical to help reduce the risks and improve treatment outcomes. Emerging technologies such as machine learning, as shown in this study, could be deployed to assist in enhancing its prediction and treatment. AIM: To present a more efficient system for the timely prediction of liver disease using a hybrid eXtreme Gradient Boosting model with hyperparameter tuning, with a view to assisting in early detection and diagnosis and reducing the risks and mortality associated with the disease. METHODS: The dataset used in this study consisted of 416 people with liver problems and 167 with no such history. The data were collected from the state of Andhra Pradesh, India, through https://www.kaggle.com/datasets/uciml/indian-liver-patientrecords. The population was divided into two sets depending on the disease state of the patient. This binary information was recorded in the attribute "is_patient". RESULTS: The results indicated that the chi-square automated interaction detection and classification and regression tree models achieved accuracy levels of 71.36% and 73.24%, respectively, which was much better than the conventional method. The proposed solution would assist patients and physicians in tackling the problem of liver disease, ensuring that cases are detected early to prevent progression to cirrhosis (scarring) and to enhance patient survival. The study showed the potential of machine learning in health care, especially as it concerns disease prediction and monitoring. CONCLUSION: This study contributed to the knowledge of machine learning applications in health and to the efforts toward combating the problem of liver disease. However, relevant authorities have to invest more in machine learning research and other health technologies to maximize their potential.
Abstract: The contribution of this paper is a comparison of three popular machine learning methods for software fault prediction: classification tree, neural network and case-based reasoning. First, three different classifiers are built based on these three approaches. Second, the three classifiers use the same product metrics as predictor variables to identify fault-prone components. Third, the prediction results are compared on two aspects: how good the prediction capabilities of these models are, and how well the models support understanding of the process represented by the data.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41671339)
Abstract: The sub-pixel impervious surface percentage (SPIS) is the fraction of impervious surface area in one pixel, and it is an important indicator of urbanization. Using remote sensing data, the spatial distribution of SPIS values over large areas can be extracted, and these data are significant for studies of urban climate, environment and hydrology. To develop a stable, multi-temporal SPIS estimation method suitable for typical temperate semi-arid climate zones with distinct seasons, an optimal model for estimating SPIS values within Beijing Municipality was built based on the classification and regression tree (CART) algorithm. First, models with different input variables for SPIS estimation were built by integrating multi-source remote sensing data with other auxiliary data. The optimal model was selected through analysis and comparison of the assessed accuracy of these models. Subsequently, multi-temporal SPIS mapping was carried out based on the optimal model. The results are as follows: 1) Multi-seasonal images and nighttime light (NTL) data are the optimal input variables for SPIS estimation within Beijing Municipality, where the intra-annual variability in vegetation is distinct. The different spectral characteristics of cultivated land caused by different farming practices and vegetation phenology can be detected effectively by the multi-seasonal images. NTL data can effectively reduce the misestimation caused by the spectral similarity between bare land and impervious surfaces. After testing, the SPIS modeling correlation coefficient (r) is approximately 0.86, the average error (AE) is approximately 12.8%, and the relative error (RE) is approximately 0.39. 2) The SPIS results were divided into areas with high-density impervious cover (70%-100%), medium-density impervious cover (40%-70%), low-density impervious cover (10%-40%) and natural cover (0%-10%). The SPIS model performed better in estimating values for high-density urban areas than for the other categories. 3) Multi-temporal SPIS mapping (1991-2016) was conducted based on the optimized SPIS results for 2005. After testing, AE ranges from 12.7% to 15.2%, RE ranges from 0.39 to 0.46, and r ranges from 0.81 to 0.86. This demonstrates that the proposed approach for estimating sub-pixel impervious surface by integrating the CART algorithm and multi-source remote sensing data is feasible and suitable for multi-temporal SPIS mapping of areas with distinct intra-annual variability in vegetation.
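The accuracy figures used throughout the abstract (r, AE, RE) and the density classes reduce to short formulas. The exact definitions of AE and RE are not given in the abstract, so the ones below are assumptions (mean absolute error in percentage points and mean relative error as a fraction), and the assignment of values falling exactly on a class boundary is likewise assumed:

```python
import numpy as np

def spis_accuracy(estimated, reference):
    """Pearson r, average error (AE, percentage points of impervious cover)
    and relative error (RE, fraction). Definitions are assumed."""
    err = np.abs(estimated - reference)
    r = np.corrcoef(estimated, reference)[0, 1]
    return r, err.mean(), (err / np.maximum(reference, 1e-9)).mean()

def density_class(spis_percent):
    """Density classes from the abstract; boundary assignment assumed."""
    if spis_percent >= 70:
        return "high-density"
    if spis_percent >= 40:
        return "medium-density"
    if spis_percent >= 10:
        return "low-density"
    return "natural"
```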