Funding: This work was supported in part by the National Natural Science Foundation of China (Grant No. 40971222) and by the National High Technology Research and Development Program of China (Grant No. 2006AA120106).
Abstract: In recent years, the rough set (RS) method has come into common use for remote-sensing classification, providing one technique of information extraction for Digital Earth. The discretization of remotely sensed data is an important preprocessing step in classical RS-based remote-sensing classification. Appropriate discretization methods can improve the adaptability of the classification rules and increase classification accuracy. To assess the performance of discretization methods, this article adopts three indicators: the compression capability indicator (CCI), the consistency indicator (CI), and the number of cut points (NCP). An appropriate discretization method for the RS-based classification of a given remotely sensed image can be found by comparing the values of the three indicators and the classification accuracies obtained with the different discretization methods. To investigate the effectiveness of our method, this article applies three discretization methods (Entropy/MDL, Naive, and SemiNaive) to a TM image, and the three indicators are then calculated for each. After comparing the indicators and the classification accuracies of the discretized images, it is found that the SemiNaive method significantly reduces the quantity of data while keeping satisfactory classification accuracy.
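Entropy-based discretization methods such as Entropy/MDL place cut points where splitting an attribute's value range most reduces class entropy. The following is a minimal single-split sketch for illustration only; the actual Entropy/MDL method applies this recursively with an MDL stopping criterion, and the article's indicator computations (CCI, CI) are not shown here.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy (bits) of a multiset of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_entropy_cut(values, labels):
    """Return the candidate cut point that minimises the weighted class
    entropy of the two intervals it induces (a single split)."""
    pairs = sorted(zip(values, labels))
    best_cut, best_h = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut between equal attribute values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= cut]
        right = [lab for v, lab in pairs if v > cut]
        h = (len(left) * class_entropy(left)
             + len(right) * class_entropy(right)) / len(pairs)
        if h < best_h:
            best_cut, best_h = cut, h
    return best_cut
```

Under this scheme, NCP is simply the count of cut points ultimately accepted for all attributes.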
Abstract: Feature selection is a long-standing issue in research on data mining technologies. However, optimal feature selection is NP-hard, so heuristic approaches are more practical for actual learning systems. Such algorithms typically select features with the help of a heuristic metric that measures the relative importance of features in a learning system. Here a new notion of 'system entropy' is described in terms of rough set theory, and some of its algebraic characteristics are studied. After its intrinsic value bias is effectively counteracted, the system entropy is applied in BSE, a new heuristic algorithm for feature selection. BSE is efficient, with a lower time complexity than analogous algorithms; it is also effective, producing optimal results in the minimal-feature-bias sense across a variety of learning systems. In addition, BSE is tolerant of inconsistency in a learning system, and can consequently handle data noise gracefully.
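The 'system entropy' measure itself is specific to this paper, but the general heuristic idea, scoring features by how much uncertainty about the decision they remove, can be sketched with plain information gain. The names and details below are illustrative assumptions, not the BSE algorithm.

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def cond_entropy(keys, labels):
    """H(D | C): label entropy averaged over blocks of equal feature values."""
    groups = {}
    for k, d in zip(keys, labels):
        groups.setdefault(k, []).append(d)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def rank_features(data, labels):
    """Rank single features by information gain H(D) - H(D | a),
    a simple entropy-style importance score."""
    h_d = entropy(labels)
    gains = []
    for a in range(len(data[0])):
        col = [row[a] for row in data]
        gains.append((h_d - cond_entropy(col, labels), a))
    return [a for _, a in sorted(gains, reverse=True)]
```

A feature whose value blocks are pure in the decision gets the maximal gain and is ranked first.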
Abstract: Rough set theory is a sound mathematical theory developed in recent years that has been applied successfully in fields such as machine learning, data mining, intelligent data analysis, and control algorithm acquisition. In this paper, the authors discuss the reduction of knowledge using conditional entropy in rough set theory. First, the changing tendency of the conditional entropy of the decision attributes given the condition attributes is studied from the viewpoint of information theory. Next, a new reduction algorithm based on conditional entropy is developed. Simulation results show that the algorithm can find the minimal reduct in most cases.
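A rough sketch of such a conditional-entropy-based reduction, assuming a decision table given as rows of condition-attribute values plus a list of decision labels (the authors' exact algorithm may differ in its search order and stopping test):

```python
import math
from collections import Counter

def cond_entropy(data, labels, attrs):
    """H(D | attrs): entropy of the decision within each block of objects
    that are indiscernible on the chosen condition attributes."""
    groups = {}
    for row, d in zip(data, labels):
        groups.setdefault(tuple(row[a] for a in attrs), []).append(d)
    n = len(labels)
    h = 0.0
    for g in groups.values():
        m = len(g)
        h -= (m / n) * sum((c / m) * math.log2(c / m)
                           for c in Counter(g).values())
    return h

def entropy_reduct(data, labels):
    """Greedily add the attribute giving the biggest drop in H(D | R)
    until the subset is as informative as the full attribute set."""
    all_attrs = list(range(len(data[0])))
    target = cond_entropy(data, labels, all_attrs)
    reduct = []
    while cond_entropy(data, labels, reduct) > target:
        best = min((a for a in all_attrs if a not in reduct),
                   key=lambda a: cond_entropy(data, labels, reduct + [a]))
        reduct.append(best)
    return reduct
```

Adding an attribute refines the indiscernibility partition, so H(D | R) can only decrease or stay equal as R grows; this monotone behaviour guarantees that the greedy loop terminates.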
Funding: This work was sponsored by the National Science Council (102-2313-b-275-001).
Abstract: The governing factors that influence landslide occurrence are complicated by the differing soil conditions at various sites. To address this problem, this study used spatial information technology to collect geological data and information. GIS, remote sensing, and a digital elevation model (DEM) were used in combination to extract the attribute values of the surface material across the vast study area of Shei-Pa National Park, Taiwan. The factors influencing landslides were collected and their quantified values computed. The predominance of loam and gravel in the soils of the Shei-Pa area gives rise to different landslide problems. The major factors were successfully extracted from the full set of influencing factors. Finally, the discrete rough set (DRS) classifier was used to find the threshold of each attribute contributing to landslide occurrence, based on the knowledge database. This rule-based knowledge database provides an effective system for the timely management of landslides. NDVI (Normalized Difference Vegetation Index), VI (Vegetation Index), elevation, and distance from the road are the four major influencing factors for landslide occurrence. Landslide hazard potential diagrams (landslide susceptibility maps) were drawn, and a reasonable accuracy rate was obtained. This study thus offers a systematic approach to the investigation of landslide disasters.
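Of the four major factors, NDVI is simple band arithmetic on near-infrared and red reflectance, and a DRS-extracted rule takes the form of thresholds on the factor values. The threshold values below are invented placeholders for illustration, not the study's actual cut-offs.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index, (NIR - Red) / (NIR + Red);
    range [-1, 1], with higher values indicating denser vegetation."""
    s = nir + red
    return (nir - red) / s if s else 0.0

def landslide_prone(ndvi_val, elevation_m, road_dist_m,
                    ndvi_max=0.3, elev_min=1500, road_max=200):
    """Hypothetical threshold rule of the kind a DRS classifier might
    extract: sparse vegetation, high elevation, close to a road.
    The default thresholds are made-up illustration values."""
    return (ndvi_val < ndvi_max
            and elevation_m > elev_min
            and road_dist_m < road_max)
```

In practice each rule's thresholds would come from the cut points the DRS classifier learned from the knowledge database, one per attribute.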
Funding: Supported by the UGC, SERO, Hyderabad under FDP during the XI plan period, and by the UGC, New Delhi, through major research project Grant No. F-34-105/2008.
Abstract: Feature selection (FS) aims to determine a minimal feature (attribute) subset of a problem domain while retaining suitably high accuracy in representing the original features. Rough set theory (RST) has been used as such a tool with much success: it enables the discovery of data dependencies and the reduction of the number of attributes in a dataset using the data alone, requiring no additional information. This paper describes the fundamental ideas behind RST-based approaches, reviews related FS methods built on these ideas, and analyses frequently used traditional RST-based FS algorithms such as the QuickReduct algorithm, the entropy-based reduct algorithm, and the relative reduct algorithm. Some drawbacks of the existing algorithms are identified, and improved algorithms are proposed that overcome them. Experimental analyses were carried out to assess the efficiency of the proposed algorithms.
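QuickReduct grows a candidate reduct greedily, at each step adding the attribute that most increases the rough-set dependency degree gamma_P(D), i.e. the fraction of objects whose P-indiscernibility class is decision-consistent. A compact sketch, assuming a decision table given as rows of attribute values plus decision labels:

```python
def dependency(data, labels, attrs):
    """Dependency degree gamma_P(D): fraction of objects whose
    P-indiscernibility class has a single decision value."""
    blocks = {}
    for row, d in zip(data, labels):
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, []).append(d)
    pos = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return pos / len(labels)

def quickreduct(data, labels):
    """Greedy QuickReduct: add the attribute that most increases the
    dependency degree until it matches that of the full attribute set."""
    all_attrs = list(range(len(data[0])))
    full = dependency(data, labels, all_attrs)
    reduct = []
    while dependency(data, labels, reduct) < full:
        best_a, best_g = None, -1.0
        for a in all_attrs:
            if a in reduct:
                continue
            g = dependency(data, labels, reduct + [a])
            if g > best_g:
                best_a, best_g = g and a or a, g
        reduct.append(best_a)
    return reduct
```

Like any greedy search, QuickReduct finds a reduct but not necessarily a minimal one, which is one of the drawbacks this kind of survey examines.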