The selection of a suitable discretization method (DM) for spatially continuous variables (SCVs) is critical in machine learning (ML)-based natural hazard susceptibility assessment. However, few studies have considered the influence of the selected DMs or how to efficiently select a suitable DM for each SCV. This study addresses these issues. The information loss rate (ILR), an index based on information entropy, appears suitable for selecting the optimal DM for each SCV. However, the ILR fails to capture the actual influence of discretization, because it considers only the total amount of information by which the discretized variable departs from the original SCV. To address this issue, we propose a new index, the information change rate (ICR), which measures the per-cell change in information caused by discretization, enabling identification of the optimal DM. We develop a case study using Random Forest (training/testing ratio of 7:3) to assess flood susceptibility in Wanan County, China. Approaches based on the area under the curve and on susceptibility maps are presented to compare the ILR and ICR. The results show that the ICR-based optimal DMs are more rational than the ILR-based ones in both comparisons. Moreover, the observed ILR values are unnaturally small (<1%), whereas the ICR values are far more in line with general recognition (usually 10%–30%). These results demonstrate the superiority of the ICR. We consider that this study fills an existing research gap, improving ML-based natural hazard susceptibility assessment.
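The abstract does not give the formal definitions of the ILR and ICR, but their described behavior can be illustrated with entropy-based stand-ins. The sketch below is a minimal, assumed formulation: the ILR is taken as the relative drop in total Shannon entropy between a fine-grained reference binning and a candidate DM, while the ICR is taken as the mean relative change in each cell's self-information. Both formulas and the 32-bin reference are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def shannon_entropy(labels):
    """Total Shannon entropy (bits) of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def per_cell_information(labels):
    """Self-information -log2 p(bin) of the bin assigned to each cell."""
    vals, counts = np.unique(labels, return_counts=True)
    p = dict(zip(vals, counts / counts.sum()))
    return np.array([-np.log2(p[v]) for v in labels])

def information_loss_rate(reference, discretized):
    """ILR-style index (assumed form): relative drop in total entropy.

    Considers only the aggregate information content, which is why an
    index of this kind cannot see per-cell effects of discretization.
    """
    h0 = shannon_entropy(reference)
    return (h0 - shannon_entropy(discretized)) / h0

def information_change_rate(reference, discretized):
    """ICR-style index (assumed form): mean relative change in the
    self-information assigned to each individual cell."""
    i0 = per_cell_information(reference)
    i1 = per_cell_information(discretized)
    return np.mean(np.abs(i1 - i0) / i0)

# Toy example: a continuous slope variable (an SCV) discretized by a
# candidate DM, compared against a fine 32-bin reference binning.
rng = np.random.default_rng(0)
slope = rng.uniform(0, 45, size=1000)
fine = np.digitize(slope, bins=np.linspace(0, 45, 33))    # reference
coarse = np.digitize(slope, bins=np.linspace(0, 45, 5))   # candidate DM
print(information_loss_rate(fine, coarse))
print(information_change_rate(fine, coarse))
```

In this assumed formulation, comparing the two indices across several candidate DMs would mimic the paper's selection procedure: the DM minimizing the per-cell ICR is preferred.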
With the proliferation of applications on the internet, the internet has become a rich information source that supplies users with valuable information. However, it is hard for users to quickly acquire the right information on the web. This paper presents an intelligent agent for internet applications that retrieves and extracts web information under the user's guidance. The intelligent agent is made up of a retrieval script that identifies web sources, an extraction script based on the Document Object Model (DOM) that expresses the extraction process, a data translator that exports the extracted information into knowledge bases with frame structures, and a data reasoner that answers users' questions. A GUI tool named Script Writer helps generate the extraction script visually, and knowledge rule databases help extract the wanted information and generate answers to questions.
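The abstract does not show the extraction-script language produced by Script Writer, so the following is only an illustrative stand-in: a plain-Python sketch of DOM-style extraction using the standard library's `html.parser`, where a tag/class rule plays the role of one extraction-script instruction. The sample page, class names, and rule format are all hypothetical.

```python
from html.parser import HTMLParser

class ExtractionRule(HTMLParser):
    """Collects the text of every element whose tag and class attribute
    match a rule, tracking nesting depth to stay inside matched elements."""

    def __init__(self, tag, cls):
        super().__init__()
        self.tag, self.cls = tag, cls
        self.depth = 0        # >0 while inside a matched element
        self.records = []     # extracted text fragments

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1   # nested element inside a match
        elif tag == self.tag and dict(attrs).get("class") == self.cls:
            self.depth = 1
            self.records.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.records[-1] += data.strip()

# Hypothetical page and rule, standing in for a retrieved web source
# and one instruction of the generated extraction script.
page = ('<html><body>'
        '<div class="price">42 USD</div>'
        '<div class="name">Widget</div>'
        '</body></html>')
rule = ExtractionRule("div", "price")
rule.feed(page)
print(rule.records)
```

In the agent described by the paper, the extracted records would then be handed to the data translator and stored as frame structures in the knowledge base; this sketch covers only the extraction step.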