The axiomatization of Shannon entropy has received considerable attention in the information theory literature. While Shannon entropy is defined on probability distributions, we define a new type of entropy on the set of partitions of finite subsets of metric spaces, which has a rich algebraic structure as a partially ordered set. We propose an axiomatization of an entropy-like measure of partitions of sets of objects located in metric spaces, and we derive an analytic expression for this new type of entropy, referred to as inertial entropy. This approach starts with the notion of inertia of a partition and includes a study of the behavior of the sum of squared errors of a partition. In this context, we characterize the chain of partitions produced by the Ward hierarchical clustering method. Starting from inertial entropies of partitions, we introduce conditional entropies which, in turn, generate metrics on partitions of finite sets. These metrics, and in particular the metric generated by inertial entropy, are used as external validation tools for clusterings of labeled data sets. This type of validation aims to determine to what extent the labeling of the data coincides with the clustering obtained algorithmically, and we obtain a high degree of consistency of the data labeling with the results of several hierarchical clusterings.
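The abstract's notions of inertia and sum of squared errors of a partition can be illustrated with a minimal sketch. It assumes points in Euclidean space and uses the standard textbook definitions. It does not reproduce the paper's inertial entropy, whose analytic expression is not given in the abstract. All function names (`centroid`, `inertia`, `sse`, `ward_merge_cost`) are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

Point = Sequence[float]
Block = List[Point]


def centroid(points: Block) -> List[float]:
    """Coordinate-wise mean of a non-empty block of points."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]


def inertia(points: Block) -> float:
    """Sum of squared Euclidean distances from each point to the block centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def sse(partition: List[Block]) -> float:
    """Sum of squared errors of a partition: total within-block inertia."""
    return sum(inertia(block) for block in partition)


def ward_merge_cost(a: Block, b: Block) -> float:
    """Increase in SSE caused by merging blocks a and b (Ward's criterion)."""
    ca, cb = centroid(a), centroid(b)
    dist_sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist_sq


# Illustrative data (assumed, not from the paper): two well-separated blocks.
block_a = [(0.0, 0.0), (2.0, 0.0)]
block_b = [(10.0, 0.0), (12.0, 0.0)]

fine = [block_a, block_b]        # two-block partition
coarse = [block_a + block_b]     # one-block partition after merging

# Merging the blocks raises the SSE by exactly the Ward merge cost.
print(sse(fine), sse(coarse), ward_merge_cost(block_a, block_b))
```

Ward's method merges, at each step, the pair of blocks with the smallest such cost, which is why the behavior of the sum of squared errors along the resulting chain of partitions is a natural object of study.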
Computer simulation of materials processing requires a large database covering a wide range of physical properties of materials. To make use of the large body of data on heat treatment of materials accumulated over past years, it is important to develop an intelligent database system. Based on data mining technology for data analysis, an intelligent web-based database tool for computer simulation of heat treatment processes, named IndBASEweb-HT, was built. The architecture and algorithms of the system, as well as its applications, are presented.