Clustering a social network is a process of grouping social actors into clusters where intra-cluster similarities among actors are higher than inter-cluster similarities. Clustering approaches, i.e. , k-medoids or hie...Clustering a social network is a process of grouping social actors into clusters where intra-cluster similarities among actors are higher than inter-cluster similarities. Clustering approaches, i.e. , k-medoids or hierarchical, use the distance function to measure the dissimilarities among actors. These distance functions need to fulfill various properties, including the triangle inequality (TI). However, in some cases, the triangle inequality might be violated, impacting the quality of the resulting clusters. With experiments, this paper explains how TI violates while performing traditional clustering techniques: k-medoids, hierarchical, DENGRAPH, and spectral clustering on social networks and how the violation of TI affects the quality of the resulting clusters.展开更多
Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the t...Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.展开更多
This is a lecture note of my joint work with Chi-Kwong Li concerning various results on the norm structure of n 2 n matrices (as Hilbert-space operators). The main result says that the triangle inequality serves as th...This is a lecture note of my joint work with Chi-Kwong Li concerning various results on the norm structure of n 2 n matrices (as Hilbert-space operators). The main result says that the triangle inequality serves as the ultimate norm estimate for the upper bounds of summation of two matrices. In the case of summation of two normal matrices, the result turns out to be a norm estimate in terms of the spectral variation for normal matrices.展开更多
文摘Clustering a social network is a process of grouping social actors into clusters where intra-cluster similarities among actors are higher than inter-cluster similarities. Clustering approaches, i.e. , k-medoids or hierarchical, use the distance function to measure the dissimilarities among actors. These distance functions need to fulfill various properties, including the triangle inequality (TI). However, in some cases, the triangle inequality might be violated, impacting the quality of the resulting clusters. With experiments, this paper explains how TI violates while performing traditional clustering techniques: k-medoids, hierarchical, DENGRAPH, and spectral clustering on social networks and how the violation of TI affects the quality of the resulting clusters.
文摘Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.
文摘This is a lecture note of my joint work with Chi-Kwong Li concerning various results on the norm structure of n 2 n matrices (as Hilbert-space operators). The main result says that the triangle inequality serves as the ultimate norm estimate for the upper bounds of summation of two matrices. In the case of summation of two normal matrices, the result turns out to be a norm estimate in terms of the spectral variation for normal matrices.