Funding: Funded by the National 973 Program of China (No. 2003CB415205), the National Natural Science Foundation of China (No. 40523005, No. 60573183, No. 60373019), and the Open Research Fund Program of LIESMARS (No. WKL(04)0303).
Abstract: Spatial objects have two types of attributes, geometrical and non-geometrical, which belong to two different attribute domains (the geometrical and non-geometrical domains). Although scattered in the geometrical domain, spatial objects may be similar to each other in the non-geometrical domain. Most existing clustering algorithms group spatial datasets into compact regions in the geometrical domain without considering the non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in the geometrical domain but also high similarity in the non-geometrical domain. This means that constraints from both domains are imposed on the clustering goal simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become more popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of the local clusters are extracted. Second, the local features from each site are sent to a central site, where global clustering is obtained based on those features. Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient.
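The two-level flow described in this abstract can be illustrated with a minimal Python sketch. It is not the DCAD algorithm itself: the abstract does not say which local clustering algorithm is used or which features are extracted, so the feature tuple (centroid, mean attribute, size), the thresholds `eps` and `delta`, and the function names `local_features` and `global_merge` are all assumptions made for illustration.

```python
import math


def local_features(points, labels):
    """Summarise each local cluster as (cx, cy, mean_attr, size).

    points: list of (x, y, attr) tuples held at one site.
    labels: cluster label per point, produced by ANY local algorithm
            (the choice of local algorithm is outside this sketch).
    """
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    features = []
    for members in groups.values():
        n = len(members)
        features.append((
            sum(p[0] for p in members) / n,  # centroid x (geometrical)
            sum(p[1] for p in members) / n,  # centroid y (geometrical)
            sum(p[2] for p in members) / n,  # mean non-geometrical attribute
            n,                               # cluster size
        ))
    return features


def global_merge(features, eps, delta):
    """At the central site, merge local clusters that are close in BOTH domains.

    Two clusters are linked when their centroids are within eps and their
    mean attributes differ by at most delta (hypothetical thresholds).
    Returns one global cluster id per local cluster (union-find roots).
    """
    parent = list(range(len(features)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            fi, fj = features[i], features[j]
            near = math.hypot(fi[0] - fj[0], fi[1] - fj[1]) <= eps
            similar = abs(fi[2] - fj[2]) <= delta
            if near and similar:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(features))]
```

Only the compact feature tuples travel from the sites to the central site, which mirrors the abstract's point that global clustering is built from local features rather than from the raw data.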
Funding: Supported by the National Natural Science Foundation of China (No. 61101159, No. 60872123), the China Postdoctoral Science Foundation (No. 20100480049), and the Fundamental Research Funds for the Central Universities (No. 201 IZM0033).
Abstract: Making image repositories easy to search and browse is a fairly challenging problem, and it depends on one technique: image clustering. Kernel-based clustering has been one of the most promising clustering methods of the last few years, because it can handle data with high-dimensional, complex structure. In this paper, a kernel fuzzy learning (KFL) algorithm is proposed, which takes advantage of the distance kernel trick and the gradient-based fuzzy clustering method to perform image clustering automatically. Experimental results show that KFL is a more efficient method for image clustering than recently reported alternative methods.
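As a rough illustration of the distance kernel trick mentioned in this abstract, the sketch below computes a squared distance in the kernel-induced feature space as K(x, x) - 2K(x, y) + K(y, y), without ever forming the feature map. The Gaussian kernel, the parameter `sigma`, and the standard fuzzy c-means membership formula are assumptions chosen for illustration; the abstract's KFL algorithm uses a gradient-based fuzzy clustering method that is not reproduced here.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; sigma is an assumed bandwidth parameter."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_sq_distance(x, y, kernel=gaussian_kernel):
    """Squared distance between the images of x and y in feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def fuzzy_memberships(x, prototypes, m=2.0, kernel=gaussian_kernel):
    """Standard fuzzy c-means memberships using kernel-induced distances.

    m > 1 is the fuzzifier. This is the textbook FCM membership rule,
    swapped in for illustration; it is not the KFL update.
    """
    dists = [kernel_sq_distance(x, v, kernel) for v in prototypes]
    # If x coincides with a prototype, give that prototype full membership.
    for i, d in enumerate(dists):
        if d <= 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(prototypes))]
    weights = [d ** (-1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]
```

For a Gaussian kernel, K(x, x) = 1, so the distance reduces to 2 - 2K(x, y), which is bounded and grows smoothly with the input-space distance.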
Abstract: In this research article, we analyze multimedia data mining and classification algorithms based on database optimization techniques. The constant emergence of high-performance application requirements of various kinds has made parallel computer architectures increasingly valued, but the development of the corresponding software systems lags far behind that of the hardware, and this gap is especially obvious in the application of database technology. Multimedia mining differs from low-level computer multimedia processing: the former focuses on extracting patterns from huge multimedia collections, whereas the latter focuses on understanding or extracting specific features from a single multimedia object. Our research provides a new paradigm for the methodology, which will be meaningful and necessary.
Abstract: Most of the earlier work on clustering focused mainly on numeric data, whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve datasets that consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is a method for updating the 'cluster centers' of objects described by mixed numeric and categorical attributes during the clustering process so as to minimize the clustering cost function. The clustering performance of the algorithm is demonstrated on two well-known data sets, namely the credit approval and abalone databases.
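A minimal Python sketch of a k-means-style step on mixed attributes follows. The abstract does not give the paper's similarity measure or center update, so this sketch assumes a common scheme: squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches, with centers updated by the mean (numeric) and the mode (categorical). The weight `gamma` and all function names are hypothetical.

```python
from collections import Counter


def mixed_distance(obj, center, num_idx, cat_idx, gamma=1.0):
    """Squared Euclidean part plus gamma times the number of mismatches."""
    numeric = sum((obj[i] - center[i]) ** 2 for i in num_idx)
    categorical = sum(obj[i] != center[i] for i in cat_idx)
    return numeric + gamma * categorical


def update_center(members, num_idx, cat_idx):
    """Mean for numeric attributes, most frequent value for categorical ones."""
    center = [None] * len(members[0])
    for i in num_idx:
        center[i] = sum(o[i] for o in members) / len(members)
    for i in cat_idx:
        center[i] = Counter(o[i] for o in members).most_common(1)[0][0]
    return center


def assign(objects, centers, num_idx, cat_idx, gamma=1.0):
    """Assign each object to the index of its nearest center."""
    return [
        min(range(len(centers)),
            key=lambda k: mixed_distance(o, centers[k], num_idx, cat_idx, gamma))
        for o in objects
    ]
```

With an empty `cat_idx`, the distance is plain squared Euclidean distance and the center is the mean, so the step coincides with ordinary k-means, in line with the abstract's remark about purely numeric data.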