Among the clustering algorithms available in data mining, CLOPE has attracted considerable attention for its high speed and good performance. However, the validity of its clustering results depends directly on the proper choice of certain parameters, and how to choose them remains an open issue. To address this, the paper proposes a fuzzy CLOPE algorithm and presents a method for optimal parameter choice that defines a modified partition fuzzy degree as a clustering validity function. Experimental results on a real data set illustrate the effectiveness of the proposed fuzzy CLOPE algorithm and of the parameter choice method based on the modified partition fuzzy degree.
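To make the parameter sensitivity concrete, the following minimal sketch evaluates the profit criterion of the standard (crisp) CLOPE algorithm, not the fuzzy variant proposed in the paper, whose details the abstract does not give. It assumes the usual CLOPE definitions: S(C) is the total number of item occurrences in a cluster, W(C) the number of distinct items, and r the repulsion parameter. The function names and the toy transactions are hypothetical.

```python
from collections import Counter


def cluster_stats(transactions):
    """Return (S, W, N) for one cluster of transactions (each a set of items)."""
    counts = Counter(item for t in transactions for item in set(t))
    size = sum(counts.values())  # S(C): total item occurrences
    width = len(counts)          # W(C): number of distinct items
    return size, width, len(transactions)


def clope_profit(clusters, r):
    """Profit of a partition under the standard CLOPE criterion (assumed form):
    sum_i [S(C_i) / W(C_i)**r * |C_i|] / sum_i |C_i|."""
    total = sum(len(c) for c in clusters)
    if total == 0:
        return 0.0
    gain = 0.0
    for c in clusters:
        size, width, n = cluster_stats(c)
        if width == 0:
            continue  # cluster is empty or holds only empty transactions
        gain += size / (width ** r) * n
    return gain / total


# Hypothetical toy data: two groups of transactions with disjoint items.
group_a = [{"a", "b"}, {"a", "b", "c"}]
group_b = [{"x", "y"}, {"x"}]
separated = clope_profit([group_a, group_b], r=2.0)
merged = clope_profit([group_a + group_b], r=2.0)
```

With a repulsion of 2.0 the separated partition scores higher than the merged one; lowering r weakens that preference, which is the kind of parameter dependence the abstract refers to.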
The toxicity and bioaccumulation of selenite were investigated in four microalgae, Spirulina platensis, Dunaliella salina, Dunaliella bardawil and Phaeodactylum tricornutum, cultured in the presence of selenite. Lower concentrations of selenite were generally nontoxic and frequently stimulated algal growth, while higher concentrations inhibited it. Selenite was more toxic to D. salina and D. bardawil than to S. platensis and P. tricornutum. All algae cultured in selenite were able to incorporate Se, to degrees that depended on the algal species. The distribution of selenite among intracellular macromolecular compounds also differed among species: most of the selenite was associated with proteins in S. platensis, D. salina and D. bardawil, but with lipids in P. tricornutum, reflecting physiological differences among the algae. These observations suggest that algae are able to accumulate selenite and bind it to intracellular macromolecular compounds when exposed to high concentrations of selenite. This may represent a form of storage or detoxification of selenite by the algae.
Wireless sensor networks (WSNs) can be used to collect data from their surroundings over multiple hops. Because sensor nodes have a constrained, non-rechargeable energy supply, energy efficiency is an important issue in designing the network topology. In this paper, energy consumption under different topologies is studied. We derive exact mathematical expressions for the energy consumption of the flat scheme and the clustering scheme, and then compare the energy consumption of the different schemes. The comparison shows that a multi-level clustering scheme is more energy efficient in large-scale networks. Simulation results support the analysis in terms of prolonging the lifetime of large-scale networks and achieving greater power reductions.
One of the most important problems in clustering is defining the number of classes. It is not easy to find an appropriate method for measuring whether a cluster configuration is acceptable. In this paper we propose a possible, non-automatic solution that considers different clustering criteria and compares their results. In this way the robust structures of an analyzed dataset can often be captured (or established), and an optimal cluster configuration that presents a meaningful association may be defined. We also focus on the variables that may be used in cluster analysis, since variables that contain little clustering information can lead to misleading and non-robust results. Three algorithms are therefore employed in this study: K-means partitioning, Partitioning Around Medoids (PAM) and the Heuristic Identification of Noisy Variables (HINoV). The results are compared with those of robust methods.
OBJECTIVE: To apply spectral clustering to analyze the patterns of symptoms in patients with chronic gastritis (CG). METHODS: Based on 919 CG subjects, we applied mutual information feature selection to choose the symptoms positively correlated with each pattern. Then, we used the Shi and Malik spectral clustering algorithm to select the top 20 correlated symptoms. RESULTS: We ascertained the results for six patterns. There were three categories for the pattern of accumulation of damp heat in the spleen-stomach (0.00332), six categories for the pattern of dampness obstructing the spleen-stomach (0.02466), two categories for the pattern of spleen-stomach Qi deficiency (0.01389), three categories for the pattern of spleen-stomach deficiency cold (0.00915), five categories for the pattern of liver-Qi stagnation (0.01910), and four categories for the pattern of stagnant heat in the liver-stomach (0.00585). CONCLUSION: Most of the spectral clustering results for the symptoms of CG patterns were in accordance with clinical experience and Traditional Chinese Medicine theory. Most categories suggested the nature and/or location of the disease.
Most earlier work on clustering focused on numeric data, whose inherent geometric properties can be exploited to define distance functions between data points naturally. However, data mining applications frequently involve datasets that consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to purely numeric data, the algorithm is identical to k-means. The main result of this paper is a method for updating the 'cluster centers' of objects described by mixed numeric and categorical attributes during the clustering process so as to minimize the clustering cost function. The clustering performance of the algorithm is demonstrated on two well-known data sets, the credit approval and abalone databases.
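The abstract does not state the similarity measure or the center update, so the following is only a sketch of one common way to combine the two attribute types, in the style of k-prototypes: squared Euclidean distance on the numeric part plus a weighted count of mismatches on the categorical part, with centers updated by the mean of each numeric attribute and the mode of each categorical attribute. The weight `gamma`, the function names, and the sample records are assumptions for illustration, not the paper's definitions.

```python
from collections import Counter


def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the number of mismatching categorical attributes."""
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric + gamma * categorical


def update_center(members):
    """Recompute a cluster center from its members.

    members: non-empty list of (numeric_tuple, categorical_tuple) pairs.
    Returns (means of numeric attributes, modes of categorical attributes).
    """
    numeric_rows = [m[0] for m in members]
    categorical_rows = [m[1] for m in members]
    means = tuple(sum(col) / len(col) for col in zip(*numeric_rows))
    modes = tuple(Counter(col).most_common(1)[0][0] for col in zip(*categorical_rows))
    return means, modes


# Hypothetical records: (numeric attributes, categorical attributes).
members = [((1.0, 10.0), ("red",)), ((3.0, 20.0), ("red",)), ((5.0, 30.0), ("blue",))]
center = update_center(members)
distance = mixed_dissimilarity((1.0, 2.0), ("a", "b"), (1.0, 4.0), ("a", "c"), gamma=0.5)
```

With no categorical attributes the dissimilarity reduces to the squared Euclidean distance and the center to the mean, which matches the abstract's remark that the algorithm coincides with k-means on numeric data.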
The author reviews some recent developments in Chern-Simons theory on a hyperbolic 3-manifold M with complex gauge group G, focusing on the case where G = SL(N, C) and M is a knot complement, M = S^3 \ K. The main result presented in this note is the cluster partition function, a computational tool that uses cluster algebra techniques to evaluate the Chern-Simons path integral for G = SL(N, C). The author also reviews various applications of the cluster partition function, open questions about it, and some of its relations to string theory.
Funding (fuzzy CLOPE paper): Supported by the National Natural Science Foundation of China (No. 60202004).
Funding (chronic gastritis spectral clustering study): Supported by the National Natural Science Foundation of China [The Patterns Differentiation Mode of Main TCM Clinical Symptoms Based on Complex System Method (No. 81270050); Information Extraction From TCM Inquiry and the Deducting Method of Patterns Differentiation Based on Feature Selection (No. 30901897); Common Syndrome Diagnosis of Traditional Chinese Medicine Based on the Integration of Four Diagnosis Methods (No. 81173199)]; the College Students' Scientific Innovation Foundation of Shanghai University of TCM [SHUTCMCXHDZ(2011)03]; and the Foundation for Training Talents of National Basic Scientific Research (No. J1103607).
Funding (Chern-Simons review): Supported by the U.S. Department of Energy (No. DE-SC0009988).