Abstract
A new clustering method based on the relationships between samples (patterns) is proposed. The relationships are obtained from gene expression data through the Pearson correlation coefficient and represented as a network. Relation features between samples are extracted from the structural features of this network, and clustering is performed in the resulting relation feature space. The method reveals the differences between samples of different classes more effectively, and its clustering space is of such low dimensionality that no dimensionality reduction is needed. The proposed method and existing clustering methods were each applied to real gene expression data; the results show that the proposed method achieves higher clustering accuracy and also performs well on data whose class distributions are intermixed.
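The pipeline described in the abstract (Pearson correlation between samples, a relation network, structural features, clustering in the feature space) can be illustrated with a minimal sketch. The abstract does not say how edges are defined or which structural features are used, so everything specific here is an assumption: the edge rule (keep correlations above a hypothetical `threshold`), the use of a standard spectral embedding of the network as a stand-in for the paper's structural features, and a plain k-means step. The function names are illustrative and are not taken from the paper.

```python
import numpy as np

def relation_network(X, threshold=0.5):
    """Weighted adjacency matrix of the sample relation network.

    X has one row per sample and one column per gene.
    Assumption: an edge exists when the Pearson correlation exceeds `threshold`.
    """
    R = np.corrcoef(X)                      # Pearson correlation between rows (samples)
    A = np.where(R > threshold, R, 0.0)     # keep only strong positive relations
    np.fill_diagonal(A, 0.0)                # no self-loops
    return A

def structure_features(A, n_features=2):
    """Low-dimensional per-sample features from the network structure.

    Assumption: a spectral embedding (top eigenvectors of the degree-normalized
    adjacency matrix) stands in for the paper's unspecified structural features.
    """
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    M = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    F = vecs[:, -n_features:]               # eigenvectors of the largest eigenvalues
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    return F / np.maximum(norms, 1e-12)     # row-normalize each sample's feature vector

def kmeans(F, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [F[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):         # leave a center unchanged if its cluster is empty
                centers[j] = F[labels == j].mean(axis=0)
    return labels

def cluster_samples(X, k, threshold=0.5):
    """Cluster samples in the relation feature space (k features for k clusters)."""
    A = relation_network(X, threshold)
    F = structure_features(A, n_features=k)
    return kmeans(F, k)
```

Calling `cluster_samples(X, k=2)` on a samples-by-genes matrix returns one integer label per sample. Clustering then happens in a space of only `k` dimensions regardless of the number of genes, which mirrors the abstract's point that no separate dimensionality reduction step is needed.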
Source
《西安电子科技大学学报》
EI
CAS
CSCD
PKU Core
2009, Issue 3, pp. 502-505, 534 (5 pages)
Journal of Xidian University
Funding
Supported by the National Natural Science Foundation of China (60371044)
Keywords
clustering
pattern relation network
structure feature
relation feature