Abstract
The widespread use of gene chips and high-throughput technologies has produced large volumes of gene expression data that carry rich biological information, and cluster analysis is an effective way to uncover the patterns and knowledge contained in them. However, gene expression data generally take the form of high-dimensional sparse matrices, so traditional one-way clustering struggles to find the useful local information hidden in them. To mine this local information more effectively, the concept of biclustering (double clustering) has been proposed. Unlike traditional clustering algorithms, it emphasizes clustering genes and conditions simultaneously.
Authors
罗德相
李飞龙
欧旭
LUO Dexiang; LI Feilong; OU Xu (School of Information and Management, Guangxi Medical University, Nanning, Guangxi 530021)
Source
《河南科技》
2018, No. 34, pp. 23-25 (3 pages)
Henan Science and Technology
Funding
Guangxi Medical University Youth Science Foundation Project (GXMUYSF201342)