
Parallel Biclustering Algorithm for Gene Expression Data (cited by: 2)
Abstract: Biclustering of gene expression data is an important problem in bioinformatics. Biclustering the expression data of genes measured under different experimental conditions makes it possible to analyze and identify the gene functions and transcriptional regulatory elements that genes of the same class share. This paper studies the biclustering of gene expression data and presents a parallel algorithm. The algorithm exploits the anti-monotonicity of bicluster quality with respect to data set size: it starts from the smallest data sets, each consisting of two rows and two columns of the data matrix, and gradually adds rows or columns until it has found all biclusters that satisfy the quality condition. Experimental results show that the algorithm is fast, produces high-quality biclusters, and clearly outperforms other algorithms of the same kind.
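Authors: Liu Wei (刘维), Chen Ling (陈崚)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 4, pp. 683-689 (7 pages)
Funding: National Natural Science Foundation of China (60473012); National Key Science and Technology Program (2003BA614A-14); Natural Science Foundation of Jiangsu Province (BK2005047)
Keywords: gene expression data; parallel algorithm; bioinformatics; biclustering; sequence comparison; scalability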
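
The bottom-up search outlined in the abstract can be illustrated with a minimal sequential sketch. The abstract does not say which quality measure the paper uses or how the work is split across processors, so everything specific here is an assumption: the sketch borrows the pScore test from pattern-similarity clustering (every 2x2 submatrix must have pScore at most `delta`), which is anti-monotone because any submatrix of a valid bicluster is also valid. The names `pscore`, `is_valid`, `grow_biclusters`, and `delta` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def pscore(data, r1, r2, c1, c2):
    """pScore of the 2x2 submatrix on rows (r1, r2) and columns (c1, c2)."""
    return abs((data[r1][c1] - data[r1][c2]) - (data[r2][c1] - data[r2][c2]))


def is_valid(data, rows, cols, delta):
    """Assumed quality test: every 2x2 submatrix has pScore <= delta."""
    return all(
        pscore(data, r1, r2, c1, c2) <= delta
        for r1, r2 in combinations(rows, 2)
        for c1, c2 in combinations(cols, 2)
    )


def grow_biclusters(data, delta):
    """Seed with all valid 2x2 sets, then add one row or column at a time.

    Returns the biclusters that cannot be extended by any single row or
    column. By anti-monotonicity these are exactly the maximal ones.
    """
    all_rows = frozenset(range(len(data)))
    all_cols = frozenset(range(len(data[0])))

    # Smallest data sets: every pair of rows with every pair of columns.
    frontier = {
        (frozenset(rp), frozenset(cp))
        for rp in combinations(all_rows, 2)
        for cp in combinations(all_cols, 2)
        if is_valid(data, rp, cp, delta)
    }
    seen = set(frontier)
    maximal = set()

    while frontier:
        next_frontier = set()
        for rows, cols in frontier:
            # Candidate extensions: one extra row or one extra column.
            candidates = [(rows | {r}, cols) for r in all_rows - rows]
            candidates += [(rows, cols | {c}) for c in all_cols - cols]
            valid = [
                cand for cand in candidates
                if is_valid(data, cand[0], cand[1], delta)
            ]
            if not valid:
                maximal.add((rows, cols))
            for cand in valid:
                if cand not in seen:
                    seen.add(cand)
                    next_frontier.add(cand)
        frontier = next_frontier
    return maximal


if __name__ == "__main__":
    # Toy expression matrix: rows are genes, columns are conditions.
    expression = [
        [1, 2, 3, 10],
        [2, 3, 4, 0],
        [5, 6, 7, 3],
        [0, 9, 1, 4],
    ]
    for rows, cols in grow_biclusters(expression, delta=0):
        print(sorted(rows), sorted(cols))
```

The sketch re-checks every 2x2 submatrix of each candidate and is exponential in the worst case, so it is suitable only for toy matrices. Because each seed is grown independently, the seeds could in principle be partitioned among workers, but that is a guess at how a parallel version might be organized, not the paper's scheme.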

