Abstract
This paper studies algorithms for mining differential biclusters from gene microarray (gene expression) data. The genes in a differential bicluster show different expression levels in the two sample classes, so such biclusters can effectively identify the key experimental factors that influence gene expression levels, as well as the genes that are sensitive to the experimental conditions. Traditional biclustering methods mine clusters separately in each of the two classes and then compare them to obtain the final differential biclusters, a strategy with low time efficiency. To find differential biclusters quickly, the paper proposes a novel differential biclustering method based on a weighted graph. Its main innovation is that it mines biclusters directly on the weighted graph constructed from the two classes of data, which avoids the separate mine-then-compare steps. Experimental results confirm that the algorithm runs efficiently.
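The abstract does not give the construction of the weighted graph, so the following is only an illustrative sketch of the general idea of working on a single graph built from both classes, not the paper's algorithm. It assumes a differential co-expression graph: genes are vertices, and an edge weight is the absolute difference between the Pearson correlations of two genes in class A and in class B. The names `differential_graph`, `connected_gene_sets`, the threshold value, and the toy data are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def differential_graph(class_a, class_b, threshold):
    """Build one weighted graph from both classes (assumed construction).

    class_a, class_b: dict mapping gene name -> list of expression values.
    Keeps an edge (g, h) when the correlation of g and h differs between
    the two classes by at least `threshold`.
    """
    edges = {}
    for g, h in combinations(sorted(class_a), 2):
        w = abs(pearson(class_a[g], class_a[h]) - pearson(class_b[g], class_b[h]))
        if w >= threshold:
            edges[(g, h)] = w
    return edges


def connected_gene_sets(edges):
    """Return connected components of the graph as sorted gene lists."""
    adj = {}
    for g, h in edges:
        adj.setdefault(g, set()).add(h)
        adj.setdefault(h, set()).add(g)
    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(sorted(comp))
    return components


# Toy data: g1 and g2 move together in class A but in opposite directions in class B.
class_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [5, 5, 6, 5]}
class_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [5, 6, 5, 5]}

edges = differential_graph(class_a, class_b, threshold=1.0)
gene_sets = connected_gene_sets(edges)
print(gene_sets)
```

In this sketch the candidate gene sets come from one pass over a single graph, so neither class is clustered separately. A real differential biclustering method would also have to select the subset of conditions for each gene set, which this sketch leaves out.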
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journals (北大核心)
2011, No. 1, pp. 48-50, 53 (4 pages)
Funding
National Natural Science Foundation of China (60703105)
Natural Science Foundation of Shaanxi Province (2007F27)
Keywords
clustering
biclustering
differential weighted graph
subspace clustering