Abstract
Uncovering the regulatory mechanisms that operate within organisms is a major research topic in bioinformatics. The availability of various kinds of high-throughput biological data makes it possible to reconstruct gene regulatory networks on a genomic scale. Because any single data source provides only partial and noisy information about regulatory relationships, methods that integrate diverse data sources are expected to yield more reliable networks. This paper presents a new method that infers regulatory relationships by combining ChIP-chip data, transcription factor (TF) knockout data, and expression profiles measured under various conditions. ChIP-chip data provide evidence of direct physical binding between a TF and its target genes, and TF knockout data provide evidence of a functional relationship, so integrating the two is expected to achieve high prediction accuracy. However, the overlap between these two data types is usually low. Based on the assumption that co-regulated genes often have similar expression profiles, the method reduces, to some extent, the impact of this low overlap. Most of the regulatory relationships identified by the algorithm are validated by YEASTRACT, high-quality ChIP-chip data, and the literature, which indicates that the method predicts regulatory relationships with high accuracy. A comparison with other methods also shows that it has better predictive performance.
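The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that targets supported by both ChIP-chip and knockout evidence form a high-confidence core, and that a gene supported by only one of the two sources is kept when its expression profile correlates strongly with that core. The function names, the use of Pearson correlation, and the `min_corr` threshold of 0.8 are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def infer_targets(chip_targets, knockout_targets, expression, min_corr=0.8):
    """Return (core, rescued) target sets for one transcription factor.

    chip_targets / knockout_targets: sets of gene names from each data source.
    expression: dict mapping gene name -> list of expression values.
    min_corr: assumed threshold on mean correlation with the core genes.
    """
    # Core: genes supported by both physical binding and knockout evidence.
    core = chip_targets & knockout_targets
    rescued = set()
    core_profiles = [expression[g] for g in core if g in expression]
    if not core_profiles:
        return core, rescued

    # Candidates: genes supported by exactly one of the two sources.
    for gene in (chip_targets | knockout_targets) - core:
        if gene not in expression:
            continue
        corrs = [pearson(expression[gene], p) for p in core_profiles]
        # Keep the candidate if it is co-expressed with the core on average.
        if sum(corrs) / len(corrs) >= min_corr:
            rescued.add(gene)
    return core, rescued


# Toy data with made-up gene names and values.
expression = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [1.1, 2.0, 3.2, 3.9],  # tracks the core genes
    "G4": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with the core genes
}
core, rescued = infer_targets({"G1", "G2", "G3"}, {"G1", "G2", "G4"}, expression)
```

In this toy case the core is G1 and G2, G3 is rescued by its expression similarity to the core, and G4 is rejected. A real analysis would need to choose the similarity measure and threshold from the data rather than fix them in advance.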
Source
《生物化学与生物物理进展》
SCIE
CAS
CSCD
Peking University Core Journals (北大核心)
2010, Issue 9, pp. 996-1005 (10 pages)
Progress in Biochemistry and Biophysics
Funding
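National Basic Research Program of China (973 Program) (2009CB918404, 2006CB910700)
National High Technology Research and Development Program of China (863 Program) (2007AA02Z329)
International Cooperation Project (2007DFA31040)
National Natural Science Foundation of China (30700154, 31070746)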