A Novel Approach to Revealing Positive and Negative Co-Regulated Genes

Abstract: As biologists have observed, there is a real and emerging need to identify co-regulated gene clusters, which include both positively and negatively regulated gene clusters. However, the existing pattern-based and tendency-based clustering approaches are designed only for finding positively regulated gene clusters. In this paper, a new subspace clustering model called g-Cluster is proposed for gene expression data. The proposed model has the following advantages: 1) it finds both positively and negatively co-regulated genes in a single pass, 2) it removes the restriction of a magnitude transformation relationship among co-regulated genes, and 3) it guarantees the quality of clusters and the significance of regulations by using a novel similarity measurement, gCode, and a user-specified regulation threshold δ, respectively. No previous work accomplishes this task. Moreover, the MDL (minimum description length) technique is introduced to avoid generating insignificant g-Clusters. A tree structure, namely the GS-tree, is also designed, and two algorithms, combined with efficient pruning and optimization strategies, are developed to identify all qualified g-Clusters. Extensive experiments are conducted on real and synthetic datasets. The experimental results show that 1) the algorithm is able to find a number of co-regulated gene clusters missed by previous models, which are potentially of high biological significance, and 2) the algorithms are effective and efficient, and outperform the existing approaches.
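The abstract does not define gCode, so the following is only an illustrative sketch of the general idea of positive versus negative co-regulation under a regulation threshold δ, not the paper's measure. It assumes that each gene is given as a list of expression values over the same ordered conditions, and that a change between adjacent conditions counts as significant only when its magnitude exceeds δ. The function names `regulation_signs` and `co_regulation` are hypothetical.

```python
from typing import List, Optional


def regulation_signs(expr: List[float], delta: float) -> List[int]:
    """Encode each step between adjacent conditions as +1 (up-regulated),
    -1 (down-regulated), or 0 (change not larger than delta)."""
    signs = []
    for prev, curr in zip(expr, expr[1:]):
        diff = curr - prev
        if diff > delta:
            signs.append(1)
        elif diff < -delta:
            signs.append(-1)
        else:
            signs.append(0)
    return signs


def co_regulation(a: List[float], b: List[float], delta: float) -> Optional[str]:
    """Return 'positive' if both genes move in the same direction at every
    step, 'negative' if they move in opposite directions at every step,
    and None otherwise (including when any step is insignificant)."""
    if len(a) != len(b):
        raise ValueError("both genes must be measured over the same conditions")
    sa = regulation_signs(a, delta)
    sb = regulation_signs(b, delta)
    if 0 in sa or 0 in sb:
        return None  # at least one change does not exceed the threshold
    if sa == sb:
        return "positive"
    if sa == [-s for s in sb]:
        return "negative"
    return None


if __name__ == "__main__":
    g1 = [1.0, 3.0, 2.0, 5.0]
    g2 = [10.0, 30.0, 12.0, 40.0]  # same directions, very different magnitudes
    g3 = [9.0, 4.0, 8.0, 1.0]      # opposite directions at every step
    print(co_regulation(g1, g2, 0.5))
    print(co_regulation(g1, g3, 0.5))
```

Because only the direction of each significant change is compared, the sketch does not require any fixed magnitude relationship (such as a shift or a scaling) between the two genes, which is the kind of restriction the abstract says the model avoids.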

Authors: Zhao Yuhai (赵宇海), Wang Guoren (王国仁), Yin Ying (印莹), Xu Guangyu (许光宇)

Affiliation: Department of Computer Science and Engineering

Source: Journal of Computer Science & Technology (English edition; indexed in SCIE, EI, CSCD), 2007, Issue 2, pp. 261-272 (12 pages)

Funding: This work is supported by the National Grand Fundamental Research 973 Program of China (Grant No. 2006CB303103) and the National Natural Science Foundation of China under Grants No. 60573089, No. 60273079 and No. 60473074.

Keywords: microarray data, pattern-based clustering, co-regulated genes

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]


