scLM: Automatic Detection of Consensus Gene Clusters Across Multiple Single-cell Datasets (cited by: 2)

Abstract: In gene expression profiling studies, including single-cell RNA sequencing (scRNA-seq) analyses, the identification and characterization of co-expressed genes provide critical information on cell identity and function. Gene co-expression clustering in scRNA-seq data presents certain challenges. We show that commonly used methods for single-cell data are not capable of identifying co-expressed genes accurately, and produce results that fall substantially short of the biological expectations for co-expressed genes. Herein, we present single-cell Latent-variable Model (scLM), a gene co-clustering algorithm tailored to single-cell data that performs well at detecting gene clusters with significant biologic context. Importantly, scLM can simultaneously cluster multiple single-cell datasets, i.e., consensus clustering, enabling users to leverage single-cell data from multiple sources for novel comparative analysis. scLM takes raw count data as input and preserves biological variation without being influenced by batch effects from multiple datasets. Results from both simulation data and experimental data demonstrate that scLM outperforms the existing methods with considerably improved accuracy. To illustrate the biological insights of scLM, we apply it to our in-house and public experimental scRNA-seq datasets. scLM identifies novel functional gene modules and refines cell states, which facilitates mechanism discovery and understanding of complex biosystems such as cancers. A user-friendly R package with all the key features of the scLM method is available at https://github.com/QSong-github/scLM.

Authors: Qianqian Song, Jing Su, Lance D. Miller, Wei Zhang

Affiliations: Center for Cancer Genomics and Precision Oncology; Department of Cancer Biology; Department of Biostatistics

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报（英文版）), indexed in SCIE, CAS, and CSCD; 2021, Issue 2, pp. 330-341 (12 pages)

Funding: The Cancer Genomics, Tumor Tissue Repository, and Bioinformatics Shared Resources under the NCI Cancer Center Support Grant to the Comprehensive Cancer Center of Wake Forest University Health Sciences, USA (Grant No. P30CA012197).

Keywords: Single-cell RNA sequencing; Consensus clustering; Latent space; Markov Chain Monte Carlo; Maximum likelihood approach

Classification: Q811.4 [Biology: Bioengineering]

