Funding: Supported by the National Natural Science Foundation of China (No. 32070656); the Nanjing University Deng Feng Scholars Program; the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions; the China Postdoctoral Science Foundation funded project (No. 2022M711563); and the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2022ZB50).
Abstract: Plant morphogenesis relies on precise gene expression programs at the proper time and position, which are orchestrated by transcription factors (TFs) in intricate regulatory networks in a cell-type-specific manner. Here we introduce a comprehensive single-cell transcriptomic atlas of Arabidopsis seedlings. This atlas is the result of meticulous integration of 63 previously published scRNA-seq datasets, addressing batch effects and conserving biological variance. This integration spans a broad spectrum of tissues, including both below- and above-ground parts. Utilizing a rigorous approach for cell type annotation, we identified 47 distinct cell types or states, largely expanding our current view of plant cell compositions. We systematically constructed cell-type-specific gene regulatory networks and uncovered key regulators that act in a coordinated manner to control cell-type-specific gene expression. Taken together, our study not only offers extensive plant cell atlas exploration that serves as a valuable resource, but also provides molecular insights into gene-regulatory programs that vary across cell types.
Funding: Supported by grants from the Beijing Advanced Innovation Center for Genomics at Peking University; the Key Technologies R&D Program (Grant No. 2016YFC0900100) of the Ministry of Science and Technology of China; and the National Natural Science Foundation of China (Grant Nos. 81573022 and 31530036).
Abstract: Clustering is a prevalent means of analyzing single-cell RNA sequencing (scRNA-seq) data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are urgently needed. Here we propose Spearman subsampling-clustering-classification (SSCC), a new clustering framework based on random projection and feature construction, for large-scale scRNA-seq data. SSCC greatly improves clustering accuracy, robustness, and computational efficacy for various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset with 68,578 human blood cells, SSCC achieved a 20% improvement in clustering accuracy and a 50-fold acceleration while consuming only 66% of the memory, compared to the widely used software package SC3. Compared to k-means, the accuracy improvement of SSCC can reach 3-fold. An R implementation of SSCC is available at https://github.com/Japrin/sscClust.
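The subsample, cluster, classify idea named in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the sscClust implementation: the function names are hypothetical, the features are Spearman correlations against a random subsample of cells, a small k-means stands in for whichever clustering algorithm is plugged in, and nearest-centroid assignment stands in for the classifier.

```python
import numpy as np

def rank_rows(x):
    # Rank-transform each row (ties broken by position; adequate for a sketch).
    return np.argsort(np.argsort(x, axis=1), axis=1).astype(float)

def spearman_features(x, ref):
    # Spearman correlation of each row of x with each row of ref,
    # computed as the Pearson correlation of the rank-transformed rows.
    rx, rr = rank_rows(x), rank_rows(ref)
    rx -= rx.mean(axis=1, keepdims=True)
    rr -= rr.mean(axis=1, keepdims=True)
    rx /= np.linalg.norm(rx, axis=1, keepdims=True)
    rr /= np.linalg.norm(rr, axis=1, keepdims=True)
    return rx @ rr.T

def nearest(x, centers):
    # Index of the closest center (squared Euclidean distance) for each row.
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(x, k, n_iter=20):
    # Farthest-point initialisation keeps the sketch deterministic.
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = nearest(x, centers)
        centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return centers

def sscc_sketch(expr, k, n_sub, seed=0):
    # expr: cells x genes matrix. Returns one cluster label per cell.
    rng = np.random.default_rng(seed)
    sub = rng.choice(len(expr), size=n_sub, replace=False)  # 1. subsample cells
    feats = spearman_features(expr, expr[sub])              # 2. construct features
    centers = kmeans(feats[sub], k)                         # 3. cluster the subsample only
    return nearest(feats, centers)                          # 4. classify every cell
```

The expensive clustering step only ever sees `n_sub` cells, which is where a subsampling scheme of this kind would be expected to save time and memory; the remaining cells are labeled by a cheap classification pass.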
Funding: Supported by the National Key R&D Program of China under Grant No. 2018YFC0910405 and the National Natural Science Foundation of China under Grants No. 61922020, No. 61771331, and No. 91935302.
Abstract: The outbreak of coronavirus disease 2019 (COVID-19) has drawn public attention all over the world. As a newly emerging area, single-cell sequencing has also shown its power in the battle against the epidemic. In this review, the up-to-date knowledge of COVID-19 and its receptor is summarized, followed by a survey of studies that mine single-cell transcriptome profiling data for information on the vulnerable cell types in humans and the potential mechanisms of the disease.
Funding: Partially supported by NIH grants (2U19AI090023, 5P30AI50409, and R01GM122083).
Abstract: The study of gene expression has arguably been the most active research field in functional genomics. Over the last two decades, various high-throughput technologies, from gene expression microarrays to RNA-seq, have been widely applied to whole-genome profiling of gene expression. The commonality of these experiments is that they measure the gene expression levels of a "bulk" sample, which pools a large number (often on the scale of millions) of cells, and thus the measurements reflect the average expression
Funding: Supported by Baylor Research Institute (USA) start-up funding to WL.
Abstract: The rapid growth of single-cell RNA-seq (scRNA-seq) studies demands efficient data storage, processing, and analysis. Big-data technology provides a framework that facilitates the comprehensive discovery of biological signals from inter-institutional scRNA-seq datasets. Strategies for handling the stochastic and heterogeneous single-cell transcriptome signal are discussed in this article. After extensively reviewing the available big-data applications of next-generation sequencing (NGS)-based studies, we propose a workflow that accounts for the unique characteristics of scRNA-seq data and the primary objectives of single-cell studies.
Abstract: Background: Video recording of cells offers a straightforward way to gain valuable information from their response to treatments. An indispensable step in obtaining such information involves tracking individual cells from the recorded data. A subsequent step is reducing such data to represent essential biological information. This can help to compare various single-cell tracking data, yielding a novel source of information. The vast array of potential data sources highlights the significance of methodologies prioritizing simplicity, robustness, transparency, affordability, sensor independence, and freedom from reliance on specific software or online services. Methods: The provided data present single-cell tracking of clonal (A549) cells as they grow in two-dimensional (2D) monolayers over 94 hours, spanning several cell cycles. The cells are exposed to three different concentrations of yessotoxin (YTX). The data treatments showcase the parametrization of population growth curves, as well as other statistical descriptions. These include the temporal development of cell speed in family trees with and without cell death, correlations between sister cells, single-cell average displacements, and the study of clustering tendencies. Results: Various statistics obtained from single-cell tracking reveal patterns suitable for data compression and parametrization. These statistics encompass essential aspects such as cell division, movements, and mutual information between sister cells. Conclusion: This work presents practical examples that highlight the abundant potential information within large sets of single-cell tracking data. Data reduction is crucial in the process of acquiring such information, which can be relevant for phenotypic drug discovery and therapeutics, extending beyond standardized procedures. Conducting meaningful big data analysis typically necessitates a substantial amount of data, which can stem from standalone case studies as an initial foundation.
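As a hedged illustration of what parametrizing a population growth curve can look like, the sketch below reduces a series of cell counts to two parameters by fitting a straight line to the logarithm of the counts. The exponential model, the function name, and the numbers are assumptions chosen for illustration; the abstract does not state which model the authors used, and the counts are synthetic rather than taken from the study.

```python
import numpy as np

def fit_exponential_growth(hours, counts):
    # Fit N(t) = N0 * exp(r * t) with a least-squares line through (t, log N).
    t = np.asarray(hours, dtype=float)
    log_n = np.log(np.asarray(counts, dtype=float))
    rate, log_n0 = np.polyfit(t, log_n, 1)  # slope first, then intercept
    doubling_time = np.log(2.0) / rate      # only meaningful when rate > 0
    return np.exp(log_n0), rate, doubling_time

# Synthetic, illustrative counts only (not data from the study):
hours = np.arange(0, 95, 6)                    # every 6 h inside a 94 h window
true_n0, true_rate = 50.0, np.log(2.0) / 24.0  # assumed 24 h doubling time
counts = true_n0 * np.exp(true_rate * hours)

n0, rate, doubling_time = fit_exponential_growth(hours, counts)
print(f"N0 = {n0:.1f} cells, r = {rate:.4f} per hour, doubling time = {doubling_time:.1f} h")
```

Two numbers per condition (initial count and growth rate) are then enough to compare treatments, which is the kind of data reduction the abstract refers to. A declining population gives a negative rate, in which case the doubling time should be read as a halving time.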
Funding: Supported by the National Key Research and Development Program of China Stem Cell and Translational Research, China (2017YFA0104800); the Research Funds from the Health@InnoHK Program launched by the Innovation Technology Commission of the Hong Kong SAR, China; the National Natural Science Foundation of China (81570944 and 92068201); the Science and Technology Planning Project of Guangdong Province, China (2020B1212060052); the High-level Hospital Construction Project (DFJHBF202110); the Youth Innovation Promotion of the Chinese Academy of Sciences (2019348); and Guangzhou Key Medical Disciplines (2021–2023).
Abstract: High-resolution spatiotemporal relationships during odontogenesis remain poorly understood. We report a cell lineage and atlas of developing mouse teeth. We performed large-scale (92,688 cells) single-cell RNA sequencing, tracing the cell trajectories during odontogenesis from embryonic days 10.5 to 16.5. Combined with an assay for transposase-accessible chromatin with high-throughput sequencing, our results suggest that mesenchymal cells show specific transcriptome profiles that distinguish the tooth types. Subsequently, we identified key gene regulatory networks in tooth and bone formation and uncovered spatiotemporal patterns of odontogenic mesenchymal cells. CD24+ and Plac8+ cells from the mesenchyme at the bell stage were distributed in the upper half and the preodontoblast layer of the dental papilla, respectively, and each could individually induce nonodontogenic epithelia to form tooth-like structures. Specifically, the Plac8+ tissue we discovered is the smallest piece with the most homogeneous cells that has been shown to induce tooth regeneration to date. Our work reveals previously unknown heterogeneity and spatiotemporal patterns of tooth germs that may lead to tooth regeneration for regenerative dentistry.
Funding: Supported by the National Key Research and Development Program of China (2018YFC1312703); the CAMS Innovation Fund for Medical Sciences (CIFMS, 2016-12M1–006); the National Natural Science Foundation of China (81630014, 81825002, 81800367, 81870318, 81670379); and the Beijing Outstanding Young Scientist Program (BJJWZYJH01201910023029).
Abstract: Although some co-risk factors and hemodynamic alterations are involved in hypertension progression, their direct biomechanical effects are unclear. Here, we constructed a high-hydrostatic-pressure cell-culture system to imitate constant hypertension and identified novel molecular classifications of human aortic smooth muscle cells (HASMCs) by single-cell transcriptome analysis. Under 100-mmHg (analogous to healthy human blood pressure) or 200-mmHg (analogous to hypertension) hydrostatic pressure for 48 h, HASMCs showed six distinct vascular SMC (VSMC) clusters according to differential gene expression and gene ontology enrichment analysis. Notably, two novel HASMC subsets were identified: the inflammatory subset, with CXCL2, CXCL3 and CCL2 as markers, and the endothelial-function inhibitory subset, with AKR1C2, AKR1C3 and SERPINF1 as markers. The inflammatory subset promoted CXCL2, CXCL3 and CCL2 chemokine expression and secretion, triggering monocyte migration; the endothelial-function inhibitory subset secreted SERPINF1 and accelerated prostaglandin F2α generation to inhibit angiogenesis. The expression of the two VSMC subsets was greatly increased in arterial media from patients with hypertension and in experimental animal models of hypertension. Collectively, we found that high hydrostatic pressure directly drives VSMCs into two new subsets, promoting or exacerbating endothelial dysfunction and thereby contributing to the pathogenesis of cardiovascular diseases.
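Abstract: Single-cell multi-omics sequencing is being widely applied in biomedical research and generates large volumes of diverse omics data. However, raw single-cell multi-omics data contain multiple types of sequencing noise and redundant information, which hampers downstream biomedical analysis. Existing denoising methods mainly rely on a single data-distribution assumption and are tailored to individual omics data types, which severely limits a model's ability to process different omics data jointly. This study proposes a denoising method for single-cell multi-omics data, called scMAED (single-cell multi-omics data via a multi-head autoencoder network to denoising). The model adds a classification decoder to a multi-head autoencoder network in order to remove as much noise as possible in an unsupervised manner. First, two encoders independently learn the internal features of the multi-omics data, and their low-dimensional outputs are combined for joint decoding. Second, the classification decoder makes no assumption about the data distribution and uses predicted cell-cluster labels to feed back information about the data, so that complex noise is removed as far as possible. Finally, principal component analysis and t-SNE are used for visualization. The model was evaluated on simulated datasets and a real mouse dataset; the results show that scMAED outperforms the comparison methods in denoising and can greatly improve the quality of single-cell multi-omics data.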
Funding: Supported by grants from the National Key Research and Development Program of China (2017YFA0103601) and the National Natural Science Foundation of China (31988101 and 31730056 to YGC).
Abstract: As an important part of the stomach, the gastric antrum secretes gastrin, which regulates acid secretion and gastric emptying. Although most cell types in the gastric antrum have been identified, cell composition and gene expression in the gastric antrum have not been compared across species. In this study, we collected antral epithelial tissues from human, pig, rat and mouse for scRNA-seq and compared cell types and gene expression among species. In the pig antral epithelium, we identified a novel cell cluster marked by high expression of AQP5, F3, CLCA1 and RRAD. We also discovered that the porcine antral epithelium has stronger immune function than that of the other species. Further analysis revealed that this may be due to the insufficient function of porcine immune cells. Together, our results supplement the information available on the gastric antral epithelium of multiple species at the single-cell level and provide resources for understanding the homeostasis maintenance and regeneration of the gastric antral epithelium.
Funding: We thank Yanfei Yang for helping us to prepare the graphical abstract. This work was supported by the National Natural Science Foundation of China (No. 31600960) and the National Key R&D Program of China (2016YFC1303100 and 2016YFC0901700).
Abstract: Background: Mammalian brains are composed of a large number of specialized cell types with diverse molecular compositions, functions and differentiation potentials. The application of recently developed single-cell RNA sequencing (scRNA-seq) technology in this field has provided new insights into this sophisticated system, deepened our understanding of cell type diversity and led to the discovery of novel cell types. Results: Here we review recent progress in applying this technology to the study of brain cell heterogeneity, adult neurogenesis and brain tumors, and then discuss some current limitations and future directions of using scRNA-seq to investigate the nervous system. Conclusions: We believe the application of single-cell RNA sequencing in neuroscience will accelerate the progress of big brain projects.