Clustering of single-cell RNA sequencing data based on graph convolutional neural networks

Clustering of single-cell RNA sequencing data based on graph convolution

Abstract: A single cell expresses thousands of genes, so single-cell RNA sequencing (scRNA-seq) data suffer from the curse of dimensionality. Low RNA capture rates also cause expressed genes to go undetected, producing a large number of false zero counts and making the data highly sparse; such false zeros are called "dropout events". To address the high dimensionality of scRNA-seq data and the large amount of dropout noise, this paper integrates denoising and dimensionality reduction into the clustering task and proposes DGGAE, a clustering model based on graph convolutional neural networks. The model uses the negative log-likelihood of the zero-inflated negative binomial (ZINB) distribution as the loss function of a denoising autoencoder to handle dropout noise; uses a graph convolutional autoencoder to obtain low-dimensional features of the data; and uses the KL divergence as the clustering loss for deep embedded clustering. Experiments on 9 real high-dimensional, high-noise datasets show that DGGAE achieves better clustering performance than traditional clustering methods.
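The abstract names three building blocks (a ZINB negative log-likelihood, a graph convolution, and a KL-divergence clustering loss) without giving their formulas. As a rough illustration only, the sketch below writes out common textbook forms of each in plain Python. Every function name here is hypothetical, and the ZINB parameterization (mean `mu`, dispersion `theta`, dropout probability `pi`), the Kipf-Welling-style adjacency normalization, and the Student's t kernel with sharpened targets from deep embedded clustering are assumptions, not details confirmed by the paper.

```python
import math

# Illustrative sketch only: these are common textbook forms,
# not formulas confirmed by the DGGAE paper.

def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a zero-inflated NB.
    pi is the probability of an extra (dropout) zero, 0 <= pi < 1."""
    log_nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a dropout or a genuine NB zero.
        return -math.log(pi + (1.0 - pi) * math.exp(log_nb))
    return -(math.log(1.0 - pi) + log_nb)

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (assumed form)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(adj_norm, h, w):
    """One graph convolution step: ReLU(adj_norm @ h @ w)."""
    out = matmul(matmul(adj_norm, h), w)
    return [[max(0.0, v) for v in row] for row in out]

def soft_assign(z, centers):
    """Soft cluster assignments q_ij from a Student's t kernel (1 d.o.f.)."""
    q = []
    for zi in z:
        w = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, c)))
             for c in centers]
        total = sum(w)
        q.append([v / total for v in w])
    return q

def target_distribution(q):
    """Sharpened targets p_ij proportional to q_ij^2 / f_j, f_j = sum_i q_ij."""
    f = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(len(row))]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_clustering_loss(p, q):
    """KL(P || Q) summed over all cells and clusters."""
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0)

# Toy usage with made-up numbers: three embedded cells, two cluster centers.
z = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

In a full model these quantities would be computed on tensors with automatic differentiation and averaged over cells and genes; the nested-list version only makes the arithmetic explicit.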

Authors: KONG Chenxi; LU Daying (School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, Shandong, PRC)

Affiliation: School of Cyber Science and Engineering, Qufu Normal University

Source: Journal of Qufu Normal University (Natural Science), CAS, 2024, No. 4, pp. 83-89 (7 pages)

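Funding: Shandong Province Higher Education Science and Technology Program (J17KA062); Ministry of Education Industry-University Cooperative Education Project (201602028014); Shandong Province Graduate Education Quality Improvement Program (SDYKC19183).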

Keywords: graph convolutional neural network; dimensionality reduction; noise reduction; clustering; autoencoder

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
