Abstract
Single-particle cryo-electron microscopy (cryo-EM) is widely used to determine the structures of biological macromolecules, and clustering of the particle images is an important step in three-dimensional reconstruction. Clustering is challenging because cryo-EM images have an extremely low signal-to-noise ratio (SNR) and the datasets are very large. Based on these characteristics, we propose CL-Clustering, an unsupervised clustering algorithm for single-particle cryo-EM images based on contrastive learning. The images first undergo data augmentation tailored to cryo-EM data; the augmented images are then used to train an encoder by contrastive learning; finally, K-means++ clusters the features extracted by the encoder to produce class labels. The method requires no pre-training on synthetic datasets and no 2D image alignment during the clustering iterations. On three simulated datasets across a range of SNR levels, CL-Clustering improves clustering accuracy by about 10% on average over the widely used maximum-likelihood 2D classification (ML2D) algorithm. The method was also applied to real cryo-EM images, from which a high-resolution 3D structure was successfully reconstructed.
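The last step of the pipeline, clustering encoder features with K-means++, can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption rather than the authors' code: the function names (`kmeans_pp_init`, `kmeans`, `toy_features`) are hypothetical, and 2-D Gaussian toy vectors stand in for the features that a contrastively trained encoder would produce.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: the first center is uniform at random; each later
    center is drawn with probability proportional to its squared distance
    from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def kmeans(points, k, n_iter=50, seed=0):
    """Lloyd iterations started from K-means++ seeds.
    Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def toy_features(seed=1):
    """Hypothetical stand-in for encoder output: two well-separated blobs."""
    rng = random.Random(seed)
    blob_a = [[rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)] for _ in range(20)]
    blob_b = [[rng.gauss(5.0, 0.1), rng.gauss(5.0, 0.1)] for _ in range(20)]
    return blob_a + blob_b


features = toy_features()
labels, centers = kmeans(features, k=2)
```

In practice the feature vectors would be high-dimensional encoder outputs and a library implementation of K-means++ would likely be preferable; the sketch only shows how the seeding rule and the assignment/update loop fit together.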
Authors
颜阳
郑清炳
张东旭
李少伟
葛胜祥
张军
夏宁邵
YAN Yang; ZHENG Qingbing; ZHANG Dongxu; LI Shaowei; GE Shengxiang; ZHANG Jun; XIA Ningshao (School of Public Health, Xiamen University, Xiamen 361102, China)
Source
《厦门大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2022, No. 6, pp. 1053-1061 (9 pages)
Journal of Xiamen University: Natural Science
Funding
Natural Science Foundation of Fujian Province (2019J05018).
Keywords
deep learning
clustering
cryo-EM
contrastive learning