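Abstract
The spatiotemporal distribution of membrane proteins on the cell membrane determines their activity states and functions and plays an important role in regulating cellular processes. Single-molecule localization microscopy (SMLM) makes it possible to resolve the spatial distribution of membrane proteins at the nanoscale, but the large gain in resolution places higher demands on accurate clustering segmentation of the images. Density-based spatial clustering of applications with noise (DBSCAN) is one of the commonly used clustering methods, but it often performs poorly on SMLM super-resolution images in which membrane proteins are unevenly distributed. This paper proposes a hybrid clustering algorithm that combines multiple DBSCAN passes with hierarchical clustering. The algorithm uses DBSCAN as the basis for segmentation and adds area-threshold analysis and hierarchical clustering, so that the point clusters in a super-resolution image are identified accurately while the multiple localization signals within each cluster are retained. On simulated datasets and experimental data, its performance on metrics such as the silhouette coefficient is generally better than that of the traditional DBSCAN algorithm. This hybrid clustering method offers a new approach to the clustering segmentation of SMLM super-resolution images of membrane proteins and should support more precise analysis of their nanoscale spatial distribution.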
Objective A variety of functional proteins are localized on the cell membrane and participate in many crucial cellular processes, such as signal transduction and transmembrane transport. The spatiotemporal distribution of specific membrane proteins largely determines their activity states and functions. Both the sizes of membrane proteins and the distances between them are on the nanometer scale. Owing to the diffraction limit, traditional optical microscopy cannot resolve the spatial distribution of membrane proteins at the single-molecule level, so imaging techniques with strong specificity and high resolution are needed to reveal it precisely. Single-molecule localization microscopy (SMLM) offers new opportunities to resolve the detailed distribution of membrane proteins at the nanoscale, but the large improvement in spatial resolution also places higher demands on accurate clustering segmentation of the images. Density-based spatial clustering of applications with noise (DBSCAN) is one of the most commonly used clustering methods; however, it performs relatively poorly on SMLM images of membrane proteins with heterogeneous density. To address this issue, we propose a novel clustering method that combines a multistep DBSCAN with a hierarchical clustering algorithm. The method builds on traditional DBSCAN and adds area threshold analysis and hierarchical clustering.

Methods In the present work, we improved the traditional DBSCAN method by introducing a variable neighborhood radius and hierarchical clustering to perform precise clustering segmentation of the original image (Fig. 2). First, we ran DBSCAN with a relatively large parameter pair (ε1, M1). With these parameters, excessively discrete points in the original image were removed as noise points, while some closely spaced point clusters were merged together. Next, the area of each preliminarily identified cluster was calculated and divided by the average area for normalization. Based on the normalized values, we selected an appropriate threshold for extracting clusters with a relatively large area. A secondary DBSCAN was then performed with a smaller or equal radius (ε2, M1; ε2 ≤ ε1). For each point cluster extracted by the area threshold, the calculation was looped from ε2 to ε1, and the value that yielded the maximum number of separable point clusters during the loop was selected as the clustering parameter for the subsequent hierarchical clustering. Finally, we combined the two DBSCAN results to obtain the final clustering segmentation result.

Results and Discussions We tested the improved clustering method on both simulated and experimental SMLM data. For the simulated data, we chose the D31 and S2 datasets from previous studies as test objects (Fig. 4). On the D31 dataset, the purity of the improved method was 95.64% (86.52% for the traditional DBSCAN method), and the adjusted Rand index was 0.9186 (0.6463 for the traditional DBSCAN method). The silhouette coefficient and noise ratio were also used to analyze the two datasets. Compared with the traditional DBSCAN method, the silhouette coefficient of the improved method increased significantly and the noise ratio decreased (Table 1). On the S2 dataset, the improved method also segmented more accurately than the traditional DBSCAN method: its purity was 95.52% (77.38% for the traditional DBSCAN method), and the adjusted Rand index was 0.9128 (0.6777 for the traditional DBSCAN method). The silhouette coefficient increased and the noise ratio decreased (Table 1). For the experimental SMLM data, we tested the improved method on uniform, random, and nonuniform SMLM images of membrane proteins (Fig. 5). Here too, the improved clustering method showed a higher accuracy and silhouette coefficient and a lower noise ratio (Table 1). However, the improved clustering method took longer to run than the traditional DBSCAN method on both the simulated and experimental datasets (Table 1).

Conclusions Based on the characteristics of the point clusters in SMLM images of membrane proteins, we proposed a novel clustering method that combines area threshold segmentation and multistep clustering segmentation on top of the traditional DBSCAN algorithm. When the method was applied to the segmentation of simulated datasets and experimental SMLM data of membrane proteins, the resulting metrics, including purity, adjusted Rand index, silhouette coefficient, and noise ratio, were generally improved compared with those of the traditional DBSCAN method. While maintaining accurate cluster recognition in super-resolution images and a degree of noise reduction, the method preserves the localization information of each cluster as far as possible. It exhibits good clustering segmentation ability, especially for SMLM images of membrane proteins with heterogeneous densities. This improved clustering method provides a new approach to the segmentation of membrane protein SMLM images and is expected to facilitate research into the nanoscale spatial distribution of various membrane proteins.
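The two-pass procedure described under Methods can be sketched in plain Python as follows. This is a minimal illustration under several assumptions, not the authors' implementation: cluster area is taken to be the convex-hull area, the ε scan uses a fixed number of evenly spaced steps, and the final hierarchical clustering stage is not reproduced (points that the second pass leaves unassigned are simply marked as noise). The names `dbscan`, `hull_area`, `hybrid_cluster`, `area_ratio`, and `n_steps`, as well as the toy data, are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:      # not a core point (count includes itself)
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:              # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts: # only core points expand the cluster
                queue.extend(neighbors[j])
    return labels

def hull_area(pts):
    """Convex-hull area (monotone chain + shoelace); an assumed area measure."""
    pts = sorted(set(pts))
    if len(pts) < 3:
        return 0.0
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        h = []
        for p in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h[:-1]
    hull = half(pts) + half(reversed(pts))
    return 0.5 * abs(sum(a[0] * b[1] - b[0] * a[1]
                         for a, b in zip(hull, hull[1:] + hull[:1])))

def hybrid_cluster(points, eps1, eps2, min_pts, area_ratio=1.5, n_steps=5):
    """Pass 1 at eps1, then re-split clusters whose normalized area is large."""
    labels = dbscan(points, eps1, min_pts)
    ids = sorted(set(labels) - {-1})
    if not ids:
        return labels
    members = {c: [i for i, l in enumerate(labels) if l == c] for c in ids}
    areas = {c: hull_area([points[i] for i in members[c]]) for c in ids}
    mean_area = sum(areas.values()) / len(areas)
    next_id = max(ids) + 1
    for c in ids:
        if mean_area == 0 or areas[c] / mean_area <= area_ratio:
            continue                         # cluster is not oversized; keep it
        sub_pts = [points[i] for i in members[c]]
        best_count, best_sub = 0, None
        for k in range(n_steps + 1):         # scan eps from eps2 up to eps1
            eps = eps2 + (eps1 - eps2) * k / n_steps
            sub = dbscan(sub_pts, eps, min_pts)
            count = len(set(sub) - {-1})
            if count > best_count:           # keep the eps giving most sub-clusters
                best_count, best_sub = count, sub
        if best_count < 2:
            continue                         # nothing to split
        for i, s in zip(members[c], best_sub):
            labels[i] = -1 if s == -1 else next_id + s
        next_id += best_count
    return labels

# Toy data: two 3x3 grids separated by a small gap, three isolated grids, one outlier.
def grid(x0, y0):
    return [(x0 + dx, y0 + dy) for dx in range(3) for dy in range(3)]

points = grid(0, 0) + grid(4, 0) + grid(20, 0) + grid(0, 20) + grid(20, 20) + [(50, 50)]
labels = hybrid_cluster(points, eps1=2.5, eps2=1.2, min_pts=4)
print(len(set(labels) - {-1}), "clusters;", labels.count(-1), "noise point(s)")
```

On this toy layout, a single DBSCAN pass at ε1 merges the two neighboring grids into one oversized cluster, and the second pass restricted to that cluster separates them again. For real SMLM localization tables, the quadratic neighbor search would need to be replaced with a spatial index.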
Authors
杨建宇
胡芬
邢福临
董浩
侯梦迪
李任植
潘雷霆
许京军
Yang Jianyu; Hu Fen; Xing Fulin; Dong Hao; Hou Mengdi; Imshik Lee; Pan Leiting; Xu Jingjun (Key Laboratory of Weak-Light Nonlinear Photonics, Ministry of Education, School of Physics, TEDA Institute of Applied Physics, Nankai University, Tianjin 300071, China; Frontiers Science Center for Cell Responses, State Key Laboratory of Medicinal Chemical Biology, College of Life Sciences, Nankai University, Tianjin 300071, China; Shenzhen Research Institute of Nankai University, Shenzhen 518083, Guangdong, China; Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan 030006, Shanxi, China)
Source
《中国激光》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2023, Issue 3, pp. 78-85 (8 pages)
Chinese Journal of Lasers
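Funding
Guangdong Major Project of Basic and Applied Basic Research (2020B0301030009)
National Key Research and Development Program of China (2022YFC3400600)
National Natural Science Foundation of China (11874231, 32227802, 12174208, 31870843)
China Postdoctoral Science Foundation (2020M680032)
Natural Science Foundation of Tianjin (20JCYBJC01010)
Fundamental Research Funds for the Central Universities (2122021337, 2122021405)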
Keywords
bio-optics
single-molecule localization microscopy (SMLM)
super-resolution image segmentation
membrane protein
density-based spatial clustering of applications with noise (DBSCAN)
hierarchical clustering algorithm