Funding: the National Natural Science Foundation of China (Grant Nos. 11571368, 11931019, 11775314, and 11871238) and the Fundamental Research Funds for the Central Universities, China (Grant No. 2662019QD031).
Abstract: One of the major challenges in single-cell data analysis is determining cellular developmental trajectories. Although substantial studies have been conducted in recent years, more effective methods are still needed to infer developmental processes accurately. This work devises a new method, named DTFLOW, for determining pseudotemporal trajectories with multiple branches. DTFLOW consists of two major steps: a new method called Bhattacharyya kernel feature decomposition (BKFD) to reduce the data dimensions, and a novel approach named Reverse Searching on k-nearest neighbor graph (RSKG) to identify the multi-branching processes of cellular differentiation. In BKFD, we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating the pseudotime of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined using four single-cell datasets. We compare the efficiency of DTFLOW with published state-of-the-art methods. Simulation results suggest that DTFLOW has superior accuracy and strong robustness in constructing pseudotime trajectories. The Python source code of DTFLOW can be freely accessed at https://github.com/statway/DTFLOW.
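To make the BKFD idea concrete, the following is a minimal sketch of the two ingredients the abstract names: a random-walk-with-restart stationary distribution per cell and a Bhattacharyya kernel between those distributions. It is not the DTFLOW implementation. The function names, the Gaussian edge weights, the neighbor count `k`, the restart parameter `p`, and the choice of `-log K` as the distance are all assumptions made for illustration, and the RSKG branching step is omitted. The repository linked above is the authoritative reference.

```python
import numpy as np

def knn_transition_matrix(X, k=5):
    """Row-stochastic transition matrix on a symmetrized kNN graph.
    Gaussian edge weights are an assumption of this sketch."""
    n = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(sq[i])
        nbrs = order[order != i][:k]          # k nearest cells, excluding i itself
        sigma2 = sq[i, nbrs[-1]] + 1e-12      # local scale: distance to k-th neighbor
        W[i, nbrs] = np.exp(-sq[i, nbrs] / sigma2)
    W = np.maximum(W, W.T)                    # symmetrize the graph
    return W / W.sum(axis=1, keepdims=True)

def rwr_stationary(P, p=0.9):
    """Row i solves s_i = (1 - p) e_i + p s_i P, i.e. S = (1 - p)(I - pP)^{-1}."""
    n = P.shape[0]
    I = np.eye(n)
    S = (1.0 - p) * np.linalg.solve(I - p * P, I)
    return np.clip(S, 0.0, None)              # guard against tiny negative round-off

def bhattacharyya_kernel(S):
    """K[i, j] = sum_k sqrt(S[i, k] * S[j, k])."""
    R = np.sqrt(S)
    return R @ R.T

def pseudotime_from_root(K, root=0):
    """Assumed distance: -log of the kernel value to the root, rescaled to [0, 1]."""
    d = -np.log(np.clip(K[root], 1e-300, 1.0))
    return (d - d.min()) / (d.max() - d.min())

# Toy data: 30 "cells" along a noisy line in two dimensions.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)
X = np.column_stack([t, 0.01 * rng.standard_normal(30)])

P = knn_transition_matrix(X, k=5)
S = rwr_stationary(P, p=0.9)
K = bhattacharyya_kernel(S)
pt = pseudotime_from_root(K, root=0)
```

Because each row of `S` is a probability distribution, every diagonal entry of `K` equals 1 and off-diagonal entries lie in [0, 1], so the assumed distance is zero for the root cell and grows as the two cells' random-walk distributions overlap less.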