Abstract
With advances in mass spectrometry and the development of bioinformatics and statistical algorithms, the Human Proteome Project (HPP), one of whose main aims is disease research, is advancing rapidly. Protein biomarkers are of great importance for early disease diagnosis and clinical treatment, and strategies and methods for discovering them have become an active research area. Feature selection and machine learning are effective in dealing with the high dimensionality and sparsity of proteomics data, and have therefore been increasingly applied to protein biomarker discovery. This review describes strategies for protein biomarker discovery, together with the principles, application examples, and scope of applicability of the feature selection and machine learning methods used within them. It also discusses the prospects and limitations of deep learning in this field, with the aim of providing a reference for related research.
Authors
Kaikun Xu (徐开琨)
Mingfei Han (韩明飞)
Chuanxi Huang (黄传玺)
Cheng Chang (常乘)
Yunping Zhu (朱云平)
Affiliations: Beijing Institute of Lifeomics, Beijing 102206, China; State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing 102206, China; College of Life Sciences, Hebei University, Baoding 071002, Hebei, China
Source
Chinese Journal of Biotechnology (《生物工程学报》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2019, Issue 9, pp. 1619-1632 (14 pages)
Funding
Supported by the National Natural Science Foundation of China (No. 21605159).
Keywords
mass spectrometry
proteomics
biomarkers
machine learning
feature selection
deep learning