Abstract
Predicting protein subcellular localization helps to characterize protein properties and functions and to understand the complex physiological processes and regulatory mechanisms in which proteins take part, and it greatly facilitates work such as new drug development. As proteomics research has deepened, the volume of protein sequence data has grown exponentially, and traditional experimental methods alone can no longer meet the needs of life science research. Researchers have therefore turned to machine learning for protein subcellular localization, and the study of machine learning algorithms has become increasingly important. This paper reviews current progress in protein subcellular localization prediction from three aspects: characterization of protein sequence features, prediction algorithms, and algorithm evaluation. Finally, it summarizes the achievements of protein subcellular localization prediction methods and three areas that need continued improvement (feature selection, data processing, and algorithm improvement), and it proposes the future research focus and significance of machine learning for improving prediction performance.
Authors
李佳楠
李卓
滕小华
高兴泉
唐友
LI Jia-nan; LI Zhuo; TENG Xiao-hua (School of Information and Control Engineering, Jilin Institute of Chemical Technology, Jilin, Jilin 132000; Electrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, Jilin 132101)
Source
《安徽农业科学》
CAS
2022, Issue 16, pp. 198-204 (7 pages)
Journal of Anhui Agricultural Sciences
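Funding
Jilin Province Smart Agriculture Engineering Research Center project
Jilin Province distinctive high-level discipline program, emerging interdisciplinary field “Digital Agriculture”.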
Keywords
Protein
Subcellular localization
Machine learning
Prediction algorithm