Abstract
We propose a fast and accurate method, based on the support vector machine (SVM) algorithm, for distinguishing classically secreted proteins (CSPs) from non-classically secreted proteins (NCSPs) of cancer cells. Strict feature selection yields an optimal feature set consisting of amino acid composition (AAC), the position-specific scoring matrix (PSSM), and the signal peptide (SP). Results on the test set show that the method has a strong ability to distinguish NCSPs from CSPs of cancer cells, which may provide a theoretical reference for finding biomarkers common to different kinds of cancer.
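As an illustration of the simplest of the three feature types named in the abstract, AAC is conventionally defined as the fraction of each of the 20 standard amino acids in a protein sequence. The following is a minimal sketch under that conventional definition; the record does not give the authors' exact encoding, and the function name `aac` and the toy sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence):
    """Return a 20-dimensional amino acid composition vector.

    Assumption: non-standard residues (e.g. X, B, Z) are ignored,
    and frequencies are normalized over the standard residues only.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


# Hypothetical toy sequence, used only to show the call.
features = aac("MKTAYIAKQR")
```

A vector of this kind could be concatenated with PSSM-derived and signal-peptide features before being passed to an SVM classifier; how the authors combine them is not stated in this record.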
Authors
余乐正
柳凤娟
李东海
郭延芝
李益洲
YU Le-Zheng; LIU Feng-Juan; LI Dong-Hai; GUO Yan-Zhi; LI Yi-Zhou (School of Chemistry and Materials Science, Guizhou Education University, Guiyang 550018, China; College of Chemistry, Sichuan University, Chengdu 610065, China)
Source
《四川大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2020, Issue 1, pp. 152-156 (5 pages)
Journal of Sichuan University (Natural Science Edition)
Funding
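Reward and Subsidy Funds of the Ministry of Science and Technology of China and the National Natural Science Foundation of China (黔科合平台人才[2017]5790-07)
Growth Project for Young Scientific and Technological Talents in Regular Undergraduate Universities of Guizhou Province (黔教合KY字[2016]219)
General Program of the Guizhou Provincial Science and Technology Foundation (黔科合J字[2014]2134号)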
Keywords
Support vector machine
Cancer
Non-classically secreted protein
Position-specific scoring matrix
Signal peptide