
CL-RBF: An Improved ML-RBF Method for Multi-Site Prediction of Protein Subcellular Localization
Abstract The CL-RBF algorithm is proposed for predicting multi-site protein subcellular localization. It accounts for the effect of both within-label and between-label clustering results on RBF Neural Networks for Multi-Label Learning (ML-RBF). The silhouette coefficient is introduced to optimize the number of hidden-layer centers in ML-RBF. Whereas the earlier approach optimized clustering only within a single label, CL-RBF also analyzes the relationship between the clustering results of different labels: the cluster centers of label pairs that fall below a certain threshold are re-clustered, taking into account a larger distance between the centroids generated from the two labels when few samples carry both. The parameters are then tuned by gradient descent. Finally, the prediction for each label L is adjusted according to the difference between the Euclidean distance from the test sample to the hidden-layer centers of label L and its distance to the cluster centers generated from samples that do not belong to label L. Feature vectors are extracted by combining the Bag of Words model with amino acid composition (AAC). Under 10-fold cross-validation, CL-RBF was compared with four other multi-label learning algorithms previously used for bacterial protein subcellular localization prediction; across the evaluation metrics used, it achieved the best overall performance on four multi-label datasets. The prediction server is available at https://njau.applinzi.com/homepage_final.jsp.
Authors XUE Wei; HONG Xiaoyu; HU Xuejiao; CHEN Xingjian; ZHANG Liang (School of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; National Engineering Laboratory for Cereal Fermentation Technology, Jiangnan University, Wuxi 214122, China)
Source Journal of Food Science and Biotechnology (indexed in CAS, CSCD, and the Peking University Core Journals list), 2020, No. 2, pp. 66-73 (8 pages)
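Funding National Key R&D Program of China (2017YFD0800204); National Science and Technology Support Program of the 12th Five-Year Plan (2015BAK36B05); Natural Science Foundation of Jiangsu Province (BK2012363); Fundamental Research Funds for the Central Universities (Y0201600175).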
Keywords ML-RBF; protein subcellular localization; silhouette coefficient; Bag of Words
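
The abstract's step of choosing the number of hidden-layer centers with the silhouette coefficient can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the centers for one label come from k-means clustering of that label's feature vectors, and that the candidate count with the highest mean silhouette coefficient is kept. The function names (`kmeans`, `silhouette`, `choose_num_centers`) and the toy 2-D data are hypothetical and serve only as illustration.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-point seeding (an assumption).
    Returns one cluster index per point."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(euclid(p, c) for c in centers)))
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: euclid(p, centers[c])) for p in points]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign

def silhouette(points, assign):
    """Mean silhouette coefficient, s = (b - a) / max(a, b), over all points."""
    clusters = {}
    for idx, a in enumerate(assign):
        clusters.setdefault(a, []).append(idx)
    if len(clusters) < 2:
        return 0.0  # not defined for a single cluster; treated as 0 here
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[assign[i]]
        if len(own) == 1:
            continue  # singleton cluster contributes s = 0 by convention
        a = sum(euclid(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(euclid(p, points[j]) for j in idxs) / len(idxs)
                for c, idxs in clusters.items() if c != assign[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_num_centers(points, k_values):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores

# Toy feature vectors for one label: three well-separated groups of four points.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
blobs = [(x + dx, y + dy) for dx, dy in [(0, 0), (10, 0), (0, 10)] for x, y in square]
best_k, scores = choose_num_centers(blobs, [2, 3, 4])
print(best_k, scores)
```

In practice the points would be the Bag of Words and AAC feature vectors of the training proteins carrying a given label, and the chosen count would set how many RBF hidden units that label contributes.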