Abstract
Essential proteins play an important role in maintaining the physiological activities of organisms, and predicting them helps in designing drug targets. With the development of high-throughput technologies, identifying essential proteins computationally from protein-protein interaction (PPI) data has become an active research topic. Studies have shown that combining PPI networks with other biological information identifies essential proteins more effectively. We therefore propose TGSD, a new method for identifying essential proteins that integrates PPI data with gene ontology annotations, protein subcellular localization information, and protein domain information. To evaluate the algorithm, simulation experiments were run on four commonly used yeast benchmark datasets, and TGSD was compared in detail with seven classic methods. The numerical results show that TGSD clearly outperforms the other algorithms in the number of correctly predicted essential proteins, accuracy, and other statistical measures.
Authors
XUE Xiaoli (薛晓丽)
LIU Junhong (刘俊宏)
ZHANG Wei (张伟)
School of Science, East China Jiaotong University, Nanchang 330013, China
Source
《湖北大学学报(自然科学版)》
CAS
2023, No. 1, pp. 139-148 (10 pages)
Journal of Hubei University: Natural Science
Funding
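Supported by the National Natural Science Foundation of China (12161039, 61802125) and the Natural Science Foundation of Jiangxi Province (20212ACB211002, 20181BAB202006).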
Keywords
essential protein
protein-protein interaction network
multi-source biological data
computational method