Abstract
To address the heavy noise in protein-protein interaction (PPI) networks and the low accuracy of existing methods for identifying essential proteins, this paper proposes UCM (united centrality and modularity), a method that identifies essential proteins based on centrality and modularity. First, topological and biological data are integrated to construct a multi-attribute network, reducing the impact of noise (false positives and false negatives) in the original PPI network. Second, guided by the topological and biological properties of essential proteins, an algorithm is proposed that mines dense, highly co-expressed essential modules from the multi-attribute network, yielding highly reliable modules that reinforce, from multiple dimensions, the importance of essential proteins within them. Finally, the centrality and modularity of each protein are combined in an essential integration strategy (EIS) that measures protein essentiality and improves identification accuracy. UCM was validated on the DIP dataset. The experimental results show that, compared with ten other essential-protein identification methods, UCM performs better and identifies more essential proteins.
Authors
Mao Yimin (毛伊敏)
Zhang Yumeng (章宇盟)
Hu Jian (胡健)
School of Information Engineering, Jiangxi University of Science & Technology, Ganzhou, Jiangxi 341000, China; Dept. of Information Engineering, College of Applied Science, Jiangxi University of Science & Technology, Ganzhou, Jiangxi 341000, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journals (北大核心)
2020, No. 7, pp. 1983-1988 (6 pages)
Funding
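National Natural Science Foundation of China (41562019, 41530640)
Natural Science Foundation of Jiangxi Province (GJJ161566, 20161BAB203093)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ181504, GJJ151528)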
Keywords
protein interaction network
multi-attribute
essential modules
centrality
essential proteins