Abstract
To address the insufficient structural feature constraints in the graph neural network (GNN) ProteinSolver, this paper adds structural constraints, including backbone dihedral angles and the relative positional encoding and relative orientation of paired amino acids, and proposes a GNN-based method for fixed-backbone protein design. A GNN architecture based on the Transformer multi-head attention mechanism is implemented, and physical coordinates are incorporated into the message-passing and update steps to improve equivariance with respect to atomic coordinates. Training and testing on the CATH dataset show that the proposed model achieves an average sequence perplexity of 8.12, which is 0.85 lower than ProteinSolver's 8.97. At a masking rate of 50%, ProteinSolver's sequence recovery rate is 28.7%; adding the extra structural constraints raises it to 30.3%; replacing ProteinSolver's GNN with the Transformer-based GNN raises it to 34.3%; and introducing equivariance raises it further to 35.0%.
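The abstract reports two metrics, average sequence perplexity and sequence recovery rate at a given masking rate, without stating how they are computed. The sketch below uses the conventional definitions as an assumption: perplexity as the exponential of the mean negative log-probability assigned to the native residues, and recovery rate as the fraction of masked positions where the predicted residue matches the native one. The function names and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def perplexity(native_residue_probs):
    """Exponential of the mean negative log-probability of the native residues.

    native_residue_probs: the probability the model assigned to the true
    amino acid at each scored position (assumed definition, not the paper's).
    """
    mean_nll = -sum(math.log(p) for p in native_residue_probs) / len(native_residue_probs)
    return math.exp(mean_nll)


def recovery_rate(native, predicted, masked_positions):
    """Fraction of masked positions where the prediction equals the native residue."""
    hits = sum(native[i] == predicted[i] for i in masked_positions)
    return hits / len(masked_positions)


# A model that guesses uniformly over the 20 standard amino acids
# assigns probability 1/20 everywhere.
uniform_ppl = perplexity([1 / 20] * 8)

# Toy sequences with positions 2..5 masked; two of the four are recovered.
toy_recovery = recovery_rate("ACDEFG", "ACDQFW", [2, 3, 4, 5])
```

Under these definitions a uniform guess over 20 amino acids gives a perplexity of 20, which puts the reported values of 8.12 and 8.97 in context: lower perplexity means the model concentrates more probability on the native residue.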
Authors
Liu Yan (刘炎)
Yuan Ye (袁野)
Shen Hongbin (沈红斌)
Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
Source
Journal of Nanjing University of Science and Technology (《南京理工大学学报》)
CAS
CSCD
Peking University Core Journals (北大核心)
2023, No. 3, pp. 311-317, 329 (8 pages)
Keywords
graph neural network
fixed-backbone protein
protein design
structural feature constraints
backbone dihedral angle
paired amino acids
relative positional encoding
relative orientation