Research on the Protein Encoding Problem Based on Machine Learning

Application and Research of Machine Learning Based Protein Sequence Encoding

Abstract: Amino acid sequence encoding has long been the main reason that protein structure prediction algorithms face a large input space. Only a better encoding of the amino acid sequence can lay the groundwork for subsequent computational analysis. This paper proposes and implements a new hybrid encoding method that takes into account both the partitioning of the amino acid sequence and long-range interaction effects. It uses orthogonal encoding to distinguish individual amino acids, a basic orthogonal matrix to capture the physical, chemical, and biological similarity among amino acids, and class-membership probabilities to capture the tendency of the amino acids in the current protein sequence to form different secondary structures. The method thereby improves the encoding of amino acid residue sequences. Existing algorithms are then used to compare the effect of different encoding schemes on protein secondary structure prediction, and the results confirm that the proposed encoding improves prediction accuracy.
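The abstract names three ingredients of the hybrid code (an orthogonal code per residue, a similarity component, and class-membership probabilities) but gives no formulas. The following is a minimal Python sketch of how such features could be concatenated over a sliding window, and it is not the authors' implementation. Several parts are assumptions made for illustration: the Kyte-Doolittle hydrophobicity scale stands in for the paper's similarity matrix, class-membership probabilities are estimated by simple frequency counts over labeled sequences, and the names `one_hot`, `class_probabilities`, and `encode_window` are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HEC"  # helix, strand, coil

# Assumption: Kyte-Doolittle hydrophobicity is used here only as a stand-in
# for the physico-chemical similarity component described in the abstract.
HYDROPHOBICITY = dict(zip(AMINO_ACIDS, [
    1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8,
    1.9, -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3,
]))

FEATURES_PER_RESIDUE = len(AMINO_ACIDS) + 1 + len(STATES)  # 20 + 1 + 3


def one_hot(residue):
    """Orthogonal code: a 20-dimensional unit vector identifying the residue."""
    return [1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


def class_probabilities(labeled):
    """Estimate P(state | residue) by counting over (sequence, structure) pairs."""
    counts = {aa: Counter() for aa in AMINO_ACIDS}
    for seq, structure in labeled:
        for aa, state in zip(seq, structure):
            counts[aa][state] += 1
    probs = {}
    for aa in AMINO_ACIDS:
        total = sum(counts[aa].values())
        # Residues never seen in the labeled data get a uniform distribution.
        probs[aa] = [counts[aa][s] / total if total else 1.0 / len(STATES)
                     for s in STATES]
    return probs


def encode_window(seq, center, half_width, probs):
    """Concatenate per-residue features over a window centered on one position."""
    features = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(seq):
            aa = seq[i]
            features += one_hot(aa)                   # identity
            features.append(HYDROPHOBICITY[aa] / 4.5)  # scaled property value
            features += probs[aa]                      # structural tendency
        else:
            # Positions beyond either end of the sequence are zero-padded.
            features += [0.0] * FEATURES_PER_RESIDUE
    return features
```

Calling `encode_window(seq, i, w, probs)` for each position `i` yields a fixed-length vector of `(2 * w + 1) * 24` values that could be passed to any standard classifier. The window is one simple way to give the predictor some neighboring context; how the paper itself handles sequence partitioning and long-range effects is not specified in the abstract.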

Authors: LI Guan-yu (李冠宇), ZHU Hong-ming (朱宏明), ZHOU Wen-jun (周闻钧) (School of Software Engineering, Tongji University, Shanghai 200092, China)

Affiliation: School of Software Engineering, Tongji University

Source: Computer Knowledge and Technology (《电脑知识与技术》), 2008, Issue 12, pp. 1713-1716 (4 pages)

Keywords: protein structure prediction; encoding scheme; machine learning

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
