Machine learning for synthetic biology: Methods and applications
Cited by: 9
Abstract: The goal of machine learning is to design algorithms that continually improve their performance based on prior knowledge and observed data. Such algorithms help machines extract knowledge from large amounts of data and thereby improve their performance on specific tasks. As a data-driven approach, machine learning can make effective use of the large volumes of biological data produced by high-throughput experimental techniques to predict the function of synthetic organisms and to design them intelligently, changing the research paradigm of synthetic biology. This article first introduces several models and methods that are widely applied in synthetic biology, such as support vector machines, neural networks, generative adversarial networks, and deep reinforcement learning. It then introduces typical applications of machine learning in synthetic biology, such as promoter prediction, enzyme catalysis design, metabolic pathway construction, and gene circuit design. The article reviews machine learning methods and applications for synthetic biology and aims to help readers choose and design machine learning methods for synthetic biology research.

Traditional synthetic biology takes a trial-and-error approach, suffering from inefficiency and local optima. Recent advances in high-throughput experimental techniques generate a huge amount of biological data, which enables the use of machine learning to close the “design-build-test-learn” loop. Machine learning, especially deep learning, is a data-driven modeling method that extracts useful patterns from big data and then leverages the learned knowledge to tackle specific tasks. In this review, we aim to provide a brief primer on machine learning for synthetic biologists. Starting with a common taxonomy, we introduce representative methods, pipelines, and underlying principles of machine learning that can be applied in synthetic biology. We include typical methods such as support vector machines, deep neural networks, generative adversarial nets (GANs), transfer learning, and reinforcement learning. In particular, discriminative models, including convolutional neural networks and support vector machines, are appropriate for predicting sequence-function relationships. Generative models, including GANs and deep generative models for graph generation, are suitable for sequence or network design. Next, we review recent applications of machine learning in studying synthetic biology parts and modules, including promoters, bioactive peptides, enzymes, metabolic pathways, and genetic circuits. For example, DeePromoter combined a convolutional neural network with a long short-term memory network to achieve an accuracy as high as 90% when predicting promoter sequences. For enzyme design, a Gaussian process model was combined with Bayesian optimization using the upper confidence bound method, which led to the engineering of thermostable P450 enzymes. For antimicrobial peptides, a GAN enhanced with a feedback mechanism was trained to design peptide sequences with new functions. Finally, we conclude with future challenges and directions. In particular, interpretable machine learning models are desirable to guide mechanistic investigation. Moreover, it is necessary to develop new machine learning methods that are more compatible with biological data, which are heterogeneous, multi-modal (such as sequence, network, image, and structure), and lack proper labels. With the increasing availability of big biological data and the development of machine learning methods tailored for synthetic biology, we envision a paradigm shift towards a closed cycle of “design-build-test-learn” in creating artificial life with predictable functions.
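The abstract mentions Gaussian process models combined with Bayesian optimization by the upper confidence bound (UCB) method. The following is a minimal sketch of that general idea on a toy one-dimensional problem, not the implementation used in the reviewed study. The function names, the RBF kernel, the `beta` and `length_scale` values, and the `toy_fitness` landscape (a stand-in for a measured property such as thermostability) are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k_tt, k_tq)        # (K + noise*I)^{-1} k_*
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)   # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))


def toy_fitness(x):
    """Hypothetical fitness landscape standing in for a wet-lab measurement."""
    return np.exp(-((x - 0.7) ** 2) / 0.02)


def run_gp_ucb(n_rounds=15, beta=2.0, seed=0):
    """Pick one untested candidate per round by maximizing mean + beta * std."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 101)
    tested = [int(i) for i in rng.choice(len(candidates), size=3, replace=False)]
    for _ in range(n_rounds):
        x_train = candidates[tested]
        y_train = toy_fitness(x_train)          # "measure" the tested designs
        mean, std = gp_posterior(x_train, y_train, candidates)
        scores = mean + beta * std              # UCB acquisition function
        scores[tested] = -np.inf                # never re-test a candidate
        tested.append(int(np.argmax(scores)))
    best = max(tested, key=lambda i: toy_fitness(candidates[i]))
    return float(candidates[best]), tested


if __name__ == "__main__":
    best_x, tested = run_gp_ucb()
    print("best candidate:", best_x, "after", len(tested), "measurements")
```

A larger `beta` weights the uncertainty term more heavily and favors exploring untested regions, while a smaller `beta` favors candidates near the best measurements so far. In a real design campaign the candidates would be sequence variants with a suitable kernel rather than points on a line.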
Authors: Ruyun Hu (胡如云); Songya Zhang (张嵩亚); Hailin Meng (蒙海林); Han Yu (余函); Jianzhi Zhang (张建志); Xiaozhou Luo (罗小舟); Tong Si (司同); Chenli Liu (刘陈立); Yu Qiao (乔宇)
Affiliations: Institute of Advanced Computing and Digital Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen Institute of Synthetic Biology, Shenzhen 518055, China; CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen 518055, China; Center for Biological Engineering, Guangzhou Institute of Advanced Technology, Chinese Academy of Sciences, Guangzhou 511458, China
Source: Chinese Science Bulletin (《科学通报》), 2021, Issue 3, pp. 284-299 (16 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the Shenzhen Science and Technology Innovation Committee (KQTD2015033117210153).
Keywords: machine learning; synthetic biology; synthetic biology parts design; bio-networks design