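Abstract: More than 6,000 human diseases are caused by non-synonymous single nucleotide variations (nsSNVs). Predicting the pathogenicity of nsSNVs quickly and accurately helps in understanding disease mechanisms and designing new drugs, and is an important research topic in bioinformatics. This paper presents the significance of and background for nsSNV pathogenicity research. It summarizes the mainstream methods in domestic and international research, including methods based on mutation frequency, methods based on pathways, methods combining genomic and transcriptomic information, methods based on evolutionary sequence conservation, methods based on hybrid sequence and structure features, and comprehensive evaluation methods, and describes representative methods of each kind. It then presents the databases, feature representations, and performance evaluation metrics commonly used in nsSNV pathogenicity research, and compares 12 nsSNV pathogenicity prediction methods from multiple perspectives. Finally, it outlines several research directions in which breakthroughs in nsSNV pathogenicity prediction may be achieved.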
Funding: Supported by the National Talent Fund of the Ministry of Science and Technology of China (20230240011) and the China University of Geosciences (Wuhan) Research Fund (162301192687).
Abstract: A large language model (LLM) is constructed to address the demanding requirements of reservoir performance analysis (RPA): data retrieval and analysis, detailed well profiling, computation of key technical indicators, and solutions to complex problems. The LLM is built for RPA scenarios through incremental pre-training, fine-tuning, and the coupling of functional subsystems. Functional subsystems and efficient coupling methods are proposed based on named entity recognition (NER), tool invocation, and Text-to-SQL construction, all aimed at resolving pivotal challenges in developing LLM applications specific to RPA. Detailed accuracy tests were conducted on the feature extraction, tool classification, data retrieval, and analysis recommendation models. The results indicate that these models perform well in the key aspects of RPA. Several injection and production well groups in the PK3 Block of the Daqing Oilfield were used as a test case. The test results show that the model has significant potential and practical value in assisting reservoir engineers with RPA, and the research results support the application of LLMs in reservoir performance analysis.