Abstract: Medulloblastoma is the most common malignant pediatric brain tumor. In mice, Ptc1 haploinsufficiency and disruption of DNA repair (DNA ligase IV inactivation) or cell cycle regulation (Kip1, Ink4d, or Ink4c inactivation), in conjunction with p53 dysfunction, predispose to medulloblastoma. To identify genes important for this tumor, we evaluated gene expression profiles in medulloblastomas from these mice. Unexpectedly, medulloblastoma
Funding: Under the auspices of the National Key R&D Program of China (No. 2016YFC0500404), the National Natural Science Foundation of China (Nos. 41671087, 41671081, 41771103), and the Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 2018265).
Abstract: Natural wetlands are known to store huge amounts of organic carbon in their soils. Despite the importance of this storage, uncertainties remain about the molecular characteristics of soil organic matter (SOM), a key factor governing the stability of soil organic carbon (SOC). In this study, the molecular fingerprints of SOM in a typical freshwater wetland in Northeast China were investigated using pyrolysis gas chromatography/mass spectrometry (Py-GC/MS). Results indicated that the SOC, total nitrogen (TN), and total sulfur contents of the cores varied between 16.88% and 45.83%, 0.93% and 2.82%, and 1.09% and 3.79%, respectively. The bulk δ^13C varied between -26.85‰ and -17.00‰ (a range of 9.85‰), and the bulk δ^15N varied between -0.126‰ and 1.002‰. A total of 134 different pyrolytic products were identified and grouped into alkyl compounds (including n-alkanes (C:0) and n-alkenes (C:1)), aliphatics (Al), aromatics (Ar), lignin (Lg), nitrogen-containing compounds (Nc), polycyclic aromatic hydrocarbons (PAHs), phenols (Phs), polysaccharides (Ps), and sulfur-containing compounds (Sc). On average, Phs moieties accounted for roughly 24.11% of the peak area of the total pyrolysis products, followed by Lg (19.27%), alkyl (18.96%), other aliphatics (12.39%), Nc compounds (8.08%), Ps (6.49%), aromatics (6.32%), Sc (3.26%), and PAHs (1.12%). Soil organic matter from wetlands had more Phs and Lg and fewer Nc moieties in its pyrolytic products than soil organic matter from forests, lake sediments, pastures, and farmland. The δ^13C distribution patterns implied that the soil organic matter was mostly derived from C3 plants, but that the vegetation was undergoing succession from C3 to C4 plants. Significant negative correlations between Lg or Ps proportions and C3 plant proportions were observed. Multiple linear analyses implied that the Ar and Al components had negative effects on SOC. Alkyl and Ar tended to raise the ratio of SOC to total nitrogen (C/N), while Al had the opposite effect. Al was positively related to the ratio of dissolved organic carbon (DOC) to SOC. In summary, the SOM of wetlands might be characterized by more Phs and lignin and fewer Nc moieties in its pyrolytic products. The use of Py-GC/MS provided detailed information on the molecular characteristics of SOM from a typical freshwater wetland.
Funding: Supported by the National Natural Science Foundation of China (No. 41703101) and the Beijing Outstanding Young Scientist Program (No. BJJWZYJH01201910004016).
Abstract: Carbonate radical is among the most important environmentally relevant reactive species governing the transformation and fate of pharmaceutical contaminants (PCs). However, reaction rate constants between carbonate radical and most PCs have not been experimentally determined, and quantitative structure-activity relationships (QSARs) have not been established for rate estimation. This study applied the MaxMin data processing method and used molecular fingerprints (MF) as the input of a deep neural network (DNN) to predict the rate constants between carbonate radical and organic compounds. MF parameters and the hyper-structure of the DNN were adjusted to yield satisfactory accuracy of rate prediction. A vector length of 512 bits with a radius of 1 for the MF, combined with 5 hidden layers, gave the best performance. The optimized MaxMin-MF-DNN model was compared with some of the most commonly used QSAR and machine learning methods, including random data splitting, molecular descriptors, support vector machine, decision tree, etc. Results showed that the MF-DNN model outperformed the other methods, with an increase of more than 10% in prediction accuracy. Applying this MF-DNN model, we estimated reaction rates between carbonate radical and pharmaceuticals used in human medicine (1576) and veterinary practice (390). Among them, 46 drugs were identified as fast-reacting compounds, suggesting that carbonate radical plays an important role in their environmental fate.
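The abstract above contrasts MaxMin processing with random data splitting but does not spell out the procedure. As a minimal sketch, assuming "MaxMin" refers to the usual greedy MaxMin diversity selection over fingerprint distances, the idea can be illustrated as follows. The toy fingerprints, the function names, and the choice of Tanimoto distance are illustrative assumptions, not details taken from the study.

```python
def tanimoto_distance(a, b):
    """1 - Tanimoto similarity for fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0
    return 1.0 - len(a & b) / union


def maxmin_pick(fingerprints, n_pick, first=0):
    """Greedy MaxMin selection: repeatedly add the compound whose nearest
    already-picked neighbour is farthest away."""
    picked = [first]
    # min_dist[i] = distance from compound i to its closest picked compound
    min_dist = [tanimoto_distance(fp, fingerprints[first]) for fp in fingerprints]
    while len(picked) < min(n_pick, len(fingerprints)):
        candidates = [i for i in range(len(fingerprints)) if i not in picked]
        nxt = max(candidates, key=lambda i: min_dist[i])
        picked.append(nxt)
        for i, fp in enumerate(fingerprints):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fingerprints[nxt]))
    return picked


# Hypothetical toy fingerprints: on-bit positions in a 512-bit vector.
toy_fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {100, 200, 300},
    {100, 200, 301},
    {400, 401},
]
train_idx = maxmin_pick(toy_fps, n_pick=3)
test_idx = [i for i in range(len(toy_fps)) if i not in train_idx]
print(train_idx, test_idx)
```

Picking a structurally diverse training set in this way is one plausible reason such a split could behave differently from a random split; the study's actual splitting ratio and distance measure are not given in the abstract.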
Abstract: The drug development process takes a long time because it requires sorting through a large collection of candidate compounds, most of them inactive, and choosing only the most pertinent compounds that can bind to a disease protein. The use of virtual screening in pharmaceutical research is growing in popularity, and it is crucial during the early phases of drug research and development. Chemical compound searches are now more narrowly targeted. Because the databases contain more and more ligands, this method needs to be quick and exact. Neural network fingerprints have proved more effective than the well-known Extended Connectivity Fingerprint (ECFP). Although the conventional graph network generates a better-encoded fingerprint, it takes only the largest sub-graph into consideration when learning the representation. When an average or maximum pooling layer is used, the fingerprint also contains unrelated data. This article proposes the Graph Convolutional Attention Network (GCAN), a graph neural network with an attention mechanism, to address these problems. The attention mechanism also gives greater weight to the nodes or sub-graphs that are used to create the molecular fingerprint. The generated fingerprint is used to classify drugs using ensemble learning. Ensemble stacking is applied with Support Vector Machines (SVM), Random Forest, Naive Bayes, Decision Trees, AdaBoost, and Gradient Boosting as base classifiers. Compared with existing models, the proposed GCAN fingerprint with an ensemble model achieves relatively high accuracy, sensitivity, specificity, and area under the curve. The ensemble learning with the generated molecular fingerprint yields 91% accuracy, outperforming earlier approaches.
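The contrast the abstract draws between average pooling and an attention mechanism can be sketched in a few lines. This is only an illustration of attention-weighted readout in general, not the GCAN architecture: the node features, the fixed `attention_vector` (which would be learned in a real model), and the function names are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(node_features):
    """Average pooling: every node contributes equally to the fingerprint."""
    n = len(node_features)
    dim = len(node_features[0])
    return [sum(node[d] for node in node_features) / n for d in range(dim)]


def attention_pool(node_features, attention_vector):
    """Attention readout: nodes are weighted by softmax(score) before summing."""
    scores = [sum(w * x for w, x in zip(attention_vector, node))
              for node in node_features]
    weights = softmax(scores)
    dim = len(node_features[0])
    pooled = [sum(wt * node[d] for wt, node in zip(weights, node_features))
              for d in range(dim)]
    return pooled, weights


# Hypothetical node embeddings for a three-atom graph.
nodes = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
attention_vector = [2.0, 0.0]  # assumed fixed here; learned in practice

pooled, weights = attention_pool(nodes, attention_vector)
print(mean_pool(nodes), pooled, weights)
```

With these toy values the first node receives most of the weight, so its feature dominates the pooled vector, whereas average pooling dilutes it equally among all three nodes.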
Funding: Financial support was provided by the National Natural Science Foundation of China (Nos. 21978058 and 21676094), the Pearl River Talent Recruitment Program, China (No. 2019QN01L255), the Natural Science Foundation of Guangdong Province, China (No. 2020A1515010800), and the Guangzhou Municipal Science and Technology Project, China (No. 202102020875).
Abstract: Because of the toxicity of volatile organic compounds, capturing trace amounts of non-methane hydrocarbons (NMHCs) from air is a significant challenge. A total of 31399 hydrophobic metal-organic frameworks (MOFs) were first screened from 137953 hypothetical MOFs using high-throughput computational screening (HTCS), and their performance indices (adsorption capacity and selectivity) for the adsorption of NMHCs (C3-C6) were obtained by molecular simulations. The discovery of a "second peak" near twice the kinetic diameter of the corresponding NMHC provided more choices for excellent MOFs that adsorb NMHCs. Four machine learning (ML) classification and regression algorithms were used to predict the performance of MOFs, and the relative importance values of the six descriptors were determined. The combination of the Random Forests algorithm and the Molecular ACCess System (MACCS) molecular fingerprint (MF) had an excellent predictive ability for MOFs. The fingerprint commonalities of the 100 top-performing MOFs were counted, and the excellent bits (EBs) that could promote the performance were defined. Finally, new substructures containing all of the EBs were designed for each NMHC to build a new MOF database. This work combined HTCS, ML, and MF to provide detailed insight into the design of efficient MOFs for adsorbing NMHCs.
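The abstract does not give the exact criterion used to define an excellent bit. As a rough sketch, assuming an EB is simply a fingerprint bit that is set in a large share of the top-performing structures, the counting step could look like the following. The record layout, the `min_fraction` threshold, and the toy values are hypothetical.

```python
from collections import Counter


def frequent_bits_in_top(mofs, top_n, min_fraction):
    """Return the fingerprint bits set in at least `min_fraction` of the
    `top_n` best-performing structures."""
    ranked = sorted(mofs, key=lambda m: m["performance"], reverse=True)
    top = ranked[:top_n]
    counts = Counter(bit for m in top for bit in m["bits"])
    return sorted(bit for bit, c in counts.items() if c / len(top) >= min_fraction)


# Hypothetical records: on-bit indices of a fingerprint plus a performance score.
toy_mofs = [
    {"name": "A", "bits": {11, 42, 160}, "performance": 9.1},
    {"name": "B", "bits": {42, 160}, "performance": 8.7},
    {"name": "C", "bits": {42, 99}, "performance": 8.2},
    {"name": "D", "bits": {7, 99}, "performance": 1.3},
    {"name": "E", "bits": {7}, "performance": 0.4},
]

common_bits = frequent_bits_in_top(toy_mofs, top_n=3, min_fraction=0.6)
print(common_bits)
```

In the study the top set contains 100 MOFs per NMHC; a stricter definition might also require that a bit be rare among poor performers, which this sketch does not check.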
Abstract: The main cultivated varieties in the world belong to the species of upland cotton (Gossypium hirsutum L.), and their genetic background is very narrow. However, the wild species and races in
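Abstract: [Objective] To analyze the genetic diversity among rose and its related species and to construct fingerprints, laying a foundation for the identification, development, and utilization of rose germplasm resources. [Methods] One variety each was selected from the traditional rose varieties, the hybrid-bred varieties, and the varieties introduced from China and abroad, and fresh young leaves were taken for transcriptome sequencing. Based on the resulting rose transcriptome data, MISA was used to search for SSR loci in the genomic data covered by reads, and primers were designed with Primer 3.0 according to the conserved sequences flanking the SSR loci. DNA from 10 rose varieties was used as test material to screen the designed and synthesized primers. With DNA from 48 accessions of rose and its related species as test material, TP-M13-SSR PCR was performed with the screened primers that gave good peaks, and the amplification products were detected by capillary electrophoresis. GeneMarker 2.2.0 (SoftGenetics, USA) was used to read the capillary electrophoresis data, which were then organized in Excel. POPGEN VERSION 1.32 was used to calculate the observed heterozygosity, expected heterozygosity, Nei's genetic diversity index, observed number of alleles, effective number of alleles, and Shannon information index of the screened primers, and CERVUS version 3.0 was used to calculate the polymorphism information content. Powermarker was used to calculate the genetic distances among the accessions of rose and its related species; NTSYSpc 2.10e was used to calculate the genetic similarity coefficient between each pair of accessions and to draw a UPGMA clustering dendrogram. Finally, fingerprints of rose and its related species were constructed by combining primers with genotypes. [Results] Based on the transcriptome sequencing data of the rose samples, MISA detected a total of 48796 SSR loci distributed among 139712 unigenes. The most abundant repeat types were dinucleotide and trinucleotide repeats, with 20628 and 12828 loci, respectively. Using Primer 3.0 and the conserved sequences flanking the SSR loci, 144 primer pairs were initially designed and synthesized; screening with DNA from 10 rose varieties as templates yielded 28 primer pairs with good peaks. With DNA from the 48 accessions of rose and its related species as test material, all 28 primer pairs amplified DNA fragments with good peak shapes and high polymorphism. The mean observed heterozygosity, expected heterozygosity, Nei's genetic diversity index, observed number of alleles, effective number of alleles, Shannon information index, and polymorphism information content of the 28 primer pairs were 0.4101, 0.7505, 0.7011, 4.607, 3.5116, 1.3442, and 0.6526, respectively. The genetic distances between most of the tested samples were 0.6000 to 0.8000. Cluster analysis showed that, at a genetic similarity coefficient of 0.571, the 48 accessions of rose and its related species were divided into two groups. Four core primer pairs selected by the core primer method distinguished all 48 test materials, and their fingerprints were constructed. [Conclusion] A total of 28 SSR primer pairs with good polymorphism were developed and screened; they can be used for subsequent genetic diversity analysis, genetic map construction, and genetic stability identification in rose.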
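The per-locus diversity statistics named in the abstract above follow standard textbook definitions, which can be sketched from allele frequencies as shown below. This is an illustrative calculation only: the genotypes (fragment sizes in bp) are hypothetical, and the software packages cited in the abstract may apply small-sample corrections, so their output need not match these uncorrected formulas exactly.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Diversity statistics for one SSR locus.

    genotypes: list of (allele_a, allele_b) tuples for diploid individuals.
    """
    alleles = [a for g in genotypes for a in g]
    counts = Counter(alleles)
    total = len(alleles)
    freqs = [c / total for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)

    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # Expected heterozygosity (gene diversity), uncorrected: 1 - sum(p_i^2).
    he = 1.0 - sum_p2
    # Effective number of alleles: 1 / sum(p_i^2).
    ne = 1.0 / sum_p2
    # Shannon information index: -sum(p_i * ln p_i).
    shannon = -sum(p * math.log(p) for p in freqs)
    # Polymorphism information content:
    # 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    pic = 1.0 - sum_p2 - cross

    return {"Na": len(counts), "Ne": ne, "Ho": ho, "He": he,
            "I": shannon, "PIC": pic}


# Hypothetical genotypes at one locus (allele sizes in bp).
toy_genotypes = [(150, 150), (150, 154), (154, 158), (150, 158)]
stats = locus_diversity(toy_genotypes)
print(stats)
```

Averaging these values over all screened primer pairs gives per-study means of the kind reported in the results.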