Maize stalk rot reduces grain yield and quality. Information about the genetics of resistance to maize stalk rot could help breeders design effective breeding strategies for the trait. Genomic prediction may be a more effective breeding strategy for stalk-rot resistance than marker-assisted selection. We performed a genome-wide association study (GWAS) and genomic prediction of resistance in testcross hybrids of 677 inbred lines from the Tuxpeño and non-Tuxpeño heterotic pools grown in three environments and genotyped with 200,681 single-nucleotide polymorphisms (SNPs). Eighteen SNPs associated with stalk rot shared genomic regions with gene families previously associated with plant biotic and abiotic responses. More favorable SNP haplotypes traced to tropical than to temperate progenitors of the inbred lines. Incorporating genotype-by-environment (G×E) interaction increased genomic prediction accuracy.
Objective: To analyze the independent risk factors for the occurrence of moderate-to-severe metabolic-associated fatty liver disease (MAFLD), to construct a prediction model for moderate-to-severe MAFLD, and to verify the validity of the model. Methods: In the first part, 278 medical examinees who were diagnosed with MAFLD in the Medical Examination Center at the Second Affiliated Hospital of Hainan University from January to May 2022 were taken as the study subjects (training set), and they were divided into a mild MAFLD group (200) and a moderate-to-severe MAFLD group (78) based on ultrasound results. Demographic data and laboratory indexes were collected, and risk factors were screened by univariate and multifactor analysis. In the second part, a dichotomous logistic regression equation was used to construct a prediction model for moderate-to-severe MAFLD, and the model was visualized in a line graph. In the third part, the MAFLD population from our physical examination center from November to December 2022 (200 people, the external validation set) was collected to validate the moderate-to-severe MAFLD prediction model, and the risk factors in both groups were compared. The receiver operating characteristic (ROC) curves, calibration curves, and clinical applicability of the model were plotted to represent model discrimination for internal and external validation. Results: The risk factors of moderate-to-severe MAFLD were fasting glucose (FPG), blood uric acid (UA), triglycerides (TG), triglyceride glucose index (TyG), total cholesterol (CHOL), and high-density lipoprotein (HDL-C). UA [OR=1.021, 95% CI (1.015, 1.027), P<0.001] and FPG [OR=1.575, 95% CI (1.158, 2.143), P=0.004] were independent risk factors for people with moderate-to-severe MAFLD. The visualized line graph model showed that UA was the factor contributing more to the risk of moderate-to-severe MAFLD in this model. The ROC curves showed AUC values of 0.8701, 0.8686 and 0.7991 for the training set, internal validation set and external validation set, respectively. The calibration curves almost coincided with the reference line, with P>0.05 in the Hosmer-Lemeshow test. The decision curve analysis (DCA) curve of the model lay above the two extreme curves, indicating that patients with moderate-to-severe MAFLD would benefit from the prediction model. Conclusion: The prediction model constructed by combining FPG with UA has high accuracy and good clinical applicability, and can be used for clinical diagnosis.
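As a rough illustration of how the odds ratios reported in the abstract above can be read, the sketch below converts an odds ratio into the corresponding logistic-regression coefficient (beta = ln(OR)) and shows how the odds scale over several units of a predictor. Only the two OR values come from the abstract. The abstract does not report the model intercept or the measurement units, so the `probability` helper takes a hypothetical intercept as an argument and is not the authors' fitted model.

```python
import math

# Odds ratios reported in the abstract (per one-unit increase of each marker;
# the measurement units are not stated in the abstract).
OR_UA = 1.021   # blood uric acid
OR_FPG = 1.575  # fasting glucose


def coefficient_from_or(odds_ratio: float) -> float:
    """Logistic-regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)


def odds_multiplier(odds_ratio: float, delta: float) -> float:
    """Factor by which the odds change when the predictor rises by `delta` units."""
    return odds_ratio ** delta


def probability(intercept: float, betas: list, values: list) -> float:
    """Predicted probability of a logistic model.

    The intercept is NOT given in the abstract; any value passed here is hypothetical.
    """
    z = intercept + sum(b * x for b, x in zip(betas, values))
    return 1.0 / (1.0 + math.exp(-z))


# Example: odds multiplier for a 50-unit rise in uric acid.
print(odds_multiplier(OR_UA, 50))
```

Because the odds ratio applies per unit, a small per-unit OR such as 1.021 can still correspond to a large change in odds across the clinically observed range of the marker, which is consistent with the abstract's remark that UA contributes more to the risk in the line graph.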
Association rule learning (ARL) is a widely used technique for discovering relationships within datasets. However, it often generates excessive irrelevant or ambiguous rules. Therefore, post-processing is crucial not only for removing irrelevant or redundant rules but also for uncovering hidden associations that impact other factors. Recently, several post-processing methods have been proposed, each with its own strengths and weaknesses. In this paper, we propose THAPE (Tunable Hybrid Associative Predictive Engine), which combines descriptive and predictive techniques. By leveraging both techniques, our aim is to enhance the quality of analyzing generated rules. This includes removing irrelevant or redundant rules, uncovering interesting and useful rules, exploring hidden association rules that may affect other factors, and providing backtracking ability for a given product. The proposed approach offers a tailored method that suits specific goals for retailers, enabling them to gain a better understanding of customer behavior based on factual transactions in the target market. We applied THAPE to a real dataset as a case study in this paper to demonstrate its effectiveness. Through this application, we successfully mined a concise set of highly interesting and useful association rules. Out of the 11,265 rules generated, we identified 125 rules that are particularly relevant to the business context. These identified rules significantly improve the interpretability and usefulness of association rules for decision-making purposes.
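To make the idea of rule post-processing concrete, the sketch below computes the standard support, confidence and lift of association rules on a toy basket dataset and keeps only rules that pass simple thresholds. This is a generic illustration of association rule filtering, not the THAPE method itself. The transactions, the thresholds and the function names are all hypothetical.

```python
# Toy transactions (hypothetical; not the retail dataset used in the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, txns):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in txns) / len(txns)


def rule_metrics(antecedent, consequent, txns):
    """Return (support, confidence, lift) of the rule antecedent -> consequent."""
    s_a = support(antecedent, txns)
    s_c = support(consequent, txns)
    s_ac = support(set(antecedent) | set(consequent), txns)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return s_ac, confidence, lift


def filter_rules(rules, txns, min_conf=0.6, min_lift=1.0):
    """Keep rules whose confidence is at least `min_conf` and lift exceeds `min_lift`."""
    kept = []
    for antecedent, consequent in rules:
        _, conf, lift = rule_metrics(antecedent, consequent, txns)
        if conf >= min_conf and lift > min_lift:
            kept.append((antecedent, consequent))
    return kept


candidate_rules = [({"bread"}, {"milk"}), ({"butter"}, {"bread"})]
print(filter_rules(candidate_rules, transactions))
```

A lift above 1 means the antecedent and consequent co-occur more often than independence would predict, so thresholding on lift is one common way to discard uninformative rules before any domain-specific review.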
The traditional method of screening plants for disease resistance phenotype is both time-consuming and costly. Genomic selection offers a potential solution to improve efficiency, but accurately predicting plant disease resistance remains a challenge. In this study, we evaluated eight different machine learning (ML) methods, including random forest classification (RFC), support vector classifier (SVC), light gradient boosting machine (lightGBM), random forest classification plus kinship (RFC_K), support vector classification plus kinship (SVC_K), light gradient boosting machine plus kinship (lightGBM_K), deep neural network genomic prediction (DNNGP), and densely connected convolutional networks (DenseNet), for predicting plant disease resistance. Our results demonstrate that the three plus kinship (K) methods developed in this study achieved high prediction accuracy. Specifically, these methods achieved accuracies of up to 95% for rice blast (RB), 85% for rice black-streaked dwarf virus (RBSDV), and 85% for rice sheath blight (RSB) when trained and applied to the rice diversity panel I (RDPI). Furthermore, the plus K models performed well in predicting wheat blast (WB) and wheat stripe rust (WSR) diseases, with mean accuracies of up to 90% and 93%, respectively. To assess the generalizability of our models, we applied the trained plus K methods to predict RB disease resistance in an independent population, rice diversity panel II (RDPII). Concurrently, we evaluated the RB resistance of RDPII cultivars using spray inoculation. Comparing the predictions with the spray inoculation results, we found that the accuracy of the plus K methods reached 91%. These findings highlight the effectiveness of the plus K methods (RFC_K, SVC_K, and lightGBM_K) in accurately predicting plant disease resistance for RB, RBSDV, RSB, WB, and WSR. The methods developed in this study not only provide valuable strategies for predicting disease resistance, but also pave the way for using machine learning to streamline genome-based crop breeding.
Ocean temperature is an important physical variable in marine ecosystems, and ocean temperature prediction is an important research objective in ocean-related fields. Currently, one commonly used approach to ocean temperature prediction is data-driven, but research on this approach is mostly limited to the sea surface, with few studies on the prediction of internal ocean temperature. Existing graph neural network-based methods usually use predefined graphs or learned static graphs, which cannot capture the dynamic associations among data. In this study, we propose a novel dynamic spatiotemporal graph neural network (DSTGN) to predict three-dimensional ocean temperature (3D-OT), which combines static graph learning and dynamic graph learning to automatically mine two unknown dependencies between sequences based on the original 3D-OT data without prior knowledge. Temporal and spatial dependencies in the time series were then captured using temporal and graph convolutions. We also integrated dynamic graph learning, static graph learning, graph convolution, and temporal convolution into an end-to-end framework for 3D-OT prediction using time-series grid data. In this study, we conducted prediction experiments using high-resolution 3D-OT from the Copernicus global ocean physical reanalysis, with data covering the vertical variation of temperature from the sea surface to 1000 m below the sea surface. We compared five mainstream models that are commonly used for ocean temperature prediction, and the results showed that the method achieved the best prediction results at all prediction scales.
Current location prediction methods do not consider movement behavior information other than time and location perception, so the movement characteristics of trajectory data cannot be well expressed, which in turn affects the accuracy of the prediction results. First, a new trajectory data expression method that associates the movement behavior information is given. The pre-association method is used to model the movement behavior information according to the individual movement behavior features and the group movement behavior features extracted from the trajectory sequence and the region. The movement behavior features based on pre-association may not always be the best for the prediction model. Therefore, through association analysis and importance analysis, the final association feature is selected from the pre-association features. After the association features are attached, the trajectory data are input into LSTM networks, and a genetic algorithm (GA) is used to optimize the combination of the length of the time window and the number of hidden layer nodes. The experimental results show that, compared with the original trajectory data, the trajectory data associated with the movement behavior information helps to improve the accuracy of location prediction.
Fusarium ear rot (FER) is a destructive maize fungal disease worldwide. In this study, three tropical maize populations consisting of 874 inbred lines were used to perform genome-wide association study (GWAS) and genomic prediction (GP) analyses of FER resistance. Broad phenotypic variation and high heritability for FER were observed, although it was highly influenced by large genotype-by-environment interactions. In the 874 inbred lines, GWAS with a general linear model (GLM) identified 3034 single-nucleotide polymorphisms (SNPs) significantly associated with FER resistance at the P-value threshold of 1×10^-5. The average phenotypic variation explained (PVE) by these associations was 3%, with a range from 2.33% to 6.92%, and 49 of these associations had PVE values greater than 5%. The GWAS analysis with a mixed linear model (MLM) identified 19 significantly associated SNPs at the P-value threshold of 1×10^-4. The average PVE of these associations was 1.60%, with a range from 1.39% to 2.04%. Within each of the three populations, the number of significantly associated SNPs identified by GLM and MLM ranged from 25 to 41, and from 5 to 22, respectively. Overlapping SNP associations across populations were rare. A few stable genomic regions conferring FER resistance were identified, located in bins 3.04/05, 7.02/04, 9.00/01, 9.04, 9.06/07, and 10.03/04. The genomic regions in bins 9.00/01 and 9.04 are new. GP produced moderate accuracies with genome-wide markers, and relatively high accuracies with SNP associations detected from GWAS. Moderate prediction accuracies were observed when the training and validation sets were closely related. These results implied that FER resistance in maize is controlled by minor QTL with small effects, and highly influenced by the genetic background of the populations studied. Genomic selection (GS) by incorporating SNP associations detected from GWAS is a promising tool for improving FER resistance in maize.
Recently the article "Perioperative von Willebrand factor dynamics are associated with liver regeneration and predict outcome after liver resection" was published in Hepatology [1]. Prof. Starlinger et al. aimed to assess the association of von Willebrand factor (vWF) levels and clinical outcome in patients with liver cancers post-liver resection (LR). Based on the mechanism that platelet accumulation in the liver may promote liver regeneration after partial LR in mice, they found a vWF-dependent pattern of platelet accumulation during liver regeneration in patients after surgery.
Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improve marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects, but is hindered by data heterogeneity and lack of interoperability. In this study, we used genomic and phenotypic data sets, focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and subsequently the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool to study relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer associations between markers and traits were found in the larger combined data than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. Therefore, the results show that the integration of medium-sized to Big Data is an approach to increase the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
Single nucleotide polymorphism (SNP) is an important factor for the study of genetic variation in human families and animal and plant strains. Therefore, it is widely used in the study of population genetics and disease-related genes. In pharmacogenomics research, identifying the association between SNP sites and drugs is the key to clinical precision medication. Therefore, a predictive model of SNP site and drug association based on a denoising variational auto-encoder (DVAE-SVM) is proposed. Firstly, the k-mer algorithm is used to construct the initial SNP site feature vector; meanwhile, the MACCS molecular fingerprint is introduced to generate the feature vector of the drug module. Then, we use the DVAE to extract the effective features of the initial feature vector of the SNP site. Finally, the effective feature vector of the SNP site and the feature vector of the drug module are fused and input to the support vector machine (SVM) to predict the relationship between SNP site and drug module. The results of five-fold cross-validation experiments indicate that the proposed algorithm performs better than random forest (RF) and logistic regression (LR) classification. Further experiments show that, compared with the feature extraction algorithms of principal component analysis (PCA), denoising auto-encoder (DAE) and variational auto-encoder (VAE), the proposed algorithm has better prediction results.
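The k-mer step mentioned in the abstract above can be pictured with the following minimal sketch, which turns a DNA sequence into a fixed-length vector of k-mer counts. The abstract does not state the value of k, the flanking-sequence length, or any normalization, so `k=3`, the function name `kmer_vector`, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration.

```python
from itertools import product


def kmer_vector(seq: str, k: int = 3) -> list:
    """Count occurrences of every possible DNA k-mer in `seq`.

    Returns a vector of length 4**k ordered lexicographically over "ACGT".
    Windows containing other characters (for example N) are skipped.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    vec = [0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in index:
            vec[index[window]] += 1
    return vec


# Example: a hypothetical sequence around a SNP site, encoded with 2-mers.
features = kmer_vector("ACGTAC", k=2)
print(len(features), sum(features))
```

A count vector of this kind has the same length for every sequence, which is what allows it to be passed to a downstream encoder or classifier regardless of the input sequence length.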
Objective: The rate of post-operative complications has increased in recent years with the changes in patients' age, prolonged duration, more severe and diffuse lesions, and more patients with complications. We try to identify the risk factors associated with prolonged stay in the intensive care unit (ICU) after coronary artery bypass graft surgery (CABG). Methods: 1623 patients who received CABG surgery in Beijing Anzhen Hospital
Our previous study demonstrated that the human KIAA0100 gene was a novel acute monocytic leukemia-associated antigen (MLAA) gene. But the functional characterization of the human KIAA0100 gene has remained unknown to date. Here, firstly, bioinformatic prediction of the human KIAA0100 gene was carried out using online software tools. Secondly, human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system in U937 cells. Cell proliferation and apoptosis were next evaluated in KIAA0100-knockdown U937 cells. The bioinformatic prediction showed that the human KIAA0100 gene was located on 17q11.2, and the human KIAA0100 protein was located in the secretory pathway. Besides, the human KIAA0100 protein contained a signal peptide, a transmembrane region, three types of secondary structures (alpha helix, extended strand, and random coil), and four domains from mitochondrial protein 27 (FMP27). The observation on functional characterization of the human KIAA0100 gene revealed that its downregulation inhibited cell proliferation and promoted cell apoptosis in U937 cells. To summarize, these results suggest the human KIAA0100 gene possibly comes within the mitochondrial genome; moreover, it is a novel anti-apoptotic factor related to carcinogenesis or progression in acute monocytic leukemia, and may be a potential target for immunotherapy against acute monocytic leukemia.
To evaluate and predict liver fibrosis in patients with nonalcoholic fatty liver disease (NAFLD), several non-invasive scoring systems were built and widely used in the process of diagnosis and treatment, and showed great diagnostic efficiency, such as the aspartate aminotransferase to platelet ratio index, the fibrosis-4 index, the body mass index, aspartate aminotransferase to alanine aminotransferase ratio, diabetes (BARD) score, and the NAFLD fibrosis score. Since the new concept of metabolic associated fatty liver disease (MAFLD) was proposed, the clinical application value of the non-invasive scoring systems mentioned above has not been assessed in MAFLD. The evaluation of the diagnostic performance of these non-invasive scoring systems will provide references for clinicians in the diagnosis of MAFLD.
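For readers unfamiliar with the scores named in the abstract above, the sketch below implements two of them using their commonly cited definitions: the fibrosis-4 index, FIB-4 = (age × AST) / (platelets × √ALT), and the aspartate aminotransferase to platelet ratio index, APRI = (AST / upper limit of normal AST) × 100 / platelets. These formulas are not given in the abstract itself. The function and parameter names are illustrative, the upper limit of normal for AST is laboratory-specific and must be supplied by the caller, and no diagnostic cut-offs are implied.

```python
import math


def fib4(age_years: float, ast_u_l: float, alt_u_l: float, platelets_1e9_l: float) -> float:
    """Fibrosis-4 index: (age x AST) / (platelets x sqrt(ALT)).

    AST and ALT in U/L, platelet count in 10^9/L.
    """
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l: float, ast_uln_u_l: float, platelets_1e9_l: float) -> float:
    """AST to platelet ratio index: (AST / ULN of AST) x 100 / platelets.

    `ast_uln_u_l` is the laboratory's upper limit of normal for AST (caller-supplied).
    """
    return (ast_u_l / ast_uln_u_l) * 100.0 / platelets_1e9_l


# Example with made-up laboratory values (not patient data from any study).
print(fib4(age_years=50, ast_u_l=40, alt_u_l=25, platelets_1e9_l=200))
print(apri(ast_u_l=40, ast_uln_u_l=40, platelets_1e9_l=200))
```

Both scores rely only on routine laboratory values and age, which is why they are attractive as non-invasive screens. Whether their established thresholds transfer to MAFLD is exactly the open question the abstract raises.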
The stomach is the most frequently involved site for extranodal lymphomas, accounting for nearly two-thirds of all gastrointestinal cases. It is widely accepted that gastric B-cell, low-grade mucosal-associated lymphoid tissue (MALT)-lymphoma is caused by Helicobacter pylori (H. pylori) infection. MALT-lymphomas may engender different clinical and endoscopic patterns. Often, diagnosis is confirmed in patients with only vague dyspeptic symptoms and without macroscopic lesions on gastric mucosa. H. pylori eradication leads to lymphoma remission in a large number of patients when treatment occurs at an early stage (Ⅰ-Ⅱ1). Neoplasia confined to the submucosa, localized in the antral region of the stomach, and without API2-MALT1 translocation, shows a high probability of remission following H. pylori eradication. When both bacterial infection and lymphoma recur, further eradication therapy is generally effective. Radiotherapy, chemotherapy and, in selected cases, surgery are the available therapeutic options with a high success rate for those patients who fail to achieve remission, while data on immunotherapy with monoclonal antibodies (rituximab) are still scarce. The 5-year survival rate is higher than 90%, but careful, long-term follow-up is required in these patients since lymphoma recurrence has been reported in some cases.
Software defects are managed through the knowledge base, and defect management is upgraded from the data level to the knowledge level. The rule knowledge is mined from bug data based on a rule-based knowledge extraction model, and the appropriate strategy is configured in the strategy layer to predict software defects. The model is extracted by direct association rules and extended association rules, which improve the prediction rate of related defects and the efficiency of software testing.
Accurately predicting the chiller coefficient of performance (COP) is essential for improving the energy efficiency of heating, ventilation, and air conditioning (HVAC) systems, significantly contributing to energy conservation in buildings. Traditional performance prediction methods often overlook the dynamic interaction among sensor variables and face challenges in using extensive historical data efficiently, which impedes accurate predictions. To overcome these challenges, this paper proposes an innovative on-site chiller performance prediction method employing a dynamic graph convolutional network (GCN) enhanced by association rules. The distinctive feature of this method is constructing an association graph bank containing static graphs in each operating mode by mining the association rules between various sensor variables in historical operating data. A real-time graph is created by analyzing the correlation between various sensor variables in the current operating data. This graph is fused online with the static graph in the current operating mode to obtain a dynamic graph used for feature extraction and training of GCN. The effectiveness of this method has been empirically confirmed through the operational data of an actual building chiller system. Comparative analysis with state-of-the-art methods highlights the superior performance of the proposed method.
Funding (maize stalk rot GWAS and genomic prediction study): funded by the CGIAR Research Program (CRP) on MAIZE; the USAID through the Accelerating Genetic Gains Supplemental Project (Amend. No. 9 MTO 069033); and the One CGIAR Initiative on Accelerated Breeding; funding from the governments of Australia, Belgium, Canada, China, France, India, Japan, the Republic of Korea, Mexico, the Netherlands, New Zealand, Norway, Sweden, Switzerland, the United Kingdom, the United States, and the World Bank; supported by the China Scholarship Council.
Funding (moderate-to-severe MAFLD prediction model study): Clinical Medical Center Construction Project of Hainan Province (No. 2021818); Construction of Innovation Center of Academician Team of Hainan Province (No. 2022136); Academician Innovation Platform of Hainan Province (No. 00817378); Health Industry Research Project of Hainan Province (No. 22A200078); Innovative Research Project of Hainan Graduate Students (No. Qhyb2022-133).
Funding (machine learning prediction of plant disease resistance study): supported by the National Natural Science Foundation of China (32261143468); the National Key Research and Development (R&D) Program of China (2021YFC2600400); the Seed Industry Revitalization Project of Jiangsu Province (JBGS(2021)001); the Project of Zhongshan Biological Breeding Laboratory (BM2022008-02).
Funding: The National Key R&D Program of China under contract No. 2021YFC3101603.
Abstract: Ocean temperature is an important physical variable in marine ecosystems, and ocean temperature prediction is an important research objective in ocean-related fields. Currently, data-driven methods are among the most commonly used approaches to ocean temperature prediction, but research on them is mostly limited to the sea surface, with few studies on the prediction of internal ocean temperature. Existing graph neural network-based methods usually use predefined graphs or learned static graphs, which cannot capture the dynamic associations among data. In this study, we propose a novel dynamic spatiotemporal graph neural network (DSTGN) to predict three-dimensional ocean temperature (3D-OT), which combines static graph learning and dynamic graph learning to automatically mine two unknown dependencies between sequences based on the original 3D-OT data without prior knowledge. Temporal and spatial dependencies in the time series were then captured using temporal and graph convolutions. We also integrated dynamic graph learning, static graph learning, graph convolution, and temporal convolution into an end-to-end framework for 3D-OT prediction using time-series grid data. We conducted prediction experiments using high-resolution 3D-OT from the Copernicus global ocean physical reanalysis, with data covering the vertical variation of temperature from the sea surface to 1000 m below the sea surface. We compared the method with five mainstream models that are commonly used for ocean temperature prediction, and the results showed that it achieved the best prediction results at all prediction scales.
Funding: supported by the Hunan University of Science and Technology Doctoral Research Foundation Project (E51873).
Abstract: Current location prediction methods consider little movement behavior information beyond time and location, so the movement characteristics of trajectory data cannot be well expressed, which in turn affects the accuracy of the prediction results. First, a new method of expressing trajectory data by associating movement behavior information is given. The pre-association method is used to model the movement behavior information according to the individual movement behavior features and the group movement behavior features extracted from the trajectory sequence and the region. The movement behavior features based on pre-association may not always be the best for the prediction model. Therefore, through association analysis and importance analysis, the final association features are selected from the pre-association features. The trajectory data, once associated with these features, are fed into LSTM networks, and a genetic algorithm (GA) is used to optimize the combination of the time-window length and the number of hidden-layer nodes. The experimental results show that, compared with the original trajectory data, the trajectory data associated with the movement behavior information helps to improve the accuracy of location prediction.
Funding: The authors gratefully acknowledge the financial support from the MasAgro project funded by Mexico’s Secretary of Agriculture and Rural Development (SADER), the Genomic Open-source Breeding Informatics Initiative (GOBII) (grant number OPP1093167) supported by the Bill & Melinda Gates Foundation, and the CGIAR Research Program (CRP) on maize (MAIZE). MAIZE receives W1&W2 support from the Governments of Australia, Belgium, Canada, China, France, India, Japan, the Republic of Korea, Mexico, Netherlands, New Zealand, Norway, Sweden, Switzerland, the United Kingdom, USA, and the World Bank. The authors also thank the National Natural Science Foundation of China (grant number 31801442), the CIMMYT–China Specialty Maize Research Center Project funded by the Shanghai Municipal Finance Bureau, and the China Scholarship Council.
Abstract: Fusarium ear rot (FER) is a destructive maize fungal disease worldwide. In this study, three tropical maize populations consisting of 874 inbred lines were used to perform genome-wide association study (GWAS) and genomic prediction (GP) analyses of FER resistance. Broad phenotypic variation and high heritability for FER were observed, although it was highly influenced by large genotype-by-environment interactions. In the 874 inbred lines, GWAS with a general linear model (GLM) identified 3034 single-nucleotide polymorphisms (SNPs) significantly associated with FER resistance at the P-value threshold of 1×10^(-5); the average phenotypic variation explained (PVE) by these associations was 3%, with a range from 2.33% to 6.92%, and 49 of these associations had PVE values greater than 5%. GWAS with a mixed linear model (MLM) identified 19 significantly associated SNPs at the P-value threshold of 1×10^(-4); the average PVE of these associations was 1.60%, with a range from 1.39% to 2.04%. Within each of the three populations, the number of significantly associated SNPs identified by GLM and MLM ranged from 25 to 41 and from 5 to 22, respectively. Overlapping SNP associations across populations were rare. A few stable genomic regions conferring FER resistance were identified, which are located in bins 3.04/05, 7.02/04, 9.00/01, 9.04, 9.06/07, and 10.03/04. The genomic regions in bins 9.00/01 and 9.04 are new. GP produced moderate accuracies with genome-wide markers, and relatively high accuracies with SNP associations detected from GWAS. Moderate prediction accuracies were observed when the training and validation sets were closely related. These results imply that FER resistance in maize is controlled by minor QTL with small effects and is highly influenced by the genetic background of the populations studied. Genomic selection (GS) incorporating SNP associations detected from GWAS is a promising tool for improving FER resistance in maize.
Funding: supported by grants from the National Science and Technology Major Project (2017ZX10203201) and the opening foundation of the State Key Laboratory for Diagnosis and Treatment of Infectious Diseases and Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, First Affiliated Hospital, Zhejiang University School of Medicine (2015KF04).
Abstract: Recently, the article "Perioperative von Willebrand factor dynamics are associated with liver regeneration and predict outcome after liver resection" was published in Hepatology [1]. Prof. Starlinger et al. aimed to assess the association of von Willebrand factor (vWF) levels and clinical outcome in patients with liver cancers after liver resection (LR). Based on the mechanism that platelet accumulation in the liver may promote liver regeneration after partial LR in mice, they found a vWF-dependent pattern of platelet accumulation during liver regeneration in patients after surgery.
Funding: funding within the Wheat BigData Project (German Federal Ministry of Food and Agriculture, FKZ2818408B18).
Abstract: Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improve marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects, but is hindered by data heterogeneity and lack of interoperability. In this study, we used genomic and phenotypic data sets, focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high-quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool to study the relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer associations between markers and traits were found in the larger combined data than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. Therefore, the results show that the integration of medium-sized data sets into Big Data is an approach to increase the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
Funding: Lanzhou Talent Innovation and Entrepreneurship Project (No. 2020-RC-14).
Abstract: Single nucleotide polymorphism (SNP) is an important factor in the study of genetic variation in human families and in animal and plant strains. It is therefore widely used in the study of population genetics and disease-related genes. In pharmacogenomics research, identifying the association between SNP sites and drugs is the key to clinical precision medication; therefore, a predictive model of SNP site and drug association based on a denoising variational auto-encoder (DVAE-SVM) is proposed. First, the k-mer algorithm is used to construct the initial SNP site feature vector, and the MACCS molecular fingerprint is introduced to generate the feature vector of the drug module. Then, the DVAE is used to extract the effective features of the initial feature vector of the SNP site. Finally, the effective feature vector of the SNP site and the feature vector of the drug module are fused and input to a support vector machine (SVM) to predict the relationship between SNP site and drug module. The results of five-fold cross-validation experiments indicate that the proposed algorithm performs better than random forest (RF) and logistic regression (LR) classification. Further experiments show that, compared with the feature extraction algorithms of principal component analysis (PCA), denoising auto-encoder (DAE), and variational auto-encoder (VAE), the proposed algorithm has better prediction results.
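As an illustration of the k-mer step mentioned above, the sketch below turns a DNA sequence into a fixed-length vector of normalized k-mer frequencies. The abstract does not give the value of k, the normalization, or how ambiguous bases are handled, so those choices (and the function name and example sequence) are assumptions.

```python
from itertools import product


def kmer_vector(seq, k=3):
    """Return normalized k-mer frequencies, ordered lexicographically over ACGT.

    Windows containing characters other than A, C, G, T (for example N)
    are skipped; this handling is an assumption, not taken from the paper.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


# Hypothetical flanking sequence around a SNP site.
vec = kmer_vector("ACGTACGTNACG", k=2)
```

The vector has 4^k entries regardless of sequence length, which is what makes it usable as input to a downstream encoder or classifier.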
Abstract: Objective: The rate of post-operative complications has increased in recent years with changes in patients’ age, prolonged duration, more severe and diffuse lesions, and more patients with complications. We sought to identify the risk factors associated with prolonged stay in the intensive care unit (ICU) after coronary artery bypass graft surgery (CABG). Methods: 1623 patients who received CABG surgery in Beijing Anzhen Hospital
Funding: financial support from the National Natural Science Foundation of China (No. 081718110232)
Abstract: Our previous study demonstrated that the human KIAA0100 gene is a novel acute monocytic leukemia-associated antigen (MLAA) gene, but its function has remained uncharacterized to date. Here, first, bioinformatic prediction of the human KIAA0100 gene was carried out using online software tools. Second, human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system in U937 cells. Cell proliferation and apoptosis were then evaluated in KIAA0100-knockdown U937 cells. The bioinformatic prediction showed that the human KIAA0100 gene is located on 17q11.2 and that the human KIAA0100 protein is located in the secretory pathway. In addition, the human KIAA0100 protein contains a signal peptide, a transmembrane region, three types of secondary structure (alpha helix, extended strand, and random coil), and four domains from mitochondrial protein 27 (FMP27). The functional characterization revealed that downregulation of the human KIAA0100 gene inhibited cell proliferation and promoted cell apoptosis in U937 cells. To summarize, these results suggest that the human KIAA0100 gene possibly comes within the mitochondrial genome; moreover, it is a novel anti-apoptotic factor related to carcinogenesis or progression in acute monocytic leukemia, and may be a potential target for immunotherapy against acute monocytic leukemia.
Abstract: To evaluate and predict liver fibrosis in patients with nonalcoholic fatty liver disease (NAFLD), several non-invasive scoring systems have been built and widely used in diagnosis and treatment, and they have shown great diagnostic efficiency. These include the aspartate aminotransferase to platelet ratio index, the fibrosis-4 index, the body mass index, aspartate aminotransferase to alanine aminotransferase ratio, diabetes score, and the NAFLD fibrosis score. Since the new concept of metabolic associated fatty liver disease (MAFLD) was proposed, the clinical application value of these non-invasive scoring systems has not been assessed in MAFLD. Evaluating the diagnostic performance of these non-invasive scoring systems will provide references for clinicians in the diagnosis of MAFLD.
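Two of the scores named above are simple arithmetic on routine laboratory values, so a worked sketch may help. The formulas below are the commonly cited definitions of the fibrosis-4 index and the aspartate aminotransferase to platelet ratio index (APRI); they are not stated in the abstract itself. The function names and the patient values are hypothetical, and no diagnostic cutoffs are implied.

```python
import math


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Fibrosis-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


def apri(ast_u_per_l, ast_upper_limit_u_per_l, platelets_1e9_per_l):
    """APRI: (AST / upper limit of normal AST) / platelets x 100."""
    return (ast_u_per_l / ast_upper_limit_u_per_l) / platelets_1e9_per_l * 100


# Made-up example patient: 50 years old, AST 40 U/L, ALT 64 U/L,
# platelets 200 x 10^9/L, laboratory upper limit of normal for AST 40 U/L.
example_fib4 = fib4(50, 40, 64, 200)   # (50 * 40) / (200 * 8) = 1.25
example_apri = apri(40, 40, 200)       # (40 / 40) / 200 * 100 = 0.5
```

Because the inputs are the same in NAFLD and MAFLD populations, only the interpretation thresholds would need re-validation, which is the question the abstract raises.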
Abstract: The stomach is the most frequently involved site for extranodal lymphomas, accounting for nearly two-thirds of all gastrointestinal cases. It is widely accepted that gastric B-cell, low-grade mucosal-associated lymphoid tissue (MALT) lymphoma is caused by Helicobacter pylori (H. pylori) infection. MALT lymphomas may engender different clinical and endoscopic patterns. Often, diagnosis is confirmed in patients with only vague dyspeptic symptoms and without macroscopic lesions on gastric mucosa. H. pylori eradication leads to lymphoma remission in a large number of patients when treatment occurs at an early stage (Ⅰ-Ⅱ1). Neoplasia confined to the submucosa, localized in the antral region of the stomach, and without API2-MALT1 translocation shows a high probability of remission following H. pylori eradication. When both bacterial infection and lymphoma recur, further eradication therapy is generally effective. Radiotherapy, chemotherapy and, in selected cases, surgery are the available therapeutic options with a high success rate for those patients who fail to achieve remission, while data on immunotherapy with monoclonal antibodies (rituximab) are still scarce. The 5-year survival rate is higher than 90%, but careful, long-term follow-up is required in these patients since lymphoma recurrence has been reported in some cases.
Abstract: Software defects are managed through a knowledge base, and defect management is upgraded from the data level to the knowledge level. Rule knowledge is mined from bug data using a rule-based knowledge extraction model, and an appropriate strategy is configured in the strategy layer to predict software defects. The model extracts both direct association rules and extended association rules, which improves the prediction rate of related defects and the efficiency of software testing.
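To make the idea of mining rule knowledge from bug data concrete, here is a minimal sketch of pairwise association-rule mining using the standard support and confidence measures. It covers only simple one-to-one rules; the abstract's "extended association rules" are not defined there and are not modeled. The bug attributes, thresholds, and function name are hypothetical.

```python
from itertools import combinations


def mine_pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) between single items."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Fraction of records that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Made-up bug records, each a set of attribute=value items.
bugs = [
    {"module=parser", "type=null_pointer", "severity=high"},
    {"module=parser", "type=null_pointer", "severity=low"},
    {"module=ui", "type=layout", "severity=low"},
    {"module=parser", "type=null_pointer", "severity=high"},
    {"module=ui", "type=null_pointer", "severity=low"},
]
rules = mine_pair_rules(bugs)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

A rule such as `module=parser -> type=null_pointer` could then guide testers toward the defect type most likely to accompany a newly reported bug in that module.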
Funding: supported in part by the Science and Technology Innovation Program of Hunan Province (No. 2022RC1090), in part by the National Natural Science Foundation of China (No. 62173349), in part by the Natural Science Foundation of Hunan Province (No. 2022J20076), in part by the Innovation Driven Projection of Central South University (No. 2023CXQD073), and in part by the Major Program of Xiangjiang Laboratory (No. 22XJ01005).
Abstract: Accurately predicting the chiller coefficient of performance (COP) is essential for improving the energy efficiency of heating, ventilation, and air conditioning (HVAC) systems, significantly contributing to energy conservation in buildings. Traditional performance prediction methods often overlook the dynamic interaction among sensor variables and face challenges in using extensive historical data efficiently, which impedes accurate predictions. To overcome these challenges, this paper proposes an innovative on-site chiller performance prediction method employing a dynamic graph convolutional network (GCN) enhanced by association rules. The distinctive feature of this method is constructing an association graph bank containing static graphs in each operating mode by mining the association rules between various sensor variables in historical operating data. A real-time graph is created by analyzing the correlation between various sensor variables in the current operating data. This graph is fused online with the static graph in the current operating mode to obtain a dynamic graph used for feature extraction and training of the GCN. The effectiveness of this method has been empirically confirmed through the operational data of an actual building chiller system. Comparative analysis with state-of-the-art methods highlights the superior performance of the proposed method.
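The graph-fusion step described above can be sketched roughly as follows. The abstract does not specify the correlation measure or the fusion rule, so this example assumes absolute Pearson correlation for the real-time graph and a convex combination with weight `alpha` for the fusion. The sensor names, readings, and static graph are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def realtime_graph(window):
    """Build an adjacency matrix of |correlation| from current sensor readings."""
    names = sorted(window)
    adj = [[abs(pearson(window[a], window[b])) for b in names] for a in names]
    return names, adj


def fuse(static_adj, realtime_adj, alpha=0.5):
    """Assumed fusion rule: alpha * static + (1 - alpha) * real-time."""
    return [
        [alpha * s + (1 - alpha) * r for s, r in zip(s_row, r_row)]
        for s_row, r_row in zip(static_adj, realtime_adj)
    ]


# Hypothetical current operating window with three sensor variables.
window = {
    "chw_flow": [1.0, 2.0, 3.0, 4.0],
    "compressor_power": [2.0, 4.0, 6.0, 8.0],
    "valve_state": [1.0, 0.0, 1.0, 0.0],
}
# Hypothetical static graph for the current operating mode (same sensor order).
static_adj = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]

names, rt_adj = realtime_graph(window)
dynamic_adj = fuse(static_adj, rt_adj, alpha=0.5)
```

The resulting `dynamic_adj` is the kind of matrix a graph convolution layer would consume; a learned or mode-dependent fusion weight would be an equally plausible design.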