Abstract: N-11-azaartemisinins potentially active against Plasmodium falciparum are designed by combining molecular electrostatic potential (MEP), ligand-receptor interaction, and models built with supervised machine learning methods (PCA, HCA, KNN, SIMCA, and SDA). The optimization of molecular structures was performed using the B3LYP/6-31G* approach. MEP maps and ligand-receptor interactions were used to investigate key structural features required for biological activity and the likely interactions between N-11-azaartemisinins and heme, respectively. The supervised machine learning methods allowed the separation of the investigated compounds into two classes, cha and cla, with the properties ε<sub>LUMO+1</sub> (energy one level above the lowest unoccupied molecular orbital), d(C<sub>6</sub>-C<sub>5</sub>) (distance between the C<sub>6</sub> and C<sub>5</sub> atoms in the ligands), and TSA (total surface area) responsible for the classification. The insights extracted from this investigation, together with chemical intuition, enabled the design of sixteen new N-11-azaartemisinins (prediction set), to which the supervised machine learning models were then applied. This application identified twelve new N-11-azaartemisinins as promising candidates for synthesis and biological evaluation.
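The class-separation idea can be sketched with a toy principal-component analysis: project compounds described by the three decisive descriptors (ε_LUMO+1, d(C6-C5), TSA) onto the leading components and check whether the two activity classes separate. All descriptor values below are synthetic illustrations, not the paper's data, and plain NumPy PCA stands in for the full PCA/HCA/KNN/SIMCA/SDA pipeline:

```python
import numpy as np

# Synthetic descriptor matrix: rows = compounds, columns =
# [eps_LUMO+1 (eV), d(C6-C5) (angstrom), TSA (angstrom^2)].
# The cluster centres are invented for illustration, not taken from the paper.
rng = np.random.default_rng(0)
active = rng.normal([-0.5, 1.52, 320.0], [0.05, 0.01, 10.0], size=(10, 3))
inactive = rng.normal([0.3, 1.56, 280.0], [0.05, 0.01, 10.0], size=(10, 3))
X = np.vstack([active, inactive])

# Standardise each descriptor, then PCA via SVD of the centred data.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = Xs @ Vt.T  # principal-component scores

# With well-separated descriptor distributions, the two activity classes
# fall on opposite sides of the first principal component.
print(scores[:10, 0].mean() * scores[10:, 0].mean() < 0)  # True
```

In practice the loading vector `Vt[0]` would also be inspected to see which descriptors drive the separation, mirroring the paper's attribution of the classification to the three properties above.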
Funding: Supported by the Natural Science Foundation of China (Grant Nos. 41807285, 41972280 and 52179103).
Abstract: To perform landslide susceptibility prediction (LSP), it is important to select an appropriate mapping unit and landslide-related conditioning factors. The efficient and automatic multi-scale segmentation (MSS) method proposed by the authors promotes the application of slope units. However, LSP modeling based on these slope units has not been performed. Moreover, the heterogeneity of conditioning factors within slope units is neglected, leading to incomplete input variables for LSP modeling. In this study, the slope units extracted by the MSS method are used for LSP modeling, and the heterogeneity of conditioning factors is represented by their internal variations within each slope unit, using the descriptive statistics of mean, standard deviation and range. Thus, slope unit-based machine learning models considering the internal variations of conditioning factors (variant Slope-machine learning models) are proposed. Chongyi County is selected as the case study and is divided into 53,055 slope units. Fifteen original slope unit-based conditioning factors are expanded to 38 by considering their internal variations. Random forest (RF) and multi-layer perceptron (MLP) machine learning models are used to construct variant Slope-RF and Slope-MLP models. Meanwhile, Slope-RF and Slope-MLP models without the internal variations of conditioning factors, as well as conventional grid unit-based machine learning models (Grid-RF and Grid-MLP), are built for comparison through LSP performance assessments. Results show that the variant Slope-machine learning models achieve higher LSP performance than the Slope-machine learning models, and that the LSP results of variant Slope-machine learning models have stronger directivity and practical applicability than those of Grid-machine learning models. It is concluded that slope units extracted by the MSS method are appropriate for LSP modeling, and that the heterogeneity of conditioning factors within slope units more comprehensively reflects the relationships between conditioning factors and landslides. These results provide an important reference for land use planning and landslide prevention.
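The factor-expansion step described above, replacing each slope-unit factor by its mean, standard deviation and range over the cells in the unit, can be sketched as follows; the grid, unit ids and slope values are made-up placeholders, not the Chongyi County data:

```python
import numpy as np

# Hypothetical raster: each grid cell carries a slope-unit id and a
# conditioning-factor value (slope angle in degrees); values are made up.
unit_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
slope_deg = np.array([10.0, 12.0, 14.0, 30.0, 34.0, 5.0, 6.0, 7.0, 8.0])

def unit_features(ids, values):
    """Expand one factor into per-unit mean, standard deviation and range."""
    feats = {}
    for u in np.unique(ids):
        v = values[ids == u]
        feats[int(u)] = (float(v.mean()), float(v.std()), float(v.max() - v.min()))
    return feats

feats = unit_features(unit_ids, slope_deg)
m, s, r = feats[0]
print(m, round(s, 2), r)  # 12.0 1.63 4.0
```

Applying this to each of the fifteen original factors yields the three-statistics expansion that grows the feature set, matching the spirit of the 15-to-38 factor expansion in the abstract.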
Funding: The authors acknowledge the support of the Monash-IITB Academy Scholarship and the Australian Research Council for funding the present research (DP190103592).
Abstract: Machine learning (ML) models provide great opportunities to accelerate novel material development, offering a virtual alternative to laborious and resource-intensive empirical methods. In this work, the second of a two-part study, an ML approach is presented for the accelerated digital design of Mg alloys. Four ML regression algorithms were systematically evaluated to rationalise the complex relationships in Mg-alloy data and to capture composition-processing-property patterns. Cross-validation and hold-out validation were used for unbiased estimation of model performance. Using atomic and thermodynamic properties of the alloys, feature augmentation was examined to define the most descriptive representation spaces for the alloy data. Additionally, a graphical user interface (GUI) webtool was developed to facilitate the use of the proposed models in predicting the mechanical properties of new Mg alloys. The results demonstrate that the random forest and neural network regression models are robust predictors of the ultimate tensile strength and ductility of Mg alloys, with accuracies of ~80% and ~70%, respectively. The models developed in this work are a step towards high-throughput screening of novel candidates with target mechanical properties and provide ML-guided alloy design.
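The cross-validation protocol the authors rely on for unbiased performance estimation can be illustrated with a minimal k-fold loop; the synthetic features stand in for alloy composition/processing descriptors, the target for ultimate tensile strength, and ordinary least squares for the RF/NN regressors:

```python
import numpy as np

# Synthetic stand-ins for alloy descriptors (X) and ultimate tensile
# strength (y); the linear ground truth below is invented so that the
# cross-validation mechanics, not the regressor, are the point.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
y = X @ np.array([3.0, -2.0, 0.5, 1.0]) + rng.normal(scale=0.1, size=60)

def kfold_r2(X, y, k=5):
    """Average held-out R^2 over k folds with ordinary least squares."""
    idx = np.arange(len(y))
    scores = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        pred = X[fold] @ w
        ss_res = ((y[fold] - pred) ** 2).sum()
        ss_tot = ((y[fold] - y[fold].mean()) ** 2).sum()
        scores.append(1.0 - ss_res / ss_tot)
    return float(np.mean(scores))

print(kfold_r2(X, y) > 0.9)  # True: low-noise synthetic data is easy
```

Because every sample is scored only by a model that never saw it, the averaged R² is an approximately unbiased estimate of out-of-sample performance, which is the property the abstract's validation scheme is after.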
Funding: Jointly supported by the National Natural Science Foundation of China (Grant No. 42005037); the Special Project of Innovative Development, CMA (CXFZ2021J022, CXFZ2022J008, and CXFZ2021J028); the Liaoning Provincial Natural Science Foundation Project (Ph.D. Start-up Research Fund 2019-BS214); and Research Projects of the Institute of Atmospheric Environment, CMA (2021SYIAEKFMS08, 2020SYIAE08 and 2021SYIAEKFMS09).
Abstract: The Northeast China cold vortex (NCCV) during late summer (July-August) is identified and classified into three types according to its movement path using machine learning. The relationships of the three NCCV types' intensity with atmospheric circulations in late summer, and with sea surface temperature (SST) and Arctic sea ice concentration (SIC) in the preceding months, are analyzed. Sensitivity tests with the Community Atmosphere Model version 5.3 (CAM5.3) are used to verify the statistical results. The results show that the coordinated pattern of the East Asia-Pacific (EAP) and Lake Baikal high pressure, forced by SST anomalies in the North Indian Ocean dipole mode (NIOD) during the preceding April and SIC anomalies in the Nansen Basin during the preceding June, produces the intensity anomaly of the first NCCV type. In contrast, the pattern of high pressure over the Urals and the Sea of Okhotsk and low pressure over Lake Baikal during late summer, forced by SST anomalies in the South Indian Ocean dipole mode (SIOD) in the preceding June and SIC anomalies in the Barents Sea in the preceding April, causes the intensity anomaly of the second type. The third type is atypical and is not analyzed in detail. Sensitivity tests jointly forced by the preceding SST and SIC reproduce the observations well, whereas results forced separately by SST or SIC are poor, indicating that the late-summer NCCV is likely influenced by the coordinated effects of both SST and SIC in the preceding months.
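Path-based classification of vortex types can be illustrated with a minimal k-means clustering of track displacement vectors; the three synthetic groups below are placeholders rather than NCCV tracks, and the fixed initial centres are a simplification of a production k-means (the abstract does not specify which ML method was used):

```python
import numpy as np

# Each synthetic "track" is reduced to a (longitude, latitude) displacement
# vector; the three groups mimic distinct movement paths but are not NCCV data.
rng = np.random.default_rng(2)
paths = np.vstack([
    rng.normal([5.0, 0.0], 0.3, size=(8, 2)),    # eastward-moving type
    rng.normal([0.0, -4.0], 0.3, size=(8, 2)),   # southward-moving type
    rng.normal([-3.0, 3.0], 0.3, size=(8, 2)),   # northwestward-moving type
])

def kmeans(X, centres, iters=10):
    """Plain k-means with fixed initial centres (a simplification)."""
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centres[None]) ** 2).sum(-1), axis=1)
        centres = np.array([X[labels == j].mean(axis=0)
                            for j in range(len(centres))])
    return labels

labels = kmeans(paths, centres=paths[[0, 8, 16]])
# Each synthetic group receives a single, distinct cluster label.
print(len(set(labels.tolist())))  # 3
```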
Abstract: Drought is the least understood natural disaster due to the complex relationships among multiple contributory factors. Its beginning and end are hard to gauge, and droughts can last for months or even years. India has faced many droughts in the last few decades, and predicting future droughts is vital for framing drought management plans that sustain natural resources. Data-driven modelling for meteorological time-series forecasting is becoming more powerful and flexible with computational intelligence techniques. Machine learning (ML) techniques have demonstrated success in drought prediction and are becoming popular for weather prediction, especially of the minimum temperature using backpropagation algorithms. Popular ML techniques for weather forecasting include support vector machines (SVM), support vector regression, random forest, decision tree, logistic regression, Naive Bayes, linear regression, gradient boosting tree, k-nearest neighbours (KNN), the adaptive neuro-fuzzy inference system, feed-forward neural networks, Markov chains, Bayesian networks, hidden Markov models, autoregressive moving averages, evolutionary algorithms, deep learning, and many more. This paper presents a recent review of the literature on ML in drought prediction, covering drought indices, datasets, and performance metrics.
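A drought index of the kind the review surveys can be sketched as a standardised accumulated-rainfall score; a plain z-score stands in for the gamma-fitted Standardized Precipitation Index (SPI), and the rainfall series is synthetic:

```python
import numpy as np

# A simplified standardised drought index: z-score of 3-month accumulated
# rainfall. The real SPI fits a gamma distribution to the accumulations;
# the plain z-score and the synthetic rainfall series are rough stand-ins.
rng = np.random.default_rng(3)
monthly_rain = rng.gamma(shape=2.0, scale=40.0, size=120)  # ten synthetic years

window = 3
accumulated = np.convolve(monthly_rain, np.ones(window), mode="valid")
index = (accumulated - accumulated.mean()) / accumulated.std()

# Values below -1 would be flagged as moderate-or-worse drought.
print(f"{(index < -1.0).mean():.0%} of 3-month windows flagged")
```

By construction the index has zero mean and unit variance over the record, so thresholds such as -1 or -2 carry the same meaning across stations, which is what makes standardised indices convenient inputs to the ML predictors the review lists.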
Abstract: An emerging real-time ground compaction and quality-control technology, known as intelligent compaction (IC), has been applied to efficiently optimise full-area compaction. Although IC technology can provide real-time assessment of the uniformity of the compacted area, accurate determination of the soil stiffness required for quality control and design remains challenging. In this paper, a novel and advanced numerical model simulating the interaction of a vibratory drum and the soil beneath it is developed. The model captures the nonlinear behaviour of the underlying soil subjected to dynamic loading by representing the variation of damping with cyclic shear strain and the degradation of the soil modulus. The drum-soil interaction is simulated via the finite element method to build a comprehensive dataset of the dynamic responses of the drum and the soil: more than a thousand three-dimensional (3D) numerical models covering various soil characteristics, roller weights, and vibration amplitudes and frequencies were run. The dataset is then used to train an inverse solver, based on an innovative machine learning approach (the extended support vector regression), that estimates the stiffness of the compacted soil from drum acceleration records. Furthermore, the impacts of the vibration amplitude and frequency on the degree of compaction of the underlying soil are discussed. The proposed machine learning approach is promising for real-time extraction of actual soil stiffness during compaction, and the results can be employed by practising engineers to interpret roller drum acceleration data and estimate the level of compaction and ground stiffness during compaction.
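The inverse-solver idea, learning a map from simulated drum-acceleration features back to soil stiffness and then querying it with field measurements, can be sketched with a nearest-neighbour lookup standing in for the paper's extended support vector regression; the linear forward model and all values below are invented:

```python
import numpy as np

# Build a synthetic "simulation database" mapping soil stiffness to peak drum
# acceleration, then invert it by querying the nearest simulated responses.
# A k-nearest-neighbour lookup stands in for the paper's extended support
# vector regression; the linear forward model is invented.
rng = np.random.default_rng(4)
stiffness = rng.uniform(20.0, 120.0, size=200)             # MPa, synthetic
peak_acc = 0.05 * stiffness + rng.normal(scale=0.2, size=200)

def predict_stiffness(measured_acc, k=5):
    """Average stiffness of the k simulations closest in peak acceleration."""
    nearest = np.argsort(np.abs(peak_acc - measured_acc))[:k]
    return float(stiffness[nearest].mean())

# Query at the noise-free response of an 80 MPa soil.
print(abs(predict_stiffness(0.05 * 80.0) - 80.0) < 15.0)  # True
```

The design choice mirrors the paper's workflow: the expensive 3D finite element runs populate the database offline, so the online inversion reduces to a cheap lookup/regression that can keep pace with real-time drum acceleration records.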
Abstract: In the design process of berm breakwaters, front-slope recession plays an inevitable role, and this parameter has been studied in a large number of model tests. This research draws its data from the experimental results of Moghim and Shekari: two different 2D model test series in two wave flumes, in which the berm recession response to different sea-state and structural parameters was studied. Irregular waves with a JONSWAP spectrum were used in both test series. A total of 412 test results were used, covering the effects of sea-state conditions such as wave height, wave period, storm duration, and water depth at the toe of the structure, and of structural parameters such as berm elevation above still water level, berm width, and stone diameter on berm recession. In this paper, a new set of equations for berm recession is derived using the M5' model tree as a machine learning approach. A comparison between the estimates of the new formula and formulae recently proposed by other researchers demonstrates the advantage of the M5' approach.
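The M5' model-tree idea, splitting the predictor space and fitting a linear model in each leaf, can be sketched on synthetic data; the single hand-picked split and the wave-height/recession relationship below are illustrative, not the recession equations derived in the paper:

```python
import numpy as np

# Synthetic wave-height -> recession data with a known regime change at
# Hs = 1.5 m; both the split point and the leaf models are illustrative.
rng = np.random.default_rng(5)
Hs = rng.uniform(0.5, 3.0, size=100)                    # wave height (m)
rec = np.where(Hs < 1.5, 2.0 * Hs, 8.0 * Hs - 9.0)      # recession (m)
rec = rec + rng.normal(scale=0.05, size=100)            # measurement noise

def fit_leaf(x, y):
    """Least-squares line (slope, intercept) for one leaf of the tree."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

split = 1.5  # in M5' the split is chosen to minimise target spread
left = fit_leaf(Hs[Hs < split], rec[Hs < split])
right = fit_leaf(Hs[Hs >= split], rec[Hs >= split])
print(round(float(left[0])), round(float(right[0])))  # 2 8
```

Each leaf's linear model is a readable equation in the original variables, which is why a model tree can be published directly as a set of design formulae rather than as a black-box predictor.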
Funding: This work is funded by the National Natural Science Foundation of China (Grant Nos. 42377164 and 52079062) and the National Science Fund for Distinguished Young Scholars of China (Grant No. 52222905).
Abstract: In existing landslide susceptibility prediction (LSP) models, the influence of random errors in landslide conditioning factors on LSP is not considered; instead, the original conditioning factors are taken directly as model inputs, which introduces uncertainty into LSP results. This study aims to reveal how different proportions of random error in conditioning factors affect LSP uncertainty, and to explore a method that can effectively reduce such random errors. The original conditioning factors are first used to construct original factors-based LSP models, and random errors of 5%, 10%, 15% and 20% are then added to these factors to construct error-based LSP models. Secondly, low-pass filter-based LSP models are constructed by removing the random errors with a low-pass filter. Thirdly, Ruijin County, China, with 370 landslides and 16 conditioning factors, is used as the study case. Three typical machine learning models, i.e. multilayer perceptron (MLP), support vector machine (SVM) and random forest (RF), are selected as LSP models. Finally, the LSP uncertainties are discussed, and the results show that: (1) the low-pass filter can effectively reduce the random errors in conditioning factors and thereby decrease LSP uncertainty; (2) as the proportion of random error increases from 5% to 20%, LSP uncertainty increases continuously; (3) the original factors-based models are feasible for LSP in the absence of more accurate conditioning factors; (4) the influences of the two sources of uncertainty, the choice of machine learning model and the proportion of random error, on LSP modeling are large and roughly equal; and (5) Shapley values effectively explain the internal mechanism by which the machine learning models predict landslide susceptibility. In conclusion, a greater proportion of random error in conditioning factors results in higher LSP uncertainty, and a low-pass filter can effectively reduce these random errors.
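The error-reduction step can be sketched by corrupting a smooth conditioning factor with 10% proportional random error and suppressing it with a simple moving-average low-pass filter; the paper's exact filter design is not specified here, and the elevation-like profile is synthetic:

```python
import numpy as np

# Smooth "true" conditioning factor (an elevation-like profile), corrupted
# with 10% proportional random error, then low-pass filtered. The moving
# average is a stand-in for whatever low-pass design the study used.
rng = np.random.default_rng(6)
x = np.linspace(0.0, 4.0 * np.pi, 400)
elevation = 100.0 + 10.0 * np.sin(x)
noisy = elevation + 0.10 * elevation * rng.standard_normal(400)

kernel = np.ones(15) / 15.0                      # crude low-pass filter
filtered = np.convolve(noisy, kernel, mode="same")

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

inner = slice(20, -20)  # ignore edge effects of the convolution
print(rmse(filtered[inner], elevation[inner])
      < rmse(noisy[inner], elevation[inner]))  # True
```

High-frequency random error averages out inside the window while the slowly varying factor passes through almost unchanged, which is exactly why a low-pass filter reduces the input uncertainty without discarding the factor's spatial signal.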
Funding: Supported by the National Institute of General Medical Sciences of the National Institutes of Health, No. R01GM100387.
Abstract: AIM: To develop a framework to incorporate background domain knowledge into classification rule learning for knowledge discovery in biomedicine. METHODS: Bayesian rule learning (BRL) is a rule-based classifier that uses a greedy best-first search over a space of Bayesian belief networks (BNs) to find the optimal BN explaining the input dataset, and then infers classification rules from this BN. BRL uses a Bayesian score to evaluate the quality of BNs. In this paper, we extended the Bayesian score to include informative structure priors, which encode our prior domain knowledge about the dataset. We call this extension of BRL BRL_p. The structure prior has a λ hyperparameter that allows the user to tune the degree to which prior knowledge is incorporated in the model learning process. We studied the effect of λ on model learning using a simulated dataset and a real-world lung cancer prognostic biomarker dataset, measuring the degree of incorporation of our specified prior knowledge, and we monitored its effect on predictive performance. Finally, we compared BRL_p to other state-of-the-art classifiers commonly used in biomedicine. RESULTS: We evaluated the degree of incorporation of prior knowledge into BRL_p on simulated data by measuring the graph edit distance between the true data-generating model and the model learned by BRL_p. We specified the true model using informative structure priors and observed that increasing the value of λ increased the influence of the specified structure priors on model learning; a large value of λ caused BRL_p to return the true model. This also led to a gain in predictive performance measured by the area under the receiver operating characteristic curve (AUC). We then obtained a publicly available real-world lung cancer prognostic biomarker dataset and specified a known biomarker from the literature [the epidermal growth factor receptor (EGFR) gene]. We again observed that larger values of λ led to an increased incorporation of EGFR into the final BRL_p model, and this relevant background knowledge also led to a gain in AUC. CONCLUSION: BRL_p enables tunable structure priors to be incorporated during Bayesian classification rule learning, integrating data and knowledge, as demonstrated using lung cancer biomarker data.
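The tunable structure prior can be sketched as a λ-weighted term added to the data score; the networks, score values and edge sets below are toy illustrations, not BRL_p's actual Bayesian score:

```python
# Toy sketch of a lambda-weighted structure prior: the total score of a
# candidate network is its data score plus a prior term that rewards edges
# the expert expects (e.g. EGFR -> outcome). All numbers are invented.
expected_edges = {("EGFR", "outcome")}

def structure_prior(edges, lam):
    """Log-prior: +lambda per expected edge present, -lambda otherwise."""
    return sum(lam if e in expected_edges else -lam for e in edges)

def total_score(data_loglik, edges, lam):
    return data_loglik + structure_prior(edges, lam)

with_egfr = [("EGFR", "outcome")]
without = [("other_gene", "outcome")]

# The data slightly prefers the network without EGFR (-9.5 vs -10.0).
# With a small lambda the data term wins; with a large lambda the prior
# forces the expected EGFR edge into the selected model.
small = total_score(-10.0, with_egfr, 0.1) > total_score(-9.5, without, 0.1)
large = total_score(-10.0, with_egfr, 5.0) > total_score(-9.5, without, 5.0)
print(small, large)  # False True
```

This reproduces the abstract's qualitative behaviour: λ interpolates between purely data-driven structure selection (λ → 0) and selection dominated by the specified prior knowledge (large λ).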