Funding: This research was funded by the National Natural Science Foundation of China (Grant No. 72074060).
Abstract: Regression is a widely used econometric tool in research. In observational studies, regression-based statistical control methods rely on a number of assumptions and attempt to analyze the causation between treatment and outcome by adding control variables. However, this approach may not produce reliable estimates of causal effects. In addition to the shortcomings of the method itself, this lack of confidence is mainly related to ambiguous formulations in econometrics, such as the definition of selection bias, the selection of core control variables, and the method of testing for robustness. Within the framework of causal models, we clarify the assumption of causal inference using regression-based statistical controls, as described in econometrics, and discuss how to select core control variables to satisfy this assumption and conduct robustness tests for regression estimates.
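The role of a control variable described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' procedure: the data-generating process, the variable names (`z`, `t`, `y`), the coefficient values, and the helper functions are all hypothetical. A confounder `z` drives both treatment and outcome, so the regression without controls is biased, while adding `z` as a control recovers the assumed effect.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance (population form) of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def slope_simple(t, y):
    """OLS slope of y on t alone (no control variables)."""
    return cov(t, y) / cov(t, t)

def slope_with_control(t, z, y):
    """OLS coefficient on t when z is added as a control.

    Solves the 2x2 normal equations for centered variables by Cramer's rule.
    """
    stt, szz, stz = cov(t, t), cov(z, z), cov(t, z)
    sty, szy = cov(t, y), cov(z, y)
    det = stt * szz - stz * stz
    return (sty * szz - szy * stz) / det

# Hypothetical data-generating process (an assumption for illustration):
# z confounds both the treatment t and the outcome y.
random.seed(0)
n = 20000
true_effect = 2.0
z = [random.gauss(0, 1) for _ in range(n)]
t = [zi + random.gauss(0, 1) for zi in z]
y = [true_effect * ti + 3.0 * zi + random.gauss(0, 1) for ti, zi in zip(t, z)]

naive = slope_simple(t, y)               # expectation: 2 + 3*cov(t,z)/var(t) = 3.5
adjusted = slope_with_control(t, z, y)   # expectation: 2.0
print("naive slope:", round(naive, 3))
print("adjusted slope:", round(adjusted, 3))
```

The adjustment works here only because `z` is, by construction, the sole confounder and is measured without error; choosing controls that satisfy this kind of assumption in real data is the question the abstract raises.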
Abstract: Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop explainable artificial intelligence (XAI) algorithms is one of the key steps toward artificial intelligence 2.0. With the aim of bringing knowledge of causal inference to scholars of machine learning and artificial intelligence, we invited researchers working on causal inference to write this survey from different aspects of causal inference. This survey includes the following sections: “Estimating average treatment effect: A brief review and beyond” from Dr. Kun Kuang, “Attribution problems in counterfactual inference” from Prof. Lian Li, “The Yule–Simpson paradox and the surrogate paradox” from Prof. Zhi Geng, “Causal potential theory” from Prof. Lei Xu, “Discovering causal information from observational data” from Prof. Kun Zhang, “Formal argumentation in causal reasoning and explanation” from Profs. Beishui Liao and Huaxin Huang, “Causal inference with complex experiments” from Prof. Peng Ding, “Instrumental variables and negative controls for observational studies” from Prof. Wang Miao, and “Causal inference with interference” from Dr. Zhichao Jiang.
Funding: Supported by the Doctoral Research Fund Project of Henan Provincial Hospital of Traditional Chinese Medicine, No. 2022BSJJ10.
Abstract: BACKGROUND: Although obstructive sleep apnea hypoventilation syndrome (OSAHS) is one of the most prevalent sleep disorders, little is known about its immunologic foundation. The immunological underpinnings of certain major psychiatric diseases have been uncovered in recent years thanks to the extensive use of genome-wide association studies (GWAS) and genotyping techniques using high-density genetic markers (e.g., SNPs or CNVs), but this approach has not yet been applied to OSAHS. Using a Mendelian randomization analysis, we analyzed the causal link between immune cells and the illness in order to understand the immunological bases of OSAHS. AIM: To investigate the association of immune cells with OSAHS via genetic methods, guiding future clinical research. METHODS: A comprehensive two-sample Mendelian randomization study was conducted to investigate the causal relationship between immune cell characteristics and OSAHS. Summary statistics for each immune cell feature were obtained from the GWAS catalog. Information on 731 immune cell properties, covering morphologic parameters, median fluorescence intensity, absolute cell counts, and relative cell counts, was compiled using publicly available genetic databases. The robustness, heterogeneity, and horizontal pleiotropy of the results were assessed using extensive sensitivity analyses. RESULTS: Following false discovery rate (FDR) correction, no statistically significant effect of OSAHS on immunophenotypes was observed. However, two lymphocyte subsets were found to have a significant association with the risk of OSAHS: Basophil %CD33dim HLA DR- CD66b- (OR = 1.03, 95% CI = 1.01-1.03, P < 0.001); CD38 on IgD+ CD24- B cell (OR = 1.04, 95% CI = 1.02-1.04, P = 0.019). CONCLUSION: This study shows a strong link between immune cells and OSAHS through a genetic approach, thus offering direction for potential future medical research.
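For readers unfamiliar with two-sample Mendelian randomization, the following sketch shows how per-variant summary statistics are commonly combined into one causal estimate with the fixed-effect inverse-variance-weighted (IVW) estimator. The abstract does not state which estimator the study used, so IVW is an assumption here, and the function name and all numbers are hypothetical illustration values, not data from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate and its standard error.

    beta_exposure: per-variant effects on the exposure (e.g. an immune cell trait)
    beta_outcome:  per-variant effects on the outcome (log odds scale)
    se_outcome:    standard errors of the outcome effects
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

# Hypothetical summary statistics for three genetic instruments.
bx = [0.10, 0.20, 0.15]
by = [0.02, 0.04, 0.03]
se = [0.01, 0.01, 0.01]

est, est_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(est)
ci_low = math.exp(est - 1.96 * est_se)
ci_high = math.exp(est + 1.96 * est_se)
print("log-odds estimate:", round(est, 4))
print("OR and 95% CI:", round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3))
```

Each variant alone gives a Wald ratio `by / bx`; IVW is a weighted average of those ratios, and it is valid only if the instruments satisfy the usual assumptions, including no horizontal pleiotropy, which is why the abstract reports sensitivity analyses.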
Abstract: Statistical approaches for evaluating causal effects and for discovering causal networks are discussed in this paper. A causal relation between two variables is different from an association or correlation between them. An association measure between two variables may change dramatically from positive to negative when a third variable is omitted, which is called the Yule-Simpson paradox. We shall discuss how to evaluate the causal effect of a treatment or exposure on an outcome so as to avoid the Yule-Simpson paradox. Surrogates and intermediate variables are often used to reduce measurement costs or duration when measurement of endpoint variables is expensive, inconvenient, infeasible, or unobservable in practice. Many criteria for surrogates have been proposed. However, it is possible that, for a surrogate satisfying these criteria, a treatment has a positive effect on the surrogate, which in turn has a positive effect on the outcome, but the treatment has a negative effect on the outcome; this is called the surrogate paradox. We shall discuss criteria for surrogates that avoid the surrogate paradox. Causal networks, which describe the causal relationships among a large number of variables, have been applied in many research fields. It is important to discover the structures of causal networks from observed data. We propose a recursive approach for discovering a causal network in which structural learning of a large network is decomposed recursively into learning of small networks. To further discover causal relationships, we present an active learning approach based on external interventions on some variables. When we focus on the causes of an outcome of interest, instead of discovering a whole network, we propose a local learning approach to discover the causes that affect the outcome.
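The sign reversal described above can be checked with a few lines of arithmetic. The counts below are illustrative only, and the standardization step assumes that the third variable (here called `stratum`) is a confounder that should be adjusted for; neither the numbers nor the function names come from the paper.

```python
# (successes, total) for each treatment arm within each stratum of a third variable.
# Illustrative counts only.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}
ARMS = ("A", "B")

def stratum_rate(stratum, arm):
    s, n = counts[stratum][arm]
    return s / n

def pooled_rate(arm):
    """Success rate after omitting (collapsing over) the third variable."""
    s = sum(counts[g][arm][0] for g in counts)
    n = sum(counts[g][arm][1] for g in counts)
    return s / n

def standardized_rate(arm):
    """Stratum-specific rates averaged over the overall stratum distribution."""
    total = sum(counts[g][a][1] for g in counts for a in ARMS)
    rate = 0.0
    for g in counts:
        weight = sum(counts[g][a][1] for a in ARMS) / total
        rate += weight * stratum_rate(g, arm)
    return rate

for g in counts:
    print(g, [round(stratum_rate(g, a), 3) for a in ARMS])
print("pooled", [round(pooled_rate(a), 3) for a in ARMS])
print("standardized", [round(standardized_rate(a), 3) for a in ARMS])
```

Arm A does better within every stratum yet worse in the pooled table, because A was given mostly to the harder stratum. Standardizing over the stratum distribution restores the within-stratum ordering, but whether that is the right causal answer depends on the third variable actually being a confounder rather than, say, a mediator.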
Funding: Supported by the National Natural Science Foundation of China (Nos. 61903345 and 61973287).
Abstract: Modern industrial systems are usually large in scale, consisting of massive numbers of components and variables that form a complex system topology. Owing to the interconnections among devices, a fault may occur and propagate, exerting widespread influence and triggering a variety of alarms. Identifying the root causes of alarms supports decisions on corrective alarm responses. Existing data-driven methods for alarm root cause analysis detect causal relations among alarms mainly on the basis of historical alarm event data. To improve accuracy, this paper proposes a causal fusion inference method for industrial alarm root cause analysis based on process topology and alarm events. A Granger causality inference method that considers process topology is used to find the causal relations among alarms. The topological nodes are used as the inputs of the model, and the causal adjacency matrix between alarm variables is obtained by calculating the likelihood of the topological Hawkes process. The root cause is then obtained from the directed acyclic graph (DAG) among alarm variables. The effectiveness of the proposed method is verified by simulations based on both a numerical example and the Tennessee Eastman process (TEP) model.
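The final step, reading a root cause off the DAG, can be sketched as follows. This does not reproduce the paper's topological Hawkes process or its likelihood computation; it assumes the causal adjacency matrix has already been estimated and simply reports alarms that have effects but no causes. The alarm names, the matrix, and the function `root_causes` are hypothetical.

```python
def root_causes(adj, names):
    """Return alarms with no incoming edge and at least one outgoing edge.

    adj[i][j] == 1 means alarm i is inferred to cause alarm j.
    """
    n = len(adj)
    roots = []
    for j in range(n):
        has_parent = any(adj[i][j] for i in range(n))
        has_child = any(adj[j][k] for k in range(n))
        if not has_parent and has_child:
            roots.append(names[j])
    return roots

# Hypothetical causal adjacency matrix over four alarm variables:
# A1 -> A2, A1 -> A3, A2 -> A4, A3 -> A4.
names = ["A1", "A2", "A3", "A4"]
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

print(root_causes(adj, names))
```

Requiring at least one outgoing edge keeps isolated alarms out of the result; whether that is appropriate depends on how the adjacency matrix was thresholded.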
Funding: National Natural Science Foundation of China (82272180); Open Foundation of Key Laboratory of Digital Technology in Medical Diagnostics of Zhejiang Province (SZZD202206); Sichuan Medical Association Scientific Research Project (S21019); Key Research and Development Project of Zhejiang Province (2021C03071); Zhejiang Medical and Health Science and Technology Project (2017ZD001).
Abstract: Causal inference is prevalent in the field of laparoscopic surgery. Once the causality between an intervention and an outcome is established, the intervention can be applied to a target population to improve clinical outcomes. In many clinical scenarios, interventions are applied longitudinally in response to patients' conditions. Such longitudinal data comprise static variables, such as age, gender, and comorbidities, and dynamic variables, such as the treatment regime, laboratory variables, and vital signs. Some dynamic variables can act as both a confounder and a mediator for the effect of an intervention on the outcome; in such cases, simple adjustment with a conventional regression model will bias the effect sizes. To address this, numerous statistical methods are being developed for causal inference; these include, but are not limited to, the marginal structural Cox regression model, the dynamic treatment regime, and the Cox regression model with time-varying covariates. This technical note provides a gentle introduction to such models and illustrates their use with an example in the field of laparoscopic surgery.
Abstract: Propensity score (PS) adjustment can control confounding effects and reduce bias when estimating treatment effects in non-randomized trials or observational studies. PS methods are increasingly used to estimate causal effects, including when the sample size is small compared to the number of confounders. With numerous confounders, quasi-complete separation can easily occur in the logistic regression used for estimating the PS, but this has not been addressed. We focused on a Bayesian PS method to address the limitations of quasi-complete separation faced by small trials. Bayesian methods are useful because they estimate the PS and causal effects simultaneously while considering the uncertainty of the PS by modelling it as a latent variable. In this study, we conducted simulations to evaluate the performance of Bayesian simultaneous PS estimation by considering the specification of prior distributions for model comparison. We propose a method to improve predictive performance with discrete outcomes in small trials. We found that the specification of prior distributions assigned to logistic regression coefficients was more important in the second step than in the first step, even when there was quasi-complete separation in the first step. Assigning a Cauchy (0, 2.5) prior to the coefficients improved the predictive performance for estimating causal effects and improved the balancing properties of the confounders.
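Why a Cauchy (0, 2.5) prior helps under separation can be seen in a one-coefficient toy problem. This sketch is much simpler than the Bayesian simultaneous PS estimation the abstract describes: it uses a single covariate with no intercept, a maximum a posteriori (MAP) point estimate instead of full posterior sampling, and a hypothetical four-point data set. With perfectly separated data the likelihood keeps increasing as the coefficient grows, so the maximum likelihood estimate does not exist, whereas the log-posterior has a finite maximum.

```python
import math

def log_sigmoid(u):
    """Numerically stable log(1 / (1 + exp(-u)))."""
    if u >= 0:
        return -math.log1p(math.exp(-u))
    return u - math.log1p(math.exp(u))

# Hypothetical, completely separated data: treatment is 1 exactly when x > 0.
x = [-2.0, -1.0, 1.0, 2.0]
treated = [0, 0, 1, 1]

def log_likelihood(b):
    """Logistic regression log-likelihood with one coefficient and no intercept."""
    return sum(log_sigmoid(b * xi) if ti == 1 else log_sigmoid(-b * xi)
               for xi, ti in zip(x, treated))

def log_cauchy_prior(b, scale=2.5):
    """Log density of a Cauchy(0, scale) prior on the coefficient."""
    return -math.log(math.pi * scale * (1.0 + (b / scale) ** 2))

def log_posterior(b):
    return log_likelihood(b) + log_cauchy_prior(b)

def argmax_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0

# The likelihood alone never peaks: larger coefficients always fit better.
print([log_likelihood(b) for b in (5.0, 10.0, 20.0)])

# With the prior, the maximum is finite.
b_map = argmax_unimodal(log_posterior, -20.0, 20.0)
print("MAP coefficient:", round(b_map, 3))
```

The ternary search relies on the log-posterior being unimodal, which holds for this toy data set but is not guaranteed in general; a real analysis would use a proper optimizer or a sampler.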