An Algorithm for Learning Causal Structure of Latent Variables with Arbitrary Distribution (一种任意分布下的隐变量因果结构学习算法)

Cited by: 1
Abstract: Causal discovery aims to uncover causal relationships among variables from observational data, and practical applications often require learning the causal structure among latent variables from such data. Existing methods for latent-variable structure learning under linear causal models mainly exploit covariance information among observed variables (e.g., Tetrad constraints) or introduce non-Gaussian assumptions (e.g., Triad constraints). However, most of them are limited to cases with well-defined distributions, an assumption that is often not satisfied in practical applications. This paper gives an identifiability proof for latent-variable structure under arbitrary distributions and shows that, in the absence of confounders, the minimal condition for identifying the causal direction between two latent variables is that the noise of only one of them follows a non-Gaussian distribution. On this basis, an algorithm is proposed for learning the causal structure of latent variables under arbitrary distributions in linear latent variable models. The algorithm first learns the skeleton over the latent variables using a Tetrad-constraint-based method. It then learns the causal directions by enumerating the equivalence classes of the skeleton and testing the Triad constraints in each equivalence class. At the same time, it relaxes the non-Gaussianity requirement to the smallest possible subset of variables, which extends the applicability of linear latent variable models. Experimental results show that, compared with the MIMBuild and Triad-constraint methods, the proposed algorithm achieves the best F1 score, learns more of the latent causal structure under arbitrary distributions, and is more robust.
Authors: HAO Zhifeng (郝志峰); CHEN Zhengming (陈正鸣); XIE Feng (谢峰); CHEN Wei (陈薇); CAI Ruichu (蔡瑞初) (School of Computer, Guangdong University of Technology, Guangzhou 510006, China; College of Science, Shantou University, Shantou, Guangdong 515063, China; School of Mathematical Sciences, Peking University, Beijing 100871, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2022, No. 9, pp. 121-129 (9 pages)
Funding: National Natural Science Foundation of China (61876043, 61976052); China Postdoctoral Science Foundation (2020M680225).
Keywords: causal discovery; causal structure; arbitrary distribution; latent variable; functional causal model
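
The Tetrad constraints named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm. It only checks the standard covariance identity behind such constraints: when four observed variables are noisy linear measurements of a single latent variable, the products σ12·σ34, σ13·σ24, and σ14·σ23 coincide, whereas with two causally linked latent variables only some of these products coincide. All function names, loadings, and noise scales are assumptions chosen for illustration.

```python
import random


def cov(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def tetrad_differences(x1, x2, x3, x4):
    """Return the three tetrad differences among four observed variables."""
    s12, s34 = cov(x1, x2), cov(x3, x4)
    s13, s24 = cov(x1, x3), cov(x2, x4)
    s14, s23 = cov(x1, x4), cov(x2, x3)
    return (
        s12 * s34 - s13 * s24,
        s13 * s24 - s14 * s23,
        s12 * s34 - s14 * s23,
    )


def simulate_one_factor(n, loadings, rng):
    """Four indicators of one latent variable (hypothetical toy model)."""
    columns = [[] for _ in loadings]
    for _ in range(n):
        # The latent is uniform, i.e. non-Gaussian; tetrad constraints
        # rely on covariances only, so the distribution does not matter.
        latent = rng.uniform(-1.0, 1.0)
        for col, loading in zip(columns, loadings):
            col.append(loading * latent + rng.gauss(0.0, 0.5))
    return columns


def simulate_two_factor(n, rng):
    """X1, X2 measure L1; X3, X4 measure L2, where L1 causes L2."""
    columns = [[], [], [], []]
    for _ in range(n):
        l1 = rng.uniform(-1.0, 1.0)
        l2 = 0.8 * l1 + rng.uniform(-1.0, 1.0)
        for col, latent in zip(columns, (l1, l1, l2, l2)):
            col.append(latent + rng.gauss(0.0, 0.5))
    return columns


if __name__ == "__main__":
    rng = random.Random(0)
    one = simulate_one_factor(20000, (1.0, 0.8, 1.2, 0.9), rng)
    two = simulate_two_factor(20000, rng)
    print("one latent :", tetrad_differences(*one))
    print("two latents:", tetrad_differences(*two))
```

In the one-latent case all three differences should be close to zero up to sampling error. In the two-latent case only the second difference (σ13·σ24 − σ14·σ23) should vanish, which is the kind of pattern a Tetrad-based method can use to group indicators and recover a skeleton. Orienting the edges needs more than covariances, which is where the non-Gaussian Triad constraints described in the abstract come in.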