Abstract
Causal inference is attracting growing attention in machine learning. Current causal discovery methods mainly infer the causal direction between variables from purely observational data under a single modelling assumption. Data observed in the real world, however, are often generated under several different assumptions rather than one, so traditional causal inference methods achieve low identification rates and poor stability on them. To address this problem, a neural-network-based method for causal inference on mixed data is proposed. Under the assumption of the additive noise model mixture model (ANM-MM), the method uses gradient descent to optimize an improved loss function and obtain the abstract causal distribution parameters of the mixed data. These distribution parameters are then treated as a hidden variable between the cause variable and the effect variable, and the causal direction of the variable pair is determined by comparing the Hilbert-Schmidt independence between the cause variable and the distribution parameters. The feasibility of the method is proved theoretically, and experiments on synthetic and real data show that the proposed algorithm achieves better accuracy and stability than the traditional methods IGCI, ANM, PNL, LiNGAM and SLOPE.
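The direction test described in the abstract can be illustrated with a minimal sketch. It assumes the per-sample distribution parameters have already been estimated for both candidate directions; the neural-network estimation step is not reproduced here. The names `gaussian_gram`, `hsic`, `infer_direction`, `theta_xy` and `theta_yx` are hypothetical, and the Gaussian kernel with a median-based bandwidth is an assumed choice, not necessarily the one used by the authors. The score is the standard biased empirical estimate of the Hilbert-Schmidt independence criterion (HSIC), trace(KHLH)/n^2.

```python
import numpy as np


def gaussian_gram(v, sigma=None):
    """Gaussian (RBF) Gram matrix of a 1-D sample.

    If sigma is None, the bandwidth is derived from the median of the
    non-zero squared pairwise distances (an assumed heuristic).
    """
    v = np.asarray(v, dtype=float).reshape(-1, 1)
    sq = (v - v.T) ** 2  # squared pairwise distances, shape (n, n)
    if sigma is None:
        positive = sq[sq > 0]
        sigma = np.sqrt(0.5 * np.median(positive)) if positive.size else 1.0
    return np.exp(-sq / (2.0 * sigma ** 2))


def hsic(a, b):
    """Biased empirical HSIC between two 1-D samples of equal length."""
    n = len(a)
    K = gaussian_gram(a)
    L = gaussian_gram(b)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2


def infer_direction(x, y, theta_xy, theta_yx):
    """Pick the direction whose hypothesised cause looks more independent
    of its estimated parameters (smaller HSIC score).

    theta_xy: parameters estimated under the hypothesis X -> Y (assumed given)
    theta_yx: parameters estimated under the hypothesis Y -> X (assumed given)
    """
    score_xy = hsic(x, theta_xy)
    score_yx = hsic(y, theta_yx)
    direction = "X->Y" if score_xy < score_yx else "Y->X"
    return direction, score_xy, score_yx
```

A smaller HSIC score indicates weaker estimated dependence, so the sketch selects the direction in which the candidate cause appears closer to independent of the parameters. In practice a permutation test or a decision threshold might be preferable to a bare comparison of the two scores.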
Authors
耿家兴
万亚平
李洪飞
GENG Jia-xing; WAN Ya-ping; LI Hong-fei (School of Computer Science, University of South China, Hengyang 421001, China; CNNC Key Laboratory on High Trusted Computing, Hengyang 421001, China)
Source
《计算机技术与发展》
2020, No. 5, pp. 26-31 (6 pages)
Computer Technology and Development
Funding
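National Natural Science Foundation of China (11805093)
Innovation Special Zone Project of the Science and Technology Commission of the Central Military Commission (17-163-15-XJ-002-002-04)
Hunan Provincial Education Key Project (17A185)
Natural Science Foundation of Hunan Province (2019JJ0486)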