Recursive Causal Inference Algorithm Based on Partial Correlation Test (cited by: 2)
Abstract Causal inference is an important way to uncover relationships among variables. In high-dimensional settings, however, the conditional independence (CI) tests performed by causal inference algorithms involve many redundant tests and run inefficiently, which limits the use of causal inference on high-dimensional datasets. This study proposes a recursive causal inference algorithm based on partial correlation tests. A divide-and-conquer strategy recursively splits the variable set along causal lines, yielding low-dimensional sub-datasets that are easier to process. Local causal inference is then performed on each sub-dataset, which reduces the computation required by each inference step and speeds up the algorithm. A merging strategy that compares significance values then integrates all sub-results into a complete causal structure, preserving the accuracy of the overall result. Throughout the divide-and-conquer process, an efficient partial correlation test is used in place of costly kernel density estimation, further improving efficiency. Experiments on 10 classical datasets show that, with accuracy on par with the classical inference algorithm CAPA, the proposed algorithm runs 2 to 10 times faster, and the speedup grows with the sample size. These results indicate that the recursive algorithm can handle high-dimensional datasets effectively, improving computational efficiency while maintaining accuracy.
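The abstract does not state which test statistic is used, so the following is only a minimal sketch of a standard partial correlation CI test (Fisher z-transform, assuming linear relationships and Gaussian noise), not the authors' implementation. The function name `partial_corr_test`, its parameters, and the synthetic chain X -> Z -> Y are hypothetical choices made for illustration.

```python
import math
import numpy as np

def partial_corr_test(data, i, j, cond=()):
    """Test H0: column i is independent of column j given the columns in cond.

    Assumes linear-Gaussian data. Returns (partial correlation, two-sided p-value).
    """
    idx = [i, j, *cond]
    # Correlation matrix of the involved variables (columns are variables).
    corr = np.corrcoef(data[:, idx], rowvar=False)
    # The partial correlation of the first two variables given the rest
    # is read off the inverse correlation (precision) matrix.
    prec = np.linalg.pinv(corr)
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep the log below finite
    n = data.shape[0]
    # Fisher z-transform; the scaled statistic is approximately N(0, 1) under H0.
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    # Two-sided normal tail probability: 2 * (1 - Phi(stat)).
    p_value = math.erfc(stat / math.sqrt(2))
    return r, p_value

# Synthetic chain X -> Z -> Y (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)
data = np.column_stack([x, y, z])

r_marg, p_marg = partial_corr_test(data, 0, 1)            # X vs Y, no conditioning
r_cond, p_cond = partial_corr_test(data, 0, 1, cond=(2,))  # X vs Y given Z
```

In this chain, X and Y are correlated marginally, but the partial correlation given Z should be close to zero. The test needs only a matrix inversion over the variables involved, which is why a test of this kind is much cheaper than approaches that rely on kernel density estimation.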
Authors CHEN Mingjie; ZHANG Hao; PENG Yuzhong; XIE Feng; PANG Yue (School of Computer Science and Technology, Dongguan University of Technology, Dongguan, Guangdong 523808, China; School of Computer, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525099, China; School of Computer Science, Fudan University, Shanghai 200433, China; School of Computer and Information Engineering, Nanning Normal University, Nanning 530001, China; School of Mathematical Sciences, Peking University, Beijing 100871, China; China UnionPay Post-Doctoral Research Station, Shanghai 201201, China)
Source Computer Engineering (《计算机工程》), 2022, No. 10, pp. 123-129 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
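Funding National Natural Science Foundation of China (62006051); China Postdoctoral Science Foundation (2020M680225); Guangdong Province Young Innovative Talents Project for Universities (2020KQNCX049).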
Keywords causal inference; causal network; conditional independence (CI) test; partial correlation test; recursive algorithm