Abstract
With the rapid development of many industries and the growing emphasis on data applications, data are being generated in ever larger volumes and with increasingly complex structure, and data sets that contain outliers or are high-dimensional are increasingly common. Classical canonical correlation analysis (CCA) is very sensitive to outliers. CCA based on the minimum covariance determinant (MCD) estimator offers some resistance to outliers, but as the dimension grows the bias of the MCD estimate increases and its robustness declines, and the MCD estimator fails when the dimension exceeds the sample size. This paper therefore proposes a high-dimensional robust CCA based on the minimum regularized covariance determinant (MRCD) estimator. Numerical simulations and an empirical analysis show that the MRCD-based CCA resists outliers well and is more effective when the dimension exceeds the sample size.
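The sensitivity of classical CCA to outliers can be illustrated with a small plug-in sketch. It is not the paper's method and does not implement MCD or MRCD. It assumes only that canonical correlations are computed from a joint scatter matrix, so any scatter estimate can be substituted for the sample covariance. The function names, the simulated data, and the contamination scheme are all hypothetical choices made for illustration.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs / np.sqrt(vals)) @ vecs.T


def canonical_correlations(scatter, p):
    """Canonical correlations between the first p variables and the rest,
    computed from a joint scatter matrix (classical or robust)."""
    sxx = scatter[:p, :p]
    sxy = scatter[:p, p:]
    syy = scatter[p:, p:]
    # Singular values of Sxx^{-1/2} Sxy Syy^{-1/2} are the canonical correlations.
    m = inv_sqrt(sxx) @ sxy @ inv_sqrt(syy)
    return np.linalg.svd(m, compute_uv=False)


# Simulated data (an assumption for illustration): each y_j has
# population correlation 0.8 with x_j, so both canonical correlations are 0.8.
rng = np.random.default_rng(0)
n, p = 500, 2
x = rng.standard_normal((n, p))
y = 0.8 * x + 0.6 * rng.standard_normal((n, p))
z = np.hstack([x, y])
clean = canonical_correlations(np.cov(z, rowvar=False), p)

# Replace 5% of the rows with outliers: x is pushed far away, y is pure noise.
z_bad = z.copy()
z_bad[:25, :p] = 10.0
z_bad[:25, p:] = rng.standard_normal((25, p))
contaminated = canonical_correlations(np.cov(z_bad, rowvar=False), p)

print("classical CCA, clean data:       ", clean)
print("classical CCA, contaminated data:", contaminated)
```

With the sample covariance plugged in, the smaller canonical correlation drops sharply once the outliers are added. Passing a robust scatter estimate to `canonical_correlations` instead, for example the `covariance_` attribute of a fitted scikit-learn `MinCovDet` when the dimension is well below the sample size, follows the same plug-in pattern.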
Authors
姜云卢
邓罡
文诗涵
刘峻成
JIANG Yunlu; DENG Gang; WEN Shihan; LIU Juncheng (School of Economics, Jinan University, Guangzhou 510632)
Source
《系统科学与数学》
CSCD
PKU Core (北大核心)
2021, No. 10, pp. 2965-2976 (12 pages)
Journal of Systems Science and Mathematical Sciences
Funding
Supported by the Natural Science Foundation of Guangdong Province (2018A030313171, 2019A1515011830).
Keywords
Outliers
High-dimensional data
MCD estimation
MRCD estimation
Canonical correlation analysis