Abstract
Functional data, a relatively new type of high-dimensional data, emphasize the intrinsic nature of the data rather than their external structure; fitting discrete observations to functions by nonparametric methods captures more information. For a binary response, this paper builds a functional logistic regression model in a Bayesian framework, introduces appropriate prior information, and uses an MCMC algorithm to obtain the conditional posterior distributions of the parameters. The procedure is as follows. First, the regression coefficient function and the functional covariate are expanded in a data-driven principal component basis, and the expansions are truncated at a finite number of terms; the orthogonality of the principal component basis functions then gives a low-dimensional representation of the high-dimensional data. Second, the Polya-Gamma transformation is used to construct a Gibbs sampling algorithm whose parameter posteriors are easy to obtain, which yields the posterior distribution of the expansion coefficients of the regression function. Monte Carlo simulation results show that the method has good classification performance. Finally, the method is applied to the Tecator data set, where it classifies better than the other methods considered.
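The two steps described in the abstract can be written out as a short worked sketch. The notation is an assumption introduced here for illustration and does not come from the paper: K is the truncation level, \phi_k are the principal component basis functions, \xi_{ik} are the principal component scores, and the prior on the coefficient vector is taken to be Gaussian, \theta \sim \mathcal{N}(b_0, B_0), since the abstract only says "appropriate prior information". The conditional distributions shown are those of the standard Polya-Gamma data augmentation for logistic regression; the paper's own sampler may differ in its details.

```latex
% Assumed notation: z_i = (1, \xi_{i1}, \dots, \xi_{iK})^{\top},
% \theta = (\alpha, b_1, \dots, b_K)^{\top}, Z has rows z_i^{\top},
% \Omega = \mathrm{diag}(\omega_1, \dots, \omega_n), \kappa = y - \tfrac{1}{2}\mathbf{1}.
\begin{align*}
X_i(t) &\approx \mu(t) + \sum_{k=1}^{K} \xi_{ik}\,\phi_k(t), \qquad
\beta(t) \approx \sum_{k=1}^{K} b_k\,\phi_k(t), \qquad
\int \phi_j(t)\,\phi_k(t)\,dt = \delta_{jk}, \\
\operatorname{logit} P(y_i = 1 \mid X_i)
  &= \alpha + \int \bigl(X_i(t) - \mu(t)\bigr)\,\beta(t)\,dt
  \approx \alpha + \sum_{k=1}^{K} \xi_{ik}\, b_k = z_i^{\top}\theta, \\
\omega_i \mid \theta &\sim \mathrm{PG}\bigl(1,\; z_i^{\top}\theta\bigr),
  \qquad i = 1, \dots, n, \\
\theta \mid y, \omega &\sim \mathcal{N}\bigl(m_\omega, V_\omega\bigr), \\
V_\omega &= \bigl(Z^{\top}\Omega Z + B_0^{-1}\bigr)^{-1}, \qquad
m_\omega = V_\omega\bigl(Z^{\top}\kappa + B_0^{-1} b_0\bigr).
\end{align*}
```

Under these assumptions, a Gibbs sampler alternates between drawing each \omega_i from its Polya-Gamma conditional and drawing \theta from its Gaussian conditional; orthonormality of the basis is what reduces the integral to the finite sum over scores.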
Authors
DENG Nan (邓楠); LUO Youxi (罗幼喜)
School of Sciences, Hubei University of Technology, Wuhan 430068, China
Source
Journal of Hubei University of Technology (《湖北工业大学学报》)
2022, No. 1, pp. 115-120 (6 pages)
Funding
National Social Science Fund of China (17BJY210).