A selective overview of feature screening for ultrahigh-dimensional data (Cited by: 8)
1
Authors: LIU JingYuan, ZHONG Wei, LI RunZe. Science China Mathematics (SCIE, CSCD), 2015, No. 10, pp. 2033-2054 (22 pages)
High-dimensional data have frequently been collected in many scientific areas, including genome-wide association studies, biomedical imaging, tomography, tumor classification, and finance. Analysis of high-dimensional data poses many challenges for statisticians. Feature selection and variable selection are fundamental for high-dimensional data analysis. The sparsity principle, which assumes that only a small number of predictors contribute to the response, is frequently adopted and deemed useful in the analysis of high-dimensional data. Following this general principle, a large number of variable selection approaches via penalized least squares or likelihood have been developed in the recent literature to estimate a sparse model and select significant variables simultaneously. While the penalized variable selection methods have been successfully applied in many high-dimensional analyses, modern applications in areas such as genomics and proteomics push the dimensionality of data to an even larger scale, where the dimension of data may grow exponentially with the sample size. This has been called ultrahigh-dimensional data in the literature. This work aims to present a selective overview of feature screening procedures for ultrahigh-dimensional data. We focus on insights into how to construct marginal utilities for feature screening on specific models and on the motivation for model-free feature screening procedures.
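Keywords: high-dimensional data; feature screening; biomedical imaging; variable selection; data analysis; screening procedure; marginal utility; proteomics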
Robust estimation for partially linear models with large-dimensional covariates (Cited by: 5)
2
Authors: ZHU LiPing, LI RunZe, CUI HengJian. Science China Mathematics (SCIE), 2013, No. 10, pp. 2069-2088 (20 pages)
We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n^{1/2}), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes that the baseline function and the unimportant covariates are known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
Keywords: partially linear model; robust estimation; covariates; oracle; linear component; parameter estimation; sample size
Model-free conditional independence feature screening for ultrahigh dimensional data (Cited by: 5)
3
Authors: WANG LuHeng, LIU JingYuan, LI Yong, LI RunZe. Science China Mathematics (SCIE, CSCD), 2017, No. 3, pp. 551-568 (18 pages)
Feature screening plays an important role in ultrahigh dimensional data analysis. This paper is concerned with conditional feature screening when one is interested in detecting the association between the response and ultrahigh dimensional predictors (e.g., genetic markers) given a low-dimensional exposure variable (such as clinical variables or environmental variables). To this end, we first propose a new index to measure conditional independence, and further develop a conditional screening procedure based on the newly proposed index. We systematically study the theoretical properties of the proposed procedure and establish the sure screening and ranking consistency properties under some very mild conditions. The newly proposed screening procedure enjoys some appealing properties: (a) it is model-free in that its implementation does not require a specification of the model structure; (b) it is robust to heavy-tailed distributions or outliers in both directions of response and predictors; and (c) it can deal with both feature screening and conditional screening in a unified way. We study the finite sample performance of the proposed procedure by Monte Carlo simulations and further illustrate the proposed method through two real data examples.
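Keywords: feature screening; high-dimensional data; model-free; conditional independence; screening procedure; Monte Carlo simulation; data analysis; environmental variables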
Local Linear Regression for Data with AR Errors
4
Authors: Runze Li, Yan Li. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2009, No. 3, pp. 427-444 (18 pages)
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for the nonparametric regression by using the local linear regression method and profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. In our empirical studies, the newly proposed procedures dramatically improve on the accuracy of naive local linear regression with a working-independent error structure. We illustrate the proposed methodology by an analysis of a real data set.
Keywords: auto-regressive error; local linear regression; partially linear model; profile least squares; SCAD
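For readers unfamiliar with the baseline that the abstract above compares against, the following is a minimal sketch of naive local linear regression under a working-independence error structure. It is not the AR-adjusted profile least squares procedure proposed in the paper. The Gaussian kernel, the bandwidth `h`, the function name `local_linear`, and the toy data are assumptions chosen for illustration.

```python
import math


def local_linear(x, y, x0, h):
    """Local linear estimate of the regression function at x0.

    Fits a kernel-weighted least-squares line around x0 (Gaussian kernel,
    bandwidth h) and returns its intercept, which is the fitted value at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        u = xi - x0
        w = math.exp(-0.5 * (u / h) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * u
        s2 += w * u * u
        t0 += w * yi
        t1 += w * u * yi
    # Solve the 2x2 weighted normal equations for the intercept.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Toy data (hypothetical): a smooth curve observed on an equally spaced grid.
x = [i / 50 for i in range(51)]
y = [math.sin(2 * math.pi * xi) for xi in x]
print(local_linear(x, y, x0=0.5, h=0.05))
```

One useful sanity check is that a local linear fit reproduces an exactly linear signal at any evaluation point, regardless of the bandwidth.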