5 articles found
Regularity Properties for Sparse Regression (A tribute to Professor Xiru Chen)
1
Authors: Edgar Dobriban, Jianqing Fan. Communications in Mathematics and Statistics (SCIE), 2016, No. 1, pp. 1-19 (19 pages)
Statistical and machine learning theory has developed several conditions ensuring that popular estimators such as the Lasso or the Dantzig selector perform well in high-dimensional sparse regression, including the restricted eigenvalue, compatibility, and lq sensitivity properties. However, some of the central aspects of these conditions are not well understood. For instance, it is unknown if these conditions can be checked efficiently on any given dataset. This is problematic, because they are at the core of the theory of sparse regression. Here we provide a rigorous proof that these conditions are NP-hard to check. This shows that the conditions are computationally infeasible to verify, and raises some questions about their practical applications. However, by taking an average-case perspective instead of the worst-case view of NP-hardness, we show that a particular condition, lq sensitivity, has certain desirable properties. This condition is weaker and more general than the others. We show that it holds with high probability in models where the parent population is well behaved, and that it is robust to certain data processing steps. These results are desirable, as they provide guidance about when the condition, and more generally the theory of sparse regression, may be relevant in the analysis of high-dimensional correlated observational data.
Keywords: High-dimensional statistics; sparse regression; restricted eigenvalue; lq sensitivity; computational complexity
Machine learning of partial differential equations from noise data
2
Authors: Wenbo Cao, Weiwei Zhang. Theoretical & Applied Mechanics Letters (CAS, CSCD), 2023, No. 6, pp. 441-446 (6 pages)
Machine learning of partial differential equations (PDEs) from data is a potential breakthrough for addressing the lack of physical equations in complex dynamic systems. Recently, sparse regression has emerged as an attractive approach. However, noise presents the biggest challenge in sparse regression for identifying equations, as it relies on local derivative evaluations of noisy data. This study proposes a simple and general approach that significantly improves noise robustness by projecting the evaluated time derivative and partial differential term into a subspace with less noise. This method enables accurate reconstruction of PDEs involving high-order derivatives, even from data with considerable noise. Additionally, we discuss and compare the effects of the proposed method based on the Fourier subspace and the POD (proper orthogonal decomposition) subspace. Generally, the latter yields better results since it preserves the maximum amount of information.
Keywords: Partial differential equation; machine learning; sparse regression; noise data
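The sparse-regression idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' subspace-projection method: it uses sequentially thresholded least squares, a common heuristic, on noise-free toy data. The function name `stlsq`, the candidate library, and the threshold value are all assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, dudt, threshold=0.5, n_iter=10):
    """Sequentially thresholded least squares (hypothetical helper).

    theta: (n_samples, n_terms) library of candidate terms
    dudt:  (n_samples,) estimated time derivative
    """
    # Start from the ordinary least-squares fit over all candidate terms.
    xi = np.linalg.lstsq(theta, dudt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0              # prune terms with small coefficients
        big = ~small
        if not big.any():
            break
        # Refit using only the surviving terms.
        xi[big] = np.linalg.lstsq(theta[:, big], dudt, rcond=None)[0]
    return xi

# Toy data (assumed): u(t) = exp(-2 t), which satisfies du/dt = -2 u.
t = np.linspace(0.0, 1.0, 201)
u = np.exp(-2.0 * t)
dudt = np.gradient(u, t)             # local finite-difference derivative

# Candidate library: constant, u, u^2.
theta = np.column_stack([np.ones_like(t), u, u**2])
xi = stlsq(theta, dudt)
print(xi)
```

On this toy problem only the coefficient of `u` survives, close to -2. With noisy data the finite-difference derivative degrades, which is the difficulty the abstract describes.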
Mutation detection and fast identification of switching system based on data-driven method
3
Authors: 张钟化, 徐伟, 宋怡. Chinese Physics B (SCIE, EI, CAS, CSCD), 2023, No. 5, pp. 164-177 (14 pages)
In the engineering field, switching systems have been extensively studied, where sudden changes of parameter value and structural form have a significant impact on the operational performance of the system. Therefore, it is important to predict the behavior of the switching system, which includes the accurate detection of mutation points and rapid reidentification of the model. However, few efforts have been devoted to accurately locating the mutation points. In this paper, we propose a new measure of mutation detection, the threshold-based switching index, by analogy with the Lyapunov exponent. We give the algorithm for selecting the optimal threshold, which greatly reduces the additional data collection and the relative error of mutation detection. In the system identification part, considering the small amount of data available and the noise in the data, the abrupt sparse Bayesian regression (abrupt-SBR) method is proposed. This method captures the model changes by updating the previously identified model, which requires less data and is more robust to noise than identifying the new model from scratch. With two representative dynamical systems, we illustrate the application and effectiveness of the proposed methods. Our research contributes to the accurate prediction and possible control of switching system behavior.
Keywords: mutation detection; switching index; system identification; sparse Bayesian regression
A two-step method for estimating high-dimensional Gaussian graphical models
4
Authors: Yuehan Yang, Ji Zhu. Science China Mathematics (SCIE, CSCD), 2020, No. 6, pp. 1203-1218 (16 pages)
The problem of estimating high-dimensional Gaussian graphical models has gained much attention in recent years. Most existing methods can be considered as one-step approaches, being either regression-based or likelihood-based. In this paper, we propose a two-step method for estimating the high-dimensional Gaussian graphical model. Specifically, the first step serves as a screening step, in which many entries of the concentration matrix are identified as zeros and thus removed from further consideration. Then in the second step, we focus on the remaining entries of the concentration matrix and perform selection and estimation for nonzero entries of the concentration matrix. Since the dimension of the parameter space is effectively reduced by the screening step, the estimation accuracy of the estimated concentration matrix can be potentially improved. We show that the proposed method enjoys desirable asymptotic properties. Numerical comparisons of the proposed method with several existing methods indicate that the proposed method works well. We also apply the proposed method to a breast cancer microarray data set and obtain some biologically meaningful results.
Keywords: covariance estimation; graphical model; penalized likelihood; sparse regression; two-step method
High Accuracy Gene Signature for Chemosensitivity Prediction in Breast Cancer
5
Author: Wei Hu. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2015, No. 5, pp. 530-536 (7 pages)
Neoadjuvant chemotherapy is a necessary treatment for breast cancer patients with large tumors. After this treatment, patients who achieve a pathologic Complete Response (pCR) usually have a more favorable prognosis than those who do not. Therefore, pCR is now considered the best prognosticator for patients receiving neoadjuvant chemotherapy. However, not all patients benefit from this treatment. As a result, we need a way to predict which patients will achieve pCR. Various gene signatures of chemosensitivity in breast cancer have been identified, from which such predictors can be built. Nevertheless, many of them have prediction accuracy around 80%. As such, identifying gene signatures that could be employed to build high-accuracy predictors is a prerequisite for their clinical tests and applications. Furthermore, elucidating the importance of each individual gene in a signature is another pressing need before such a signature can be tested in clinical settings. In this study, Genetic Algorithm (GA) and Sparse Logistic Regression (SLR) along with the t-test were employed to identify one signature. It had 28 probe sets selected by GA from the top 65 probe sets that were highly overexpressed between pCR and Residual Disease (RD) and was used to build an SLR predictor of pCR (SLR-28). This predictor, tested on a training set (n = 81) and a validation set (n = 52), had very precise predictions as measured by accuracy, specificity, sensitivity, positive predictive value, and negative predictive value, with their corresponding P values all zero. Furthermore, this predictor discovered 12 important genes in the 28-probe-set signature. Our findings also demonstrated that the most discriminative genes measured by SLR as a group selected by GA were not necessarily those with the smallest P values by t-test as individual genes, highlighting the ability of GA, as a multivariate technique, to capture the interacting genes in pCR prediction. Our gene signature outperformed a signature found in one previous study, with prediction accuracy of 92% vs 76%, demonstrating the potential of GA and SLR in identifying robust gene signatures for chemoresponse prediction in breast cancer.
Keywords: genetic algorithm; gene signature; breast cancer; sparse logistic regression; predictor; chemosensitivity
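To illustrate what "sparse" means in sparse logistic regression, here is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. This is a generic technique swapped in for illustration; the abstract does not specify the SLR variant used, and the toy data, the function `fit_l1_logistic`, and the penalty value are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(v, k):
    # Proximal operator of the L1 norm: shrink toward zero by k.
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def fit_l1_logistic(X, y, lam=0.1, lr=1.0, n_iter=500):
    """L1-penalized logistic regression without intercept (illustrative)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        # Gradient of the mean logistic loss.
        grad = X.T @ (sigmoid(X @ w) - y) / n
        # Gradient step followed by soft-thresholding (ISTA update).
        w = soft_threshold(w - lr * grad, lr * lam)
    return w

# Toy data (assumed): feature 0 determines the label, feature 1 is unrelated.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w = fit_l1_logistic(X, y)
print(w)
```

In this toy setting the weight on the uninformative second feature is driven to zero, while the weight on the first feature converges to about log(9) ≈ 2.197, the value at which the penalty balances the loss gradient. The same mechanism is what lets a sparse model keep only a subset of genes from a larger signature.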