Credit rating models are important tools for financial institutions to scientifically assess customers' default risk. Aiming both to improve the classification accuracy of credit rating models and to preserve their interpretability, this work proposes a credit rating method that combines the XGBoost algorithm with a Logistic Group Lasso model: XGBoost performs feature selection to simplify the model structure, and a Logistic Group Lasso model is built to ensure that the important variables in the model remain interpretable. An empirical study on small and micro enterprise loan data from a commercial bank shows that the new method classifies loan customers significantly better than conventional methods, effectively controls customers' default risk, and brings more revenue to financial institutions.
In this paper, we present a simple but powerful ensemble for robust texture classification. The proposed method uses a single type of feature descriptor, the scale-invariant feature transform (SIFT), and inherits the spirit of the spatial pyramid matching model (SPM). By partitioning the original texture images in a flexible way, our approach produces sufficiently informative local features and thereby forms a reliable feature pond or trains a new class-specific dictionary. To take full advantage of this feature pond, we develop a group-collaboratively representation-based strategy (GCRS) for the final classification, solved via the well-known group lasso. We go beyond this and propose a locality-constrained variant to speed it up, named local constraint-GCRS (LC-GCRS). Experimental results on three public texture datasets demonstrate that the proposed approach achieves competitive outcomes and even outperforms state-of-the-art methods. In particular, most methods do not work well when only a few samples of each category are available for training, but our approach still achieves very high classification accuracy, e.g. an average accuracy of 92.1% on the Brodatz dataset when only one image is used for training, significantly higher than any other method.
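As a rough illustration of the residual-based, dictionary-per-class classification idea behind GCRS, the sketch below substitutes a ridge-regularized least-squares code for the group-lasso solve (the grouping and locality constraints of GCRS/LC-GCRS are omitted for brevity). The dictionaries, test vector, and `lam` are all hypothetical.

```python
import numpy as np

def gcrs_classify(y, dicts, lam=0.1):
    """Toy collaborative-representation classifier: code the query y over each
    class-specific dictionary with ridge-regularized least squares, then pick
    the class whose dictionary reconstructs y with the smallest residual."""
    best, label = np.inf, None
    for c, D in dicts.items():
        # Closed-form ridge solution: (D^T D + lam I)^{-1} D^T y
        a = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
        r = np.linalg.norm(y - D @ a)
        if r < best:
            best, label = r, c
    return label
```

A query lying (nearly) in the span of one class's dictionary is reconstructed with a much smaller residual by that dictionary than by the others, which is the decision rule the full group-lasso formulation sharpens.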
Robustness, as a dynamic behavior, is also a research hotspot in the hypernetwork field, and it is of great practical significance for constructing robust networks. Although research on hypernetworks keeps growing, studies of their dynamics are relatively scarce, especially in neuroimaging. Existing work on brain functional hypernetworks mostly explores static topological properties of the network; no related study has analyzed their dynamical property of robustness. To address this, the lasso, group lasso, and sparse group lasso methods are first introduced to solve sparse linear regression models for constructing hypernetworks. Then, based on two deliberate-attack experimental models (node-degree and node-betweenness attacks), global efficiency and the relative size of the largest connected component are used to examine the robustness of brain functional hypernetworks to node failure under attack. Finally, comparative experiments are conducted to identify the most stable network. The results show that under deliberate attack, hypernetworks constructed by the group lasso and sparse group lasso methods are more robust; overall, the hypernetwork constructed by the group lasso method is the most stable.
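The deliberate node-degree attack and the largest-connected-component measure used above can be sketched in a few lines of pure Python on a toy graph. This stands in for the brain functional hypernetworks studied in the paper; for brevity, hyperedges are reduced to an ordinary adjacency dictionary, and the example graph is hypothetical.

```python
from collections import deque

def largest_cc_fraction(adj, removed):
    """Relative size of the largest connected component among surviving nodes,
    measured against the original node count (a standard robustness curve)."""
    alive = set(adj) - removed
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        comp, queue = 0, deque([s])
        seen.add(s)
        while queue:                       # breadth-first search over survivors
            u = queue.popleft()
            comp += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, comp)
    return best / len(adj)

def degree_attack(adj, k):
    """Deliberate degree attack: repeatedly fail the highest-degree surviving
    node and record the robustness curve after each removal."""
    removed, curve = set(), []
    for _ in range(k):
        target = max((n for n in adj if n not in removed),
                     key=lambda n: sum(v not in removed for v in adj[n]))
        removed.add(target)
        curve.append(largest_cc_fraction(adj, removed))
    return curve
```

On a star graph the curve collapses after a single removal, the textbook signature of a degree-attack-fragile topology; a robust network keeps this fraction high as removals accumulate.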
To exploit the information in factor-sorted portfolios while keeping the portfolio weights sparse, an SGLasso-MV strategy capable of investing in high-dimensional asset sets is constructed, based on the Sparse Group Lasso (SGLasso) and the classical mean-variance (MV) portfolio strategy. Compared with the Lasso and GLasso, SGLasso achieves sparsity both within and between groups and exploits feature-grouping information, making it well suited to mitigating the instability and high estimation error of the weights output by the MV strategy. On the empirical side, a rolling investment with non-fixed constituents is conducted on a daily dataset of all available A-share stocks from 1997 to 2019, avoiding sample-selection bias, and SGLasso-MV is compared with several classical portfolio strategies. The results show that, relative to other strategies that likewise include an expected-return estimator, the SGLasso-MV weights achieve significantly lower out-of-sample standard-deviation risk and lower turnover.
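The SGLasso penalty that delivers both kinds of sparsity mentioned above can be written down directly. The snippet below is a sketch of the penalty term only, not the full SGLasso-MV optimization over portfolio weights; `lam1`/`lam2` and the group layout are illustrative.

```python
import numpy as np

def sparse_group_lasso_penalty(w, groups, lam1, lam2):
    """Sparse Group Lasso penalty on a weight vector w:
       lam1 * ||w||_1                          -> within-group (element) sparsity
     + lam2 * sum_g sqrt(p_g) * ||w_g||_2      -> between-group sparsity,
    where groups is a list of index lists and p_g is the group size."""
    l1 = lam1 * np.abs(w).sum()
    l2 = lam2 * sum(np.sqrt(len(idx)) * np.linalg.norm(w[idx])
                    for idx in groups)
    return l1 + l2
```

Setting `lam1 = 0` recovers the group lasso (whole factor groups in or out), while `lam2 = 0` recovers the plain lasso; SGLasso-MV uses both terms so that entire factor groups can be dropped and the surviving groups stay internally sparse.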
Multiple change-point estimation for functional time series is studied in this paper. The change-point problem is first transformed into a high-dimensional sparse estimation problem via basis functions. The group least absolute shrinkage and selection operator (LASSO) is then applied to estimate the number and locations of possible change points. However, the group LASSO (GLASSO) tends to overestimate the number of true change points. To circumvent this problem, a further information criterion (IC) is applied to eliminate the redundant estimated points. It is shown that the proposed two-step procedure estimates the number and locations of the change points consistently. Simulations and two temperature data examples illustrate the finite-sample performance of the proposed method.
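The reformulation of change-point detection as sparse estimation can be illustrated with a scalar toy example (the paper works with functional data expanded in basis functions; here the signal is a scalar sequence for brevity, and all values are hypothetical). Column j of a lower-triangular design matrix carries the jump at time j, so a sparse coefficient vector encodes a piecewise-constant signal, and a group LASSO fit over the jump coefficients would mark the change points.

```python
import numpy as np

T = 8
# Lower-triangular design: column j contributes the jump at time j to all
# observations from time j onward, so y = X @ delta is a cumulative sum.
X = np.tril(np.ones((T, T)))

# Sparse jumps: an initial level at t=0 and a single change point at t=5.
delta = np.zeros(T)
delta[0], delta[5] = 1.0, 2.0

# Piecewise-constant signal: level 1 for t=0..4, level 3 for t=5..7.
y = X @ delta
```

In the functional setting each delta[j] becomes a whole block of basis coefficients, which is why the sparsity penalty must act on groups rather than single entries, and why the IC pruning step above is needed to remove the spurious extra groups GLASSO keeps.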
In this paper, we discuss a splitting method for the group Lasso. Assuming that the sequence of step lengths has a positive lower bound and a positive upper bound (unrelated to the given problem data), we prove a Q-linear rate of convergence of the distance from the iterates to the solution set. Moreover, we compare this with the convergence of the proximal gradient method analyzed very recently.
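A minimal sketch of the kind of iteration being analyzed, assuming a least-squares group Lasso objective; note this is the standard proximal gradient method the paper compares against, not the authors' splitting method, and the toy problem data are hypothetical. On a well-conditioned problem the distance of the iterates to their limit contracts by a roughly constant factor per step, i.e. Q-linearly.

```python
import numpy as np

def block_shrink(v, t):
    """Proximal operator of t*||.||_2 on one coefficient block."""
    nrm = np.linalg.norm(v)
    return np.zeros_like(v) if nrm <= t else (1 - t / nrm) * v

def prox_gradient_group_lasso(A, b, groups, lam, lr, n_iter):
    """Proximal gradient for 0.5*||Ax - b||^2 + lam * sum_g ||x_g||_2,
    recording every iterate so the convergence rate can be inspected."""
    x = np.zeros(A.shape[1])
    iters = [x.copy()]
    for _ in range(n_iter):
        x = x - lr * (A.T @ (A @ x - b))   # gradient step on the smooth part
        for idx in groups:                 # block-wise proximal step
            x[idx] = block_shrink(x[idx], lr * lam)
        iters.append(x.copy())
    return iters
```

Because the proximal operator is nonexpansive, the contraction factor of the gradient step carries over to the full iteration on strongly convex problems, which is the shape of guarantee (distance to the solution set shrinking geometrically) the paper establishes for its splitting method under bounded step lengths.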
The varying-coefficient model is flexible and powerful for modeling dynamic changes of regression coefficients. We study the problem of variable selection and estimation in this model in the sparse, high-dimensional case. We develop a concave group selection approach for this problem using basis function expansion and study its theoretical and empirical properties. We also apply the group Lasso for variable selection and estimation in this model and study its properties. Under appropriate conditions, we show that the group least absolute shrinkage and selection operator (Lasso) selects a model whose dimension is comparable to the underlying model, regardless of the large number of unimportant variables. To improve the selection results, we show that the group minimax concave penalty (MCP) has the oracle selection property, in the sense that it correctly selects important variables with probability converging to one under suitable conditions; by comparison, the group Lasso does not have the oracle selection property. The two approaches are evaluated using simulation and demonstrated on a data example.
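The intuition for why the group MCP can enjoy the oracle property while the group Lasso cannot is visible in the penalty shapes: the group MCP tapers to a constant, so large coefficient groups incur no further shrinkage bias, whereas the group Lasso penalty grows linearly forever. A sketch of the group MCP as a function of the group norm, with illustrative `lam` and `gamma`:

```python
def group_mcp(norm, lam, gamma=3.0):
    """Group MCP penalty evaluated at the group norm ||w_g||_2:
    quadratically tapered for norm <= gamma*lam, then constant at
    gamma*lam^2/2, so large groups are left unshrunk (oracle-friendly).
    The group Lasso penalty, by contrast, is lam*norm everywhere."""
    if norm <= gamma * lam:
        return lam * norm - norm ** 2 / (2 * gamma)
    return 0.5 * gamma * lam ** 2
```

For small norms MCP behaves like the group Lasso (same slope `lam` at zero, so it still sets weak groups exactly to zero), but beyond `gamma*lam` the penalty is flat: a strongly relevant variable group is estimated as if no penalty were present, which is precisely the unbiasedness the oracle-property argument above relies on.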
Model-based clustering is popular in the statistical literature and often models the data with a Gaussian mixture model. As a consequence, it requires estimation of a large number of parameters, especially when the data dimension is relatively large. In this paper, a reduced-rank model and group-sparsity regularization are proposed to equip model-based clustering, substantially reducing the number of parameters and thus facilitating high-dimensional clustering and variable selection simultaneously. We propose an EM algorithm for this task, in which the M-step is solved by alternating minimization. One of the alternating steps involves both a nonsmooth function and a nonconvex constraint, so we propose a linearized alternating direction method of multipliers (ADMM) for solving it. This leads to an efficient algorithm whose subproblems are all easy to solve. In addition, a model selection criterion based on the concept of clustering stability is developed for tuning the clustering model. The effectiveness of the proposed method, as well as its asymptotic estimation and selection consistency, is supported by a variety of simulated and real examples.
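One ingredient of the scheme above that admits a compact sketch is the group-sparsity step on the matrix of cluster means (variables × clusters): shrinking each variable's row by its l2 norm drives rows of non-informative variables, whose means barely differ across clusters, exactly to zero. This is only the shrinkage primitive, not the full EM/linearized-ADMM algorithm, and the matrix and threshold `t` below are hypothetical.

```python
import numpy as np

def row_group_shrink(M, t):
    """Row-wise group soft-thresholding of a (variables x clusters) mean matrix.
    A row whose l2 norm falls below t is zeroed out, deselecting that variable;
    surviving rows are shrunk toward zero by the factor (1 - t/||row||)."""
    out = np.zeros_like(M)
    for j, row in enumerate(M):
        nrm = np.linalg.norm(row)
        if nrm > t:
            out[j] = (1 - t / nrm) * row
    return out
```

Applied inside the M-step, this is what couples variable selection to clustering: a variable is kept only if its cluster means are jointly large enough to survive the group threshold.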
Funding (multiple change-point estimation for functional time series): NSFC (Grant Nos. 12171427, U21A20426, 11771390), Zhejiang Provincial Natural Science Foundation (Grant No. LZ21A010002), and the Fundamental Research Funds for the Central Universities (Grant No. 2021XZZX002).
Funding (splitting method for group Lasso): This research was supported by the National Natural Science Foundation of China (No. 61179033) and the Collaborative Innovation Center on Beijing Society-Building and Social Governance.
Funding (variable selection in the varying-coefficient model): Supported by the National Natural Science Foundation of China (Grant Nos. 71271128 and 11101442), the State Key Program of the National Natural Science Foundation of China (Grant No. 71331006), the National Center for Mathematics and Interdisciplinary Sciences (NCMIS), Shanghai Leading Academic Discipline Project A, in Ranking Top of Shanghai University of Finance and Economics (IRTSHUFE), and the Scientific Research Innovation Fund for PhD Studies (Grant No. CXJJ-2011-434).