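Variable selection is a key step in statistical modeling: choosing appropriate variables yields robust models that are structurally simple and predict accurately. This paper proposes a new bi-level variable selection penalty for logistic regression, the adaptive Sparse Group Lasso (adSGL). Its distinguishing feature is that it screens variables according to their group structure, achieving selection both within groups and between groups. The advantage of the method is that it penalizes individual coefficients and group coefficients to different degrees, which avoids over-penalizing large coefficients and thereby improves the estimation and prediction accuracy of the model. The difficulty in solving the model is that the penalized likelihood function is not strictly convex, so the paper solves it with a group coordinate descent algorithm and establishes a criterion for selecting the tuning parameters. Simulations show that, compared with the representative existing methods Sparse Group Lasso, Group Lasso, and Lasso, adSGL both improves bi-level selection accuracy and reduces model error. Finally, the paper applies adSGL to credit card credit scoring, where it achieves higher classification accuracy and robustness than logistic regression.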
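The abstract does not give the adSGL penalty explicitly. As a rough illustration of what "different degrees of penalty for individual and group coefficients" can look like, the sketch below evaluates a sparse-group-lasso style penalty with adaptive weights. The functional form, the mixing parameter `alpha`, and the inverse-magnitude weights are assumptions borrowed from the usual adaptive lasso and sparse group lasso conventions, not the paper's own definition.

```python
import math


def inverse_magnitude_weights(initial, eps=1e-8):
    """Adaptive weights from an initial estimate (assumed scheme):
    larger initial coefficients receive smaller penalties."""
    return [1.0 / (abs(b) + eps) for b in initial]


def adaptive_sgl_penalty(beta, groups, lam, alpha, coef_weights, group_weights):
    """Value of an assumed adaptive sparse-group-lasso penalty.

    beta          -- list of coefficients
    groups        -- list of index lists partitioning the coefficients
    lam           -- overall penalty strength
    alpha         -- mix between the coefficient level (L1) and group level (L2)
    coef_weights  -- one adaptive weight per coefficient (hypothetical input)
    group_weights -- one adaptive weight per group (hypothetical input)
    """
    # Within-group level: weighted L1 norm over individual coefficients.
    l1_part = sum(w * abs(b) for w, b in zip(coef_weights, beta))

    # Between-group level: weighted L2 norm per group, scaled by sqrt(group size).
    group_part = 0.0
    for g, idx in enumerate(groups):
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        group_part += group_weights[g] * math.sqrt(len(idx)) * norm

    return lam * (alpha * l1_part + (1.0 - alpha) * group_part)
```

With all weights set to 1 this reduces to the ordinary sparse group lasso penalty, which is one way to see how the adaptive version relaxes the penalty on coefficients that an initial fit suggests are large.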
When building a neural network model of a complicated system, the input variables should be as few as possible while still fully explaining the output variables. To address this requirement, a variable selection method based on cluster analysis was investigated. A similarity coefficient describing the mutual relation between variables was defined. The methods of highest contribution rate, part replacing whole, and variable replacement were put forward and derived from information theory. Neural network software based on cluster analysis, which provides several methods for defining the variable similarity coefficient, clustering the system variables, and evaluating the variable clusters, was developed and applied to build a neural network forecast model of cement clinker quality. The results show that the network scale, training time, and prediction accuracy are all satisfactory. The practical application demonstrates that this method of selecting variables for a neural network is feasible and effective.
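The abstract does not say how the similarity coefficient or the clusters are computed. The sketch below is one plausible reading, under stated assumptions: similarity is the absolute Pearson correlation, variables are grouped greedily against a threshold, and each cluster is represented by the member most correlated with the output (a simple stand-in for "part replacing whole"). All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cluster_variables(columns, threshold):
    """Greedy clustering: a variable joins the first cluster whose first
    member it resembles (|r| >= threshold); otherwise it starts a new cluster."""
    clusters = []
    for j, col in enumerate(columns):
        for cluster in clusters:
            if abs(pearson(col, columns[cluster[0]])) >= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def pick_representatives(columns, target, clusters):
    """Keep, from each cluster, the variable most correlated with the output."""
    return [
        max(cluster, key=lambda j: abs(pearson(columns[j], target)))
        for cluster in clusters
    ]
```

The selected indices would then serve as the reduced input set for the network, so the network scale shrinks with the number of clusters rather than the number of raw variables.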
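To overcome the shortcomings of one-shot modeling analysis and to improve the efficiency and accuracy of remote sensing monitoring models for wheat stripe rust, a feature variable selection algorithm combining the correlation coefficient (CC) with model population analysis (MPA) is proposed. It builds on the characteristics of the MPA algorithm and draws on the strengths of both spectral interval selection and spectral point selection algorithms. After the CC algorithm was used to select feature variables from the full-band spectrum, two methods developed from the MPA idea, competitive adaptive reweighted sampling (CARS) and variable combination population analysis (VCPA), were each used to further select the feature variables sensitive to wheat stripe rust. Partial least squares regression (PLSR) was then used to build the CC-CARS and CC-VCPA models for remote sensing monitoring of wheat stripe rust. The results show that the CC-CARS and CC-VCPA models, built from feature variables selected by the combined CC-MPA algorithm, were more accurate than the models built with the CC, CARS, and VCPA algorithms alone. Across the three validation sample sets, the validation R² (R²_V) between the disease index (DI) predicted by the CC-CARS model and the measured DI was at least 6.78% and 6.66% higher than for the CC and CARS models, respectively; RMSEV was at least 15.31% and 10.98% lower; and RPD was at least 18.08% and 12.34% higher. For the CC-VCPA model, R²_V was at least 9.58% and 0.73% higher than for the CC and VCPA models, respectively; RMSEV was at least 20.78% and 3.86% lower; and RPD was at least 26.22% and 4.02% higher. Spectral feature selection based on CC-MPA is an effective feature selection method; in particular, the CC-VCPA method selects fewer feature variables and gives better model predictions. The results provide an important reference for spectral feature selection and for improving the accuracy of remote sensing monitoring of crop diseases.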
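The first stage described above, CC pre-screening of the full-band spectrum, can be sketched as follows, together with two of the validation metrics. This is a minimal illustration, not the authors' implementation: the threshold rule, the data layout, and the function names are assumptions, and RPD is taken here as the sample standard deviation of the measured values divided by RMSE, which is a common convention the abstract does not confirm. The CARS and VCPA stages are not shown.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cc_prescreen(spectra, di, threshold):
    """Return indices of bands whose |correlation| with the disease index
    reaches the threshold. `spectra` holds one list of band values per sample."""
    n_bands = len(spectra[0])
    keep = []
    for b in range(n_bands):
        band = [sample[b] for sample in spectra]
        if abs(pearson(band, di)) >= threshold:
            keep.append(b)
    return keep


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def rpd(measured, predicted):
    """Ratio of performance to deviation (assumed definition: SD / RMSE)."""
    return statistics.stdev(measured) / rmse(measured, predicted)
```

The retained band indices would be passed on to a second-stage selector and then to a PLSR model, and the two metrics would be evaluated on a held-out validation set.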