Abstract
Feature selection should take into account, as far as possible, the predictive ability of the features, the correlation among them, and the computational cost of the algorithm. Because both filter and wrapper feature selection methods have shortcomings, a feature selection method for regression is proposed, based on a hierarchical clustering algorithm and partial least squares (PLS). It selects features with strong predictive ability while keeping the correlation among the selected features low. Simulation experiments in which the method is applied to the regression prediction of ground settlement during shield tunneling show that the selected optimal feature subset improves the accuracy of the regression model and markedly reduces training time.
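The abstract does not say how the clustering and PLS steps are combined, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that features are grouped by average-linkage hierarchical clustering on the distance 1 - |r| (r being the Pearson correlation), and that one feature per cluster is kept: the one with the largest absolute weight in the first PLS1 component, which for standardized data is proportional to X^T y. The function names, the `max_dist` threshold, and the toy data are all hypothetical.

```python
import math

def standardize(col):
    """Center a column and scale it to unit sample standard deviation.
    Assumes the column is not constant (otherwise sd would be zero)."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]

def corr(a, b):
    """Pearson correlation of two already standardized columns."""
    return sum(x * y for x, y in zip(a, b)) / (len(a) - 1)

def cluster_features(cols, max_dist):
    """Average-linkage agglomerative clustering with distance 1 - |r|.
    Merging stops once the closest pair of clusters is farther than max_dist."""
    p = len(cols)
    dist = [[1.0 - abs(corr(cols[i], cols[j])) for j in range(p)]
            for i in range(p)]
    clusters = [[j] for j in range(p)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > max_dist:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def pls1_weights(cols, y):
    """First-component PLS1 weights: w is X^T y normalized to unit length."""
    ys = standardize(y)
    raw = [sum(x * t for x, t in zip(col, ys)) for col in cols]
    norm = math.sqrt(sum(r * r for r in raw))
    return [r / norm for r in raw]

def select_features(X_cols, y, max_dist=0.3):
    """Keep, from each cluster, the feature with the largest |PLS weight|."""
    cols = [standardize(c) for c in X_cols]
    w = pls1_weights(cols, y)
    clusters = cluster_features(cols, max_dist)
    return sorted(max(c, key=lambda j: abs(w[j])) for c in clusters)

# Toy data, made up for illustration: features 0/1 and 2/3 are near-duplicates.
X_cols = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [0.9, -1.1, 1.2, -0.8, 1.0, -1.2, 0.9, -1.0],
]
y = [3, 3, 7, 7, 11, 11, 15, 15]
selected = select_features(X_cols, y)
print(selected)
```

With this toy data the two pairs of near-duplicate features form two clusters, so one index from each pair is returned. A fuller implementation would likely use several PLS components and validate the chosen subset by the accuracy of the resulting regression model, as the abstract's evaluation suggests.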
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal
2009, No. 21, pp. 4931-4935 (5 pages)
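Funding
National Natural Science Foundation of China (50778109)
Shanghai Science and Technology Key Research Program (08511501702)
Shanghai Leading Academic Discipline Project (J50103)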
Keywords
feature
clustering
PLS
regression
prediction