Abstract
Traditional spectral feature selection algorithms consider only the importance of individual features. This paper introduces the statistical correlation between features into traditional spectral analysis and constructs a spectral feature selection model based on feature correlation. The model first uses the Laplacian Score to identify the single most central feature and takes it as the initial selected feature. It then defines a new objective function that measures the discernibility of a feature group and applies a forward greedy search: each candidate feature is evaluated in turn, and the candidate that minimizes the objective function is added to the selected set. The algorithm thus accounts for both the importance of individual features and the correlations among them. Experiments with two different classifiers on eight UCI datasets show that the algorithm improves the classification performance of the selected feature subset and reaches high classification accuracy with fewer features.
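The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the feature-group objective function, so `select_features` uses a hypothetical stand-in cost (a candidate's Laplacian Score plus `lam` times its mean absolute Pearson correlation with the features already selected). The names `laplacian_scores`, `select_features`, `lam`, `k`, and `t` are all assumptions introduced for this example.

```python
import numpy as np


def laplacian_scores(X, k=5, t=1.0):
    """Laplacian Score of each column of X (smaller = better locality preservation)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    # Symmetric k-nearest-neighbour graph with heat-kernel weights (assumes k < n).
    nbrs = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)
    deg = S.sum(axis=1)
    L = np.diag(deg) - S  # graph Laplacian
    # Remove the degree-weighted mean from every feature.
    Xc = X - (deg @ X) / deg.sum()
    num = np.sum(Xc * (L @ Xc), axis=0)             # f^T L f per feature
    den = np.sum(Xc * (deg[:, None] * Xc), axis=0)  # f^T D f per feature
    # Constant features have zero weighted variance; give them the worst score.
    return np.where(den > 1e-12, num / np.maximum(den, 1e-12), np.inf)


def select_features(X, n_select, lam=1.0, k=5, t=1.0):
    """Forward greedy selection seeded by the best Laplacian Score.

    The cost below is a HYPOTHETICAL stand-in for the paper's objective.
    """
    X = np.asarray(X, dtype=float)
    ls = laplacian_scores(X, k=k, t=t)
    # Absolute Pearson correlation between features (NaN for constant columns -> 0).
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    selected = [int(np.argmin(ls))]  # stage 1: most central single feature
    while len(selected) < min(n_select, X.shape[1]):
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        # Stage 2: evaluate every candidate against the current selected set.
        cost = [ls[j] + lam * corr[j, selected].mean() for j in candidates]
        selected.append(candidates[int(np.argmin(cost))])
    return selected
```

Calling `select_features(X, n_select=10)` on a samples-by-features array returns the chosen column indices in selection order. Raising `lam` in this sketch penalizes redundancy with already-selected features more heavily; the paper's own objective may trade off importance and correlation differently.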
Authors
胡敏杰
林耀进
杨红和
郑荔平
傅为
HU Minjie, LIN Yaojin, YANG Honghe, ZHENG Liping, FU Wei (School of Computer Science, Minnan Normal University, Zhangzhou 363000, China)
Source
《智能系统学报》
CSCD
PKU Core Journal
2017, Issue 4, pp. 519-525 (7 pages)
CAAI Transactions on Intelligent Systems
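Funding
National Natural Science Foundation of China (61303131, 61379021)
Program for New Century Excellent Talents in Fujian Province University
Science and Technology Project of the Fujian Provincial Department of Education (JA14192)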
Keywords
feature selection
spectral feature selection
spectral graph theory
feature relevance
discernibility
search strategy
Laplacian score
classification performance