Abstract
Conventional sample-based mutual information estimators cannot directly handle data that mix discrete (nominal) and continuous (numeric) attributes. This paper proposes the PG method, a general Parzen-window-based mutual information estimator that handles mixed attributes directly. To better account for the correlations among attributes, a feature selection criterion named hybrid mutual information (HMI) is presented. Combining the PG estimator with the HMI criterion yields a new feature selection algorithm, PG-HMI. Experimental results support the validity of the PG estimator and the effectiveness of PG-HMI for feature selection.
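As a rough illustration of the kind of estimator the abstract refers to, the sketch below computes a Parzen-window estimate of the mutual information I(X; C) = H(C) - H(C|X) between one continuous attribute and a discrete class label. It is not the paper's PG method, whose details the abstract does not give, and it does not cover the HMI criterion. The function name `parzen_mutual_information`, the Gaussian kernel, and the Silverman rule-of-thumb default for `bandwidth` are all assumptions made for this example.

```python
import numpy as np


def parzen_mutual_information(x, labels, bandwidth=None):
    """Parzen-window estimate of I(X; C) in nats (illustrative sketch only).

    x        : 1-D array of a continuous attribute
    labels   : 1-D array of discrete class labels, same length as x
    bandwidth: Gaussian kernel width; defaults to Silverman's rule (an assumption)
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    n = x.shape[0]

    if bandwidth is None:
        bandwidth = 1.06 * x.std() * n ** (-1.0 / 5.0)
        if bandwidth <= 0:  # constant attribute: fall back to an arbitrary width
            bandwidth = 1.0

    # Class entropy H(C) from empirical class frequencies.
    classes, counts = np.unique(labels, return_counts=True)
    priors = counts / n
    h_c = -np.sum(priors * np.log(priors))

    # Pairwise Gaussian kernel values K(x_i - x_j); shape (n, n).
    diff = x[:, None] - x[None, :]
    kernel = np.exp(-0.5 * (diff / bandwidth) ** 2)

    # Kernel mass contributed by each class at every sample point; shape (n, k).
    class_mass = np.stack(
        [kernel[:, labels == c].sum(axis=1) for c in classes], axis=1
    )

    # Parzen estimate of the posterior p(c | x_i). Each row sum is at least 1
    # because every sample contributes K(0) = 1 to its own row.
    posterior = class_mass / class_mass.sum(axis=1, keepdims=True)

    # Conditional entropy H(C | X), averaged over samples; 0 * log 0 treated as 0.
    safe = np.where(posterior > 0, posterior, 1.0)
    h_c_given_x = -np.mean(np.sum(posterior * np.log(safe), axis=1))

    return h_c - h_c_given_x
```

An attribute whose values separate the classes well should score close to H(C), while an attribute independent of the class should score near zero. A discrete attribute could be handled by replacing the kernel with an exact-match indicator, but how the PG method actually combines the two cases is not stated in the abstract.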
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
Indexed in: EI, CSCD, Peking University Core Journals
2007, No. 1, pp. 55-63 (9 pages)
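Funding
Major Program of the National Natural Science Foundation of China (No. 50595414)
National Basic Research Program of China (No. 2004CB217904)
National Natural Science Foundation of China (No. 50107005)
Program for New Century Excellent Talents in University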
Keywords
Feature Selection, Mutual Information, Hybrid Mutual Information (HMI), Classifier, Data Mining