Abstract
This paper applies the concept of entropy from information theory to feature selection and defines two information measures for evaluating features: error entropy and overlap entropy. It explains the different physical meanings of the two definitions, analyzes interval division, which is the most critical problem in computing the entropy, and proposes a relatively effective interval division method. Because entropy alone cannot eliminate similar features, the paper combines it with a similarity coefficient to form a complete entropy-based feature selection procedure, which is validated by simulation experiments.
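The overall shape of such a procedure (divide each feature into intervals, score it with an entropy measure, then drop features that are too similar to ones already kept) can be sketched as follows. The abstract does not give the formulas for error entropy or overlap entropy, nor the proposed interval division method, so this sketch substitutes assumed stand-ins: the class-conditional entropy H(class | interval) as the score, equal-width intervals, and the absolute Pearson correlation as the similarity coefficient. All function names, parameters, and the sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins):
    """Map each value to an interval index in [0, n_bins - 1] (assumed equal-width division)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def conditional_entropy(bins, labels):
    """H(class | interval) in bits; a stand-in score, lower means better class separation."""
    by_bin = defaultdict(Counter)
    for b, y in zip(bins, labels):
        by_bin[b][y] += 1
    n = len(labels)
    h = 0.0
    for counts in by_bin.values():
        total = sum(counts.values())
        for c in counts.values():
            p = c / total
            h -= (total / n) * p * math.log2(p)
    return h


def abs_correlation(x, y):
    """Absolute Pearson correlation, used here as an assumed similarity coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def select_features(columns, labels, n_bins=4, sim_threshold=0.95):
    """Rank features by entropy score, skipping any feature too similar to one already kept."""
    scores = {name: conditional_entropy(equal_width_bins(col, n_bins), labels)
              for name, col in columns.items()}
    kept = []
    for name in sorted(scores, key=scores.get):
        if all(abs_correlation(columns[name], columns[k]) < sim_threshold for k in kept):
            kept.append(name)
    return kept, scores


# Hypothetical data: f1 separates the classes, f2 duplicates f1 (scaled), f3 is uninformative.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f1": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],
    "f2": [2.0, 2.4, 1.8, 2.2, 6.0, 6.4, 5.8, 6.2],
    "f3": [1.0, 3.0, 1.1, 3.1, 1.2, 3.2, 0.9, 2.9],
}
kept, scores = select_features(columns, labels)
print(kept)  # prints ['f1', 'f3']
```

In this toy run f2 scores as well as f1 but is dropped because it is almost perfectly correlated with it, which illustrates why an entropy score alone does not remove similar features. A real pipeline would normally stop after a fixed number of features or apply a score cutoff rather than keep every dissimilar feature.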
Source
《计算机工程与应用》
CSCD
Peking University Core Journals
2009, No. 15, pp. 54-57 (4 pages)
Computer Engineering and Applications
Keywords
feature selection
overlap entropy
error entropy
similarity
feature evaluation
information measure