Multi-label Feature Selection Algorithm with Incomplete Data Based on Cost Sensitivity

Cited by: 5
Abstract: Cost-sensitive feature selection is an important research topic in machine learning and data mining. Most existing cost-sensitive feature selection methods target single-label data. In many applications, however, the data are multi-label and continuous-valued, and technical or cost constraints during data acquisition leave them incomplete. To address these problems, this paper proposes a test-cost-based feature selection algorithm for multi-label incomplete data. First, the algorithm uses a rough set model to compute the neighborhood granularity of multi-label incomplete data, and generates the test cost of each feature from two distributions, uniform and normal. Second, it defines a test-cost-based measure of feature significance and, starting from the core features, builds a heuristic feature selection algorithm. Finally, experiments on the Mulan datasets verify the effectiveness and feasibility of the proposed algorithm.
Authors: HUANG Qin; QIAN Wen-bin; WANG Ying-long; WU Bing-long (School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang 330045, China; Key Laboratory of Agricultural Information Technology of Jiangxi Province, Nanchang 330045, China)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 12, pp. 2617-2624 (8 pages)
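Funding: Supported by the National Natural Science Foundation of China (61502213, 61462038, 71461013) and the Natural Science Foundation of Jiangxi Province (20161BAB212049, 20161BAB212047)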
Keywords: cost-sensitive; feature selection; attribute reduction; incomplete data; multi-label classification
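The abstract names two building blocks without giving formulas: per-feature test costs drawn from a uniform or a normal distribution, and neighborhoods computed on data with missing values. The sketch below illustrates one plausible reading of each and is not the authors' implementation. The cost range (1 to 100), the normal-distribution parameters, the use of `None` for a missing value, the Chebyshev distance, and the rule that a missing value matches anything are all assumptions made for illustration, as are the function names `generate_test_costs` and `tolerant_neighborhood`.

```python
import random


def generate_test_costs(n_features, distribution="uniform", low=1, high=100, seed=0):
    """Draw one integer test cost per feature.

    The range [low, high] and the normal parameters are illustrative
    assumptions; the abstract only names the two distributions.
    """
    rng = random.Random(seed)
    if distribution == "uniform":
        return [rng.randint(low, high) for _ in range(n_features)]
    if distribution == "normal":
        mean = (low + high) / 2
        std = (high - low) / 6  # assumed spread; most draws fall inside the range
        # Round to an integer cost and clip into [low, high].
        return [min(high, max(low, round(rng.gauss(mean, std))))
                for _ in range(n_features)]
    raise ValueError(f"unknown distribution: {distribution}")


def tolerant_neighborhood(data, i, features, delta):
    """Return indices of samples within delta of sample i on `features`.

    Assumption: a missing value (None) is compatible with any value, so a
    feature is skipped whenever either sample lacks it. The distance is the
    largest remaining per-feature gap (Chebyshev distance).
    """
    def distance(a, b):
        gaps = [abs(a[f] - b[f]) for f in features
                if a[f] is not None and b[f] is not None]
        return max(gaps, default=0.0)

    return [j for j, row in enumerate(data) if distance(data[i], row) <= delta]


if __name__ == "__main__":
    samples = [[0.1, None], [0.15, 0.9], [0.8, 0.2]]
    print(generate_test_costs(2, "uniform"))
    print(tolerant_neighborhood(samples, 0, [0, 1], delta=0.1))
```

A cost-aware significance measure would then weigh how much a candidate feature refines these neighborhoods against its test cost. The exact weighting used in the paper is not stated in the abstract, so it is left out here.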