Abstract
Feature selection, the task of picking out the key features from large and noisy data, has long been a major challenge in machine learning. To address it, this paper proposes a feature selection method based on multi-view representation learning and an attention mechanism. First, a multi-instance generator produces an instance bag for each sample and uses a special padding scheme to keep feature positions invariant. Second, a multi-view representation module mines each feature's own information and its interaction information from multiple views, and an attention module computes contribution weights for these representations. Finally, a classification network classifies the samples using the weighted representations. Experimental results show that the proposed model can select the most representative feature group for each label and improves accuracy on datasets of different types.
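The attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function, so the additive (tanh) attention form, the function name `attention_feature_weights`, the parameters `w` and `v`, and the random inputs are all assumptions chosen for illustration. It shows only how per-feature representations could be turned into contribution weights and a weighted representation, and how features might then be ranked.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention_feature_weights(reps, w, v):
    """Hypothetical additive attention over per-feature representations.

    reps: (n_samples, n_features, d) array, one d-dimensional vector per feature.
    w:    (d, h) projection matrix (assumed, would be learned in practice).
    v:    (h,) scoring vector (assumed, would be learned in practice).

    Returns:
      weights: (n_samples, n_features), non-negative, each row sums to 1.
      pooled:  (n_samples, d), the attention-weighted sum of representations.
    """
    scores = np.tanh(reps @ w) @ v            # (n_samples, n_features)
    weights = softmax(scores, axis=1)         # contribution weight per feature
    pooled = (weights[..., None] * reps).sum(axis=1)
    return weights, pooled


# Toy data: 4 samples, 6 features, 8-dimensional representations.
rng = np.random.default_rng(0)
reps = rng.normal(size=(4, 6, 8))
w = rng.normal(size=(8, 5))
v = rng.normal(size=5)

weights, pooled = attention_feature_weights(reps, w, v)

# Rank features by their mean contribution weight, highest first.
ranking = np.argsort(-weights.mean(axis=0))
```

In a trained model, `pooled` would be passed to the classification network, and the weights could be averaged per label rather than over all samples to obtain a feature group for each label.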
Authors
庞华鑫
韦世奎
马俊才
赵玉凤
赵耀
PANG Huaxin; WEI Shikui; MA Juncai; ZHAO Yufeng; ZHAO Yao (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Source
《北京交通大学学报》
CAS
CSCD
Peking University Core Journals (北大核心)
2020, Issue 5, pp. 70-76 (7 pages)
JOURNAL OF BEIJING JIAOTONG UNIVERSITY
Funding
National Key Research and Development Program of China (2017YFC1703503)
National Natural Science Foundation of China (61532005, 61972022)
Keywords
signal and information processing
attention mechanism
multi-view representation
feature selection