Abstract
This paper proposes a feature selection method for partial label data within the neighborhood rough set framework. A partial label neighborhood decision system is constructed, the lower approximation and dependency of neighborhood rough sets are defined for the partial label learning setting, and a feature selection algorithm suited to partial label classification is built on these definitions. While granulating the feature space into neighborhoods, the algorithm also measures the similarity between labels in the candidate label sets, and it selects a feature subset that is strongly relevant to the label information. Two mechanisms for generating false positive candidate labels are used, both different from the most commonly used random method, and their effects on the results are analyzed and compared in the experiments. Finally, extensive comparative results on six real-world partial label data sets and six controlled synthetic partial label data sets based on single-label data are presented to verify the effectiveness of the proposed feature selection method.
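The abstract does not give the formal definitions, so the following Python sketch is only one plausible reading of the idea, not the authors' algorithm. It assumes (hypothetically) that a sample belongs to the lower approximation when every sample in its `delta`-neighborhood has a candidate label set whose Jaccard similarity with its own is at least `tau`, that dependency is the fraction of such samples, and that features are added greedily while dependency increases. The names `jaccard_matrix`, `dependency`, `greedy_select`, and the parameters `delta`, `tau`, `eps` are illustrative assumptions.

```python
import numpy as np

def jaccard_matrix(Y):
    """Pairwise Jaccard similarity between binary candidate-label rows."""
    Y = Y.astype(float)
    inter = Y @ Y.T                      # |S_i ∩ S_j|
    sizes = Y.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - inter
    # Two empty candidate sets are treated as identical (similarity 1).
    return np.divide(inter, union, out=np.ones_like(inter), where=union > 0)

def dependency(X, Y, features, delta=0.2, tau=0.5):
    """Assumed dependency: share of samples whose delta-neighbours all have
    candidate sets with Jaccard similarity >= tau to the sample's own set."""
    if not features:
        return 0.0
    sub = X[:, features]
    dist = np.sqrt(((sub[:, None, :] - sub[None, :, :]) ** 2).sum(axis=2))
    neighbours = dist <= delta           # neighborhood granulation
    similar = jaccard_matrix(Y) >= tau   # candidate-set similarity
    consistent = np.all(~neighbours | similar, axis=1)
    return float(consistent.mean())

def greedy_select(X, Y, delta=0.2, tau=0.5, eps=1e-9):
    """Forward greedy search: add the feature that raises dependency most,
    stop when no remaining feature improves it."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [dependency(X, Y, selected + [f], delta, tau) for f in remaining]
        k = int(np.argmax(scores))
        if scores[k] <= best + eps:
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best

# Toy data (made up for illustration): feature 0 separates the two groups,
# feature 1 does not. Rows of Y are candidate label sets over 3 labels.
X = np.array([[0.00, 0.10], [0.05, 0.90], [0.10, 0.50],
              [0.90, 0.12], [0.95, 0.88], [1.00, 0.52]])
Y = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 0],
              [0, 0, 1], [0, 1, 1], [0, 0, 1]])
selected, best = greedy_select(X, Y)
print(selected, best)
```

On this toy input the search keeps only feature 0, because neighborhoods built on feature 1 mix samples whose candidate sets disagree. The thresholds and the consistency rule would need to be replaced by the paper's actual definitions for any faithful reproduction.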
Authors
高贺飞
李艳
王硕
GAO Hefei; LI Yan; WANG Shuo (College of Mathematics and Information Science, Hebei University, Baoding 071002, Hebei, China; School of Applied Mathematics, Beijing Normal University at Zhuhai, Zhuhai 519000, Guangdong, China)
Source
《山东大学学报(理学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2024, No. 5, pp. 100-113 (14 pages)
Journal of Shandong University(Natural Science)
Funding
National Natural Science Foundation of China (61976141)
Natural Science Foundation of Hebei Province (F2021201055)
Keywords
partial label learning
feature selection
partial label neighborhood decision system
neighborhood rough sets