Abstract
Multi-label learning is an important machine learning paradigm. Traditional multi-label learning methods are designed for supervised or semi-supervised settings and generally require accurate multi-category annotations for all or part of the data. In many practical applications, label information with a large number of annotations is difficult to obtain, which limits the adoption and application of multi-label learning. In contrast, label correlation is a common form of weak supervision that places lower demands on annotation. How to use label correlation for multi-label learning is an important but as yet unstudied problem. This study proposes a weakly supervised multi-label learning method that uses label correlation as prior information (WSMLLC). The model uses label correlation to restate sample similarity, which allows the label indicator matrix to be obtained effectively; it also uses the prior information to constrain the projection matrix of the data and introduces a regression term to correct the indicator matrix. Compared with existing methods, the outstanding advantage of WSMLLC is that it can assign labels to multi-label samples given only the label correlation prior. Experiments on several public datasets show that, when the label matrix is completely missing, WSMLLC has a clear advantage over current advanced multi-label learning methods.
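To make the notion of a label correlation prior concrete, the sketch below builds a label-by-label correlation matrix as the cosine similarity between label columns. This is only an illustration of what such a matrix can look like. The abstract does not say how the prior is obtained or how WSMLLC consumes it, so the function name `label_correlation`, the choice of cosine similarity, and the small auxiliary label matrix `Y_aux` are all assumptions made for this example, not the authors' method.

```python
import math

def label_correlation(Y):
    """Cosine similarity between the label columns of a binary matrix Y.

    Y is a list of rows (samples), each row a list of 0/1 label indicators.
    Hypothetical helper: the abstract does not define how the prior is built.
    """
    n_labels = len(Y[0])
    # Collect each label's indicator column across all samples.
    cols = [[row[j] for row in Y] for j in range(n_labels)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    C = [[0.0] * n_labels for _ in range(n_labels)]
    for i in range(n_labels):
        for j in range(n_labels):
            # Labels that never occur keep a correlation of 0 with everything.
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                C[i][j] = dot / (norms[i] * norms[j])
    return C

# Assumed auxiliary annotations (4 samples, 3 labels), used only to
# estimate the prior; the target data itself would carry no labels.
Y_aux = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
]
C = label_correlation(Y_aux)
```

In this toy matrix the first two labels often co-occur and so receive a high correlation, while the third label never co-occurs with the others and receives zero. A prior of this shape is the only supervision assumed in the completely-missing-label setting the abstract describes.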
Authors
OUYANG Xiao (欧阳宵)
TAO Hong (陶红)
FAN Rui-Dong (范瑞东)
JIAO Yuan-Yuan (矫媛媛)
HOU Chen-Ping (侯臣平)
Affiliations: College of Liberal Arts and Sciences, National University of Defense Technology, Changsha 410073, China; College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (北大核心)
2023, Issue 4, pp. 1732-1748 (17 pages)
Funding
National Natural Science Foundation of China (61922087, 61906201, 62006238, 62136005)
Hunan Provincial Science Fund for Distinguished Young Scholars (2019JJ20020)
Keywords
multi-label learning
weakly supervised learning
label correlation
prior information
completely missing labels