利用辅助信息进行矩阵补全的核方法及其在多标记学习中的应用 (cited by: 1)

Kernel method for matrix completion with side information and its application in multi-label learning

Abstract: In practical machine learning tasks, an instance is often associated with multiple labels, but obtaining complete label information is costly in labor and resources, so multi-label learning frequently has to cope with missing labels. If the observed labels are treated as an incomplete label matrix and the instance features as side information, the problem can be addressed by matrix completion. Previous work mainly targeted the linearly separable case. This paper proposes the KernelMaxide method, which handles missing supervised information in linearly non-separable multi-label data while exploiting the nonlinear structure of the data and accounting for the correlations between labels. Based on a representer theorem for the matrix nuclear norm, the method constructs a nuclear-norm minimization objective defined on the kernel matrix, together with a corresponding optimization algorithm, and uses the Nyström method to reduce the storage and computational cost of the kernel matrix. Experiments demonstrate the superior performance of KernelMaxide.
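The abstract mentions the Nyström method as the way to reduce the cost of storing and computing with the kernel matrix. The following is a minimal sketch of the generic Nyström low-rank approximation only, not the authors' KernelMaxide implementation. The RBF kernel, the uniform landmark sampling, and the names `rbf_kernel`, `nystrom_factors`, `gamma`, and `m` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel between rows of A and rows of B (kernel choice is an assumption)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_factors(X, m, gamma=1.0, seed=0):
    """Return C (n x m) and pinv(W) (m x m) such that K is approximated by C @ pinv(W) @ C.T."""
    rng = np.random.default_rng(seed)
    # Uniformly sample m landmark rows without replacement (hypothetical sampling scheme).
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)   # kernel between all points and the landmarks
    W = C[idx]                         # kernel among the landmarks only
    return C, np.linalg.pinv(W, rcond=1e-10)

# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))

C, W_pinv = nystrom_factors(X, m=50, gamma=0.1)
K_approx = C @ W_pinv @ C.T            # formed here only to measure the error
K_exact = rbf_kernel(X, X, gamma=0.1)
rel_err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative Frobenius error: {rel_err:.3e}")
```

In practice only `C` and `W_pinv` need to be stored, which takes O(nm + m^2) memory instead of O(n^2) for the full kernel matrix. The full `K_approx` is built above solely to compare against the exact kernel.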

Authors: XU Miao (徐淼), ZHOU Zhi-Hua (周志华)

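Affiliation: National Key Laboratory for Novel Software Technology, Nanjing University; Collaborative Innovation Center of Novel Software Technology and Industrialization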

Source: Scientia Sinica Informationis (《中国科学：信息科学》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 1, pp. 47-59 (13 pages)

Funding: supported by the National Natural Science Foundation of China (Grant No. 61333014)

Keywords: machine learning, multi-label learning, matrix completion, kernel method, Nyström method

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
