Abstract
Multi-label learning addresses problems in which a single example is associated with multiple class labels at once. Traditional multi-label algorithms represent an ambiguous object by a single instance in the input space, which oversimplifies the object's complex semantics and loses important information at the representation stage. To address this problem, we propose CWMI-INSDIF, an improved multi-label learning algorithm that combines class weights with a multi-instance representation. The algorithm adopts the MIML (Multi-Instance Multi-Label learning) framework: at the representation stage, each training example is transformed into a bag of instances. While generating the bags, we define a set of weight functions that describe the importance of the data and adjust them with an adaptive penalty strategy. The resulting weights determine how much each part of an example's information contributes, so that the ambiguity of each example is better described in the input space. Experiments are conducted on a public data set for yeast gene function analysis. The experimental results and simulation analysis show that CWMI-INSDIF outperforms other multi-label learning algorithms in learning performance and classification results.
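The abstract does not spell out the bag-generation step, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes an INSDIF-style transformation, in which each example becomes a bag of difference vectors between the example and one prototype (mean vector) per class, and it scales each difference vector by a per-class weight. The inverse-label-frequency weights, the function names `class_prototypes` and `weighted_bags`, and the omission of the adaptive penalty strategy are all illustrative assumptions, not the weight functions defined in the paper.

```python
import numpy as np

def class_prototypes(X, Y):
    """Mean feature vector of the examples that carry each label.

    X: (n_examples, n_features) feature matrix.
    Y: (n_examples, n_labels) binary label-indicator matrix.
    """
    counts = Y.sum(axis=0)
    # Guard against labels that never occur (avoids division by zero).
    return (Y.T @ X) / np.maximum(counts, 1)[:, None]

def weighted_bags(X, Y, weights=None):
    """Turn each example into a bag with one weighted instance per class.

    bags[i, l] = weights[l] * (X[i] - prototype[l])

    If no weights are given, a placeholder weighting (normalised inverse
    label frequency) is used. This is an assumption for illustration,
    not the paper's weight function or penalty strategy.
    """
    prototypes = class_prototypes(X, Y)
    if weights is None:
        freq = Y.mean(axis=0)
        weights = 1.0 / np.maximum(freq, 1e-12)
        weights = weights / weights.sum()
    diffs = X[:, None, :] - prototypes[None, :, :]
    return weights[None, :, None] * diffs

# Toy data: three examples, two features, two labels.
X = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
Y = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
bags = weighted_bags(X, Y)  # shape: (3 examples, 2 instances, 2 features)
```

Each resulting bag could then be passed to any multi-instance multi-label learner; the choice of learner is outside the scope of this sketch.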
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2017, No. 4, pp. 857-862 (6 pages)
Journal of Chinese Computer Systems
Funding
Supported by the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (12KJB510007)