Abstract
Through a detailed analysis of the characteristics of active learning in the multiple-instance setting, the multiple-instance active learning (MIAL) problem is categorized into three paradigms: bag-level, instance-level, and mixed-level active learning. For the bag-level MIAL problem, a novel sample selection strategy is proposed that combines a statistical feature of the instance count, an important factor in the multiple-instance learning (MIL) setting, with sample uncertainty. Experiments on the Corel image dataset show that, compared with several traditional active learning sample selection strategies, the proposed method effectively reduces the number of samples that must be manually annotated and improves the efficiency and performance of the multiple-instance learner.
Keywords
bag-level
multiple-instance active learning
image retrieval