Abstract
Online kernel selection is an important component of online kernel methods and falls into three categories: filter, wrapper, and embedded approaches. Existing work on online kernel selection has explored the wrapper and embedded approaches and has adopted the filter approach empirically, but there is as yet no unified framework for comparing, analyzing, and investigating online kernel selection problems. This paper proposes a multi-armed bandit model of online kernel selection that serves as such a unified framework and yields both the wrapper and the embedded approaches. Given a set of candidate kernels, each kernel corresponds to one arm of the bandit model. In each round of online kernel selection, multiple kernels are drawn repeatedly at random according to a probability distribution, and that distribution is then updated by exponential weighting. In this way, online kernel selection reduces to an adversarial multi-armed bandit problem in a non-oblivious adversary setting, and the adversarial bandit model gives the wrapper and embedded approaches in a uniform way. The paper further defines a new notion of regret for online kernel selection and proves that the wrapper approach has a weak expected regret bound that is sublinear in the number of rounds, while the embedded approach has an expected regret bound that is sublinear in the number of rounds. Experimental results on benchmark datasets demonstrate the feasibility and effectiveness of the proposed unified framework.
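The sampling-and-reweighting loop described in the abstract can be illustrated with a minimal Exp3-style sketch. This is not the paper's algorithm: it draws a single kernel per round rather than multiple kernels, and the Gaussian candidate kernels, the `KernelPerceptron` learner, the exploration rate `gamma`, and every function name are assumptions chosen only for illustration.

```python
import math
import random


def gaussian_kernel(sigma):
    """Return a Gaussian kernel with width sigma (assumed candidate family)."""
    def k(x, z):
        sq = sum((a - b) ** 2 for a, b in zip(x, z))
        return math.exp(-sq / (2.0 * sigma ** 2))
    return k


class KernelPerceptron:
    """Online kernel perceptron for one candidate kernel (hypothetical helper)."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.support = []  # stored (label, point) pairs from past mistakes

    def predict(self, x):
        score = sum(y * self.kernel(z, x) for y, z in self.support)
        return 1 if score >= 0 else -1

    def update(self, x, y):
        # Perceptron rule: store the example only when it is misclassified.
        if self.predict(x) != y:
            self.support.append((y, x))


def exp3_kernel_selection(stream, sigmas, gamma=0.1, seed=0):
    """Treat each kernel as an arm; return final arm probabilities and mistakes."""
    rng = random.Random(seed)
    n_arms = len(sigmas)
    learners = [KernelPerceptron(gaussian_kernel(s)) for s in sigmas]
    weights = [1.0] * n_arms
    mistakes = 0

    def distribution():
        total = sum(weights)
        # Mix the weight distribution with uniform exploration.
        return [(1 - gamma) * w / total + gamma / n_arms for w in weights]

    for x, y in stream:
        probs = distribution()
        arm = rng.choices(range(n_arms), weights=probs)[0]
        loss = 0.0 if learners[arm].predict(x) == y else 1.0
        mistakes += int(loss)
        # Importance-weighted loss estimate for the chosen arm only.
        estimate = loss / probs[arm]
        weights[arm] *= math.exp(-gamma * estimate / n_arms)
        # Rescale so the largest weight is 1, which avoids underflow.
        top = max(weights)
        weights = [w / top for w in weights]
        # Bandit feedback: only the selected learner sees the example.
        learners[arm].update(x, y)

    return distribution(), mistakes
```

Calling `exp3_kernel_selection(stream, [0.1, 1.0, 10.0])` on a list of `(point, label)` pairs with labels in {-1, +1} returns the final sampling distribution over the three candidate widths together with the number of online mistakes. Because each learner's hypothesis depends on which arms were chosen earlier, the losses react to the player's past choices, which is the sense in which the abstract speaks of a non-oblivious adversary.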
Authors
李峻樊
廖士中
LI Jun-fan; LIAO Shi-zhong (College of Intelligence and Computing, Tianjin University, Tianjin 300350, China)
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2019, Issue 1, pp. 57-63 (7 pages)
Computer Science
Funding
Supported by the National Natural Science Foundation of China (61673293)
Keywords
Online kernel selection
Adversarial multi-armed bandit
Non-oblivious adversary
Unified framework