Abstract
Existing learning-to-rank methods assume that each training sample consists of an instance paired with a reliable label, but this assumption often fails because the truthfulness of the labels cannot be guaranteed. Therefore, when each training instance is labeled by multiple, possibly unreliable annotators, a ranking function can be learned in a listwise manner from the crowdsourced labels those annotators provide. Combining the Mallows model and the Plackett-Luce (P-L) model, this paper proposes a new probabilistic ranking model. Side information about the annotators is incorporated into parameter estimation as a constraint function, and the parameter set is learned by maximum likelihood estimation. The parameters are updated iteratively with the expectation-maximization (EM) algorithm, yielding the optimal annotator-expertise parameters and ranking-function parameters. Experimental results show that the maximum likelihood estimation method clearly outperforms the direct ranking method, and that adding side information helps improve both the learned ranking function and the estimated annotator expertise.
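For readers unfamiliar with the two base models named in the abstract, the following is a minimal sketch of their standard textbook forms: the Plackett-Luce log-probability of a ranking given item scores, and the Mallows log-probability of a ranking given a central ranking under the Kendall tau distance. It is not the combined model proposed in the paper, whose exact form is not given in this record. The function names, the `scores` and `theta` parameters, and the choice of Kendall tau distance are illustrative assumptions.

```python
import math
from itertools import combinations


def plackett_luce_log_prob(scores, ranking):
    """Log-probability of `ranking` (item indices, best first) under
    Plackett-Luce, where item i has worth exp(scores[i])."""
    logp = 0.0
    for pos, item in enumerate(ranking):
        # Items not yet placed compete for the current position.
        tail = [scores[j] for j in ranking[pos:]]
        m = max(tail)  # log-sum-exp shift for numerical stability
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        logp += scores[item] - log_norm
    return logp


def kendall_tau_distance(r1, r2):
    """Number of item pairs ordered differently by the two rankings."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2) if pos2[a] > pos2[b])


def mallows_log_prob(ranking, center, theta):
    """Log-probability of `ranking` under a Mallows model with central
    ranking `center` and dispersion theta > 0 (assumed parameter name)."""
    n = len(center)
    # Closed-form normalizer for the Kendall tau distance.
    log_z = sum(
        math.log((1 - math.exp(-j * theta)) / (1 - math.exp(-theta)))
        for j in range(1, n + 1)
    )
    return -theta * kendall_tau_distance(ranking, center) - log_z
```

In a crowdsourcing setting of the kind the abstract describes, one plausible reading is that a larger `theta` corresponds to an annotator whose rankings stay closer to the central ranking, but how the paper actually ties annotator expertise to these models is not stated here.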
Authors
Chen Huaye (陈华烨)
Wang Haitao (汪海涛)
Jiang Ying (姜瑛)
Chen Xing (陈星)
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2020, Issue 2, pp. 207-212, 281 (7 pages)
Funding
National Natural Science Foundation of China (61462049).