Abstract
Crowdsourcing is an increasingly important area of computer applications because it can address problems that are difficult for computers to handle alone. Because crowdsourcing is open and workers vary widely in ability, quality control is one of its major challenges. To ensure the effectiveness of truth inference, current research generally evaluates worker quality and relies on the answers of trustworthy workers to infer truths. However, most existing methods ignore the long-tail phenomenon that is common in crowdsourcing tasks, and little work addresses truth inference when workers generally complete only a small number of tasks. Considering the characteristics of different task types, the long-tail phenomenon, and workers' answers, this paper constructs small-sample confidence intervals to estimate worker quality and thereby solve truth inference when the number of tasks completed by workers is generally small. First, worker quality is pre-estimated with a gold-standard-answer strategy, and, according to the resulting worker-quality distribution, different truth initialization methods are adopted for numerical tasks and single-choice tasks. Then, a small-sample confidence interval is constructed to evaluate worker quality accurately. Finally, task truths are inferred and worker quality is updated iteratively. To verify the effectiveness of the proposed method, experiments are conducted on 5 real datasets. Compared with existing methods, the proposed method handles the long-tail phenomenon effectively, especially when the number of tasks completed by each worker is generally small. In that setting, the average accuracy of the proposed method on the single-choice task datasets reaches 93%, which is 16% higher than the best performance of existing methods, and its MAE and RMSE on the numerical task datasets are both lower than those of existing methods.
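The abstract does not say which small-sample interval the authors construct. As a rough illustration of the general idea only, the sketch below assumes a Wilson score interval and uses its lower bound as a worker's voting weight for single-choice tasks. The function names (`wilson_interval`, `infer_truths`), the data layout, and the tie-breaking constant are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed stand-in
    for the paper's small-sample confidence interval)."""
    if total == 0:
        return 0.0, 1.0  # no evidence about this worker yet
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def infer_truths(answers, gold, rounds=5):
    """answers: {task: {worker: label}}; gold: {task: label} for a few tasks.

    Pre-estimates worker quality on the gold tasks, then alternates between
    weighted voting and re-estimating quality until the truths stop changing.
    """
    truths = dict(gold)
    for _ in range(rounds):
        # Count how often each worker agrees with the current truths.
        stats = defaultdict(lambda: [0, 0])
        for task, label in truths.items():
            for worker, ans in answers.get(task, {}).items():
                stats[worker][0] += int(ans == label)
                stats[worker][1] += 1
        # Conservative quality: lower bound of the interval, so workers
        # with very few tasks are not over-trusted.
        quality = {w: wilson_interval(c, n)[0] for w, (c, n) in stats.items()}

        new_truths = {}
        for task, by_worker in answers.items():
            if task in gold:
                new_truths[task] = gold[task]
                continue
            votes = defaultdict(float)
            for worker, ans in by_worker.items():
                # The tiny constant lets unseen workers still cast a vote.
                votes[ans] += quality.get(worker, 0.0) + 1e-6
            new_truths[task] = max(votes, key=votes.get)
        if new_truths == truths:
            break
        truths = new_truths
    return truths
```

With this weighting, a worker who answered 1 of 1 gold tasks correctly receives a lower weight than one who answered 9 of 10 correctly, which is the kind of small-sample caution the abstract motivates.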
Authors
张光园
王宁
ZHANG Guang-yuan; WANG Ning (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2020, No. 10, pp. 26-31 (6 pages)
Computer Science
Funding
National Key Research and Development Program of China (2018YFC0809800).
Keywords
Crowdsourcing
Long-tail phenomenon
Small sample confidence interval
Worker quality estimation
Truth inference