Abstract
Crowdsourcing draws on the collective intelligence of a large, distributed population of Internet users to complete a wide variety of tasks effectively. However, real-world crowdsourcing platforms commonly contain cheating users who work carelessly and participate only to collect rewards. Their unreliable answers degrade the quality of the collected task data and limit the problems that crowdsourcing can solve. To address this problem, this paper proposes a method for automatically identifying cheating users. It analyzes the answering behavior of users on the Baidu Crowdsourcing Platform (BCP) and summarizes the types of cheating users found there. Based on an analysis of their behavioral characteristics, a logistic regression model is built for crowdsourcing users. The model computes each user's reliability from behavioral feature values, and cheating users are then identified automatically from that reliability. Experimental results show that, compared with the majority voting, gold question set, and SpEM baselines, the proposed method achieves higher identification accuracy, of up to 97%.
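The pipeline described in the abstract (behavioral features, then a logistic regression reliability score, then a threshold decision) can be illustrated with a minimal sketch. The abstract does not list the paper's actual features, training data, or decision threshold, so the two features below (agreement rate with other users and answer diversity), the toy labels, and the 0.5 threshold are all hypothetical and serve only to show the general shape of such a classifier.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def reliability(x, w, b):
    # Reliability is modeled as the predicted probability of being a reliable user.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def is_cheater(x, w, b, threshold=0.5):
    # Hypothetical decision rule: flag users whose reliability is below the threshold.
    return reliability(x, w, b) < threshold

# Hypothetical features scaled to [0, 1]:
# (agreement rate with other users, answer diversity). Not taken from the paper.
X = [[0.9, 0.8], [0.85, 0.7], [0.8, 0.9], [0.2, 0.1], [0.3, 0.05], [0.25, 0.2]]
y = [1, 1, 1, 0, 0, 0]  # 1 = reliable user, 0 = cheating user (toy labels)

w, b = train_logistic(X, y)
```

With the fitted `w` and `b`, calling `is_cheater([0.2, 0.1], w, b)` flags a user whose toy feature values resemble the cheating examples, while `reliability(x, w, b)` returns the underlying score in (0, 1) for any feature vector `x`.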
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2016, No. 8, pp. 139-145, 152 (8 pages)
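Funding
National Natural Science Foundation of China (61402399)
Zhanjiang Science and Technology Research Program (2015B01050)
Natural Science Foundation of Lingnan Normal University (QL1410, YL1505)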
Keywords
crowdsourcing
cheating user
behavior characteristics
logistic regression model
reliability
accuracy