Automatic Identification of Cheating Users on Crowdsourcing Platform (众包平台作弊用户自动识别)
Cited by: 8
Abstract: Crowdsourcing harnesses the collective intelligence of a distributed online population to solve a wide variety of tasks effectively. However, crowdsourcing platforms commonly attract cheating users who submit unreliable answers merely to collect rewards; they degrade the quality of the collected task data and limit the problem-solving capability of crowdsourcing. To address this problem, this paper proposes a method for automatically identifying cheating users. By analyzing the answering behavior of users on the Baidu Crowdsourcing Platform (BCP), it summarizes the types of cheating users present on the platform. Based on the analysis of cheating users' behavioral characteristics, a logistic regression model of crowdsourcing users is built, and each user's reliability is computed from the values of these behavioral features; cheating users are then identified automatically according to this reliability. Experimental results show that, compared with the existing majority voting, gold question set, and SpEM methods, the proposed method achieves higher identification precision, reaching 97%.
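At its core, the method fits a logistic regression model on per-user behavioral features and interprets the resulting probability as a reliability score, flagging low-reliability users as cheaters. The following is a minimal sketch of that idea only; the feature set, training data, and decision threshold below are illustrative assumptions, not the paper's actual features or implementation.

```python
# Minimal sketch (not the paper's implementation): fit a logistic regression
# model on per-user behavior features, read the predicted probability as a
# reliability score, and flag users below a threshold as cheaters.
# The three features and the 0.5 threshold are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical behavior features per user:
# [mean answer time (s), agreement rate with other workers, fraction of identical answers]
X_train = np.array([
    [42.0, 0.91, 0.05],   # diligent user
    [3.0,  0.35, 0.80],   # fast, repetitive answers -> likely cheater
    [55.0, 0.88, 0.10],
    [2.5,  0.40, 0.75],
])
y_train = np.array([1, 0, 1, 0])  # 1 = reliable, 0 = cheating

model = LogisticRegression()
model.fit(X_train, y_train)

def reliability(features):
    """Return P(reliable | behavior features) under the fitted model."""
    return model.predict_proba(np.asarray(features).reshape(1, -1))[0, 1]

def is_cheater(features, threshold=0.5):
    """Flag a user whose estimated reliability falls below the threshold."""
    return reliability(features) < threshold

print(is_cheater([4.0, 0.30, 0.70]))   # True: behavior resembles the cheating cluster
print(is_cheater([48.0, 0.90, 0.08]))  # False: behavior resembles diligent users
```

In the paper's setting, the inputs would instead be the behavioral statistics extracted from users' answering records on BCP.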
Source: Computer Engineering (《计算机工程》), CAS, CSCD, Peking University Core Journal, 2016, No. 8, pp. 139-145, 152 (8 pages).
Funding: National Natural Science Foundation of China (61402399); Zhanjiang Science and Technology Research Program (2015B01050); Natural Science Foundation of Lingnan Normal University (QL1410, YL1505).
Keywords: crowdsourcing; cheating user; behavior characteristics; logistic regression model; reliability; accuracy