
Theme-Aware Task Assignment in Crowd Computing on Big Data
Original title: 大数据群体计算中用户主题感知的任务分配 (cited by: 11)
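Abstract (translated from Chinese) The inherent characteristics of big data, namely large and complex volume, rapid growth, variety of form, and low value density, pose severe challenges to traditional computing methods. On the one hand, the volume and rapid growth of big data create demand for massive computation and analysis; on the other hand, the variety of form and low value density make big data computing tasks depend heavily on complex cognitive reasoning. Faced with the coexisting demands for massive computation and complex cognitive reasoning, traditional computer-based algorithms can no longer meet increasingly stringent data processing requirements, and crowd computing based on human-machine collaboration is an effective solution. In big data crowd computing, the most fundamental issue is how tasks are assigned. Because the large population of Internet users differs in professional background and trustworthiness, tasks cannot simply be handed to the crowd at random. To address this problem, an iterative task assignment algorithm based on user theme awareness is proposed. Test questions with known answers are used to iteratively probe the professional backgrounds of different groups of users and their accuracy in completing tasks. Once the users' true themes and accuracy are well understood, suitable questions are assigned to them. A comparison with a random task assignment algorithm on simulated and real data demonstrates the accuracy of the theme-aware task assignment algorithm.

Abstract (English original) Big data has brought tremendous challenges for the traditional computing model because of its inherent characteristics: large volume, high velocity, high variety, and low value density. On the one hand, the large volume and high velocity require techniques for massive data computation and analysis; on the other hand, the high variety and low value density make big data computing tasks depend heavily on complex cognitive reasoning techniques. To overcome the coexisting challenges of massive data analysis and complex cognitive reasoning, crowd computing based on human-machine collaboration is an effective way to solve the big data problem. In crowd computing, task assignment is one of the basic problems. However, current crowdsourcing platforms cannot support active task assignment, which iteratively assigns tasks to appropriate workers based on the knowledge background of users. To address this problem, we propose an iterative theme-aware task assignment framework and deploy it on existing crowdsourcing platforms. The framework includes two components. The first component is task modeling, which models the tasks as a graph where vertices are tasks and edges are task relationships. The second component is the iterative task assignment algorithm, which identifies the themes of the workers from their historical records, computes the workers' accuracy on different themes, and assigns the tasks to the appropriate workers. Various experiments validate the effectiveness of our method.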
Source Journal of Computer Research and Development (《计算机研究与发展》); indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 2, pp. 309-317 (9 pages)
Funding National Natural Science Foundation of China (61373024, 61472198); National Basic Research Program of China (973 Program) (2015CB358700)
Keywords crowd computing; human computation; big data; crowdsourcing; human-computer interaction
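
The abstract describes estimating each worker's accuracy per theme from test questions with known answers and then routing tasks to suitable workers. The following is a minimal sketch of that idea only, not the paper's algorithm: it runs a single estimate-then-assign pass, omits the task graph, and assumes every task already carries a theme label. The function names, the smoothing prior, the per-worker capacity, and the greedy rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def estimate_accuracy(test_log, prior_correct=1, prior_total=2):
    """Estimate per-theme accuracy from answers to known-answer test questions.

    test_log: iterable of (worker, theme, given_answer, true_answer).
    Returns {worker: {theme: smoothed accuracy}}. The additive prior
    (an assumption of this sketch) keeps estimates away from 0 and 1
    when a worker has answered only a few test questions.
    """
    correct = defaultdict(lambda: defaultdict(int))
    total = defaultdict(lambda: defaultdict(int))
    for worker, theme, given, truth in test_log:
        total[worker][theme] += 1
        correct[worker][theme] += int(given == truth)
    return {
        w: {
            t: (correct[w][t] + prior_correct) / (total[w][t] + prior_total)
            for t in total[w]
        }
        for w in total
    }


def assign_tasks(tasks, accuracy, capacity, default=0.5):
    """Greedily give each task to the best available worker for its theme.

    tasks: list of (task_id, theme). capacity: max tasks per worker.
    Workers with no estimate for a theme are scored with `default`.
    """
    load = defaultdict(int)
    assignment = {}
    for task_id, theme in tasks:
        candidates = [w for w in accuracy if load[w] < capacity]
        if not candidates:
            break  # every worker is at capacity; remaining tasks stay unassigned
        best = max(candidates, key=lambda w: accuracy[w].get(theme, default))
        assignment[task_id] = best
        load[best] += 1
    return assignment


# Hypothetical test-question log: (worker, theme, given answer, true answer).
test_log = [
    ("alice", "sports", "A", "A"),
    ("alice", "sports", "B", "B"),
    ("alice", "music", "A", "B"),
    ("bob", "sports", "A", "B"),
    ("bob", "music", "C", "C"),
    ("bob", "music", "D", "D"),
]
accuracy = estimate_accuracy(test_log)
tasks = [("t1", "sports"), ("t2", "music"), ("t3", "sports")]
assignment = assign_tasks(tasks, accuracy, capacity=2)
```

An iterative variant could append each new round of test answers to `test_log` and call `estimate_accuracy` again before the next batch of assignments, so that the estimates sharpen as more evidence arrives.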