Discussion on Method of Multiple Sampling Frame Under Big Data Background (Cited by: 1)
Abstract: The big data currently available are often not population data: they usually fail to cover the whole population, and their multi-source, heterogeneous nature hinders traditional data analysis methods. This paper introduces sample survey methods into big data and analyzes why multiple sampling frames are needed in this setting. To address the difficulty of multi-source heterogeneity, it treats each data source as one sampling frame and proposes a construction of multiple sampling frames for big data. It then classifies cases according to the characteristics of the data, determines for each case whether a phased sampling design is needed, and proposes using the SF estimator to estimate the population from the multiple sampling frames. According to the authors, this estimator fits the needs of multiple-frame estimation in big data and estimates the population well.
Authors: Gao Zhanqing (高詹清); Liu Yixuan (刘艺璇); He Jianfeng (贺建风) (School of Economics, Renmin University of China, Beijing 100872, China; School of Economics, Shanghai University of Finance and Economics, Shanghai 200433, China; School of Economics and Commerce, South China University of Technology, Guangzhou 510006, China)
Source: Statistics & Decision (《统计与决策》), CSSCI, Peking University Core Journal, 2020, No. 1, pp. 5-9 (5 pages)
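Funding: National Social Science Fund of China (19BTJ022); Guangdong Provincial Science and Technology Program (2019A101002003, 2018A070712009); Guangzhou "Yangcheng Young Scholars" Project (17QNXR02); Key Project of the Fundamental Research Funds for the Central Universities, South China University of Technology (XYZD201906)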
Keywords: big data; multiple sampling frame; multi-source data; SF estimator
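The abstract names an SF estimator but does not define it. As a minimal sketch, assuming "SF" refers to the single-frame estimator from the multiple-frame survey literature, each sampled value can be divided by the unit's expected number of selections across all frames that contain it. The function name `sf_total` and the data layout are hypothetical and are not taken from the paper.

```python
def sf_total(samples, inclusion_prob):
    """Estimate a population total from samples drawn independently from overlapping frames.

    samples:        {frame_name: [(unit_id, y_value), ...]}   units drawn from each frame
    inclusion_prob: {frame_name: {unit_id: probability}}      inclusion probability of each
                    sampled unit in every frame that contains it (assumed known)
    """
    total = 0.0
    for drawn in samples.values():
        for unit_id, y in drawn:
            # Expected number of times this unit is selected, summed over
            # every frame (data source) in which it appears.
            expected_hits = sum(
                probs[unit_id]
                for probs in inclusion_prob.values()
                if unit_id in probs
            )
            total += y / expected_hits
    return total


# Illustrative data only: two sources, with unit 3 present in both.
probs = {"A": {1: 0.5, 2: 0.5, 3: 0.5}, "B": {3: 0.5, 4: 0.5}}
drawn = {"A": [(1, 10.0), (3, 30.0)], "B": [(3, 30.0)]}
estimate = sf_total(drawn, probs)
```

Treating each data source as one frame, the sketch needs to know which sources contain each sampled unit. Identifying that overlap is exactly where multi-source heterogeneity makes the problem hard in practice.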