Abstract
Over the past decades, advances in computer science and internet technology have enabled researchers to collect, store, and analyze data at unparalleled speed, and we have entered the era of big data. Big data has distinctive characteristics (volume, variety, velocity, and veracity) that bring both opportunities and challenges to statistics and statisticians. In this article, we review the history of statistics and analyze the characteristics of its development, and on that basis discuss the positioning of statistics in the big data era. We further analyze the relationship between statistics and computer science, and finally clarify several misconceptions in big data research.
Source
《统计研究》
CSSCI
Peking University Core Journals (北大核心)
2017, No. 1, pp. 5-11 (7 pages)
Statistical Research
Funding
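Supported by the National Social Science Fund of China project "Research on High-Dimensional Variable Selection Methods for Big Data and Their Applications" (Grant No. 13CTJ001)
and the National Natural Science Foundation of China General Program project "Group Variable Selection in Generalized Linear Models and Its Application to Credit Scoring" (Grant No. 71471152).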
Keywords
Big Data
Computer Science
Causality
Sampling
Data Quality