Abstract
Research Objectives: To trace the historical evolution and meaning of the P-value, analyze its limitations and the common misunderstandings surrounding it, examine the manifestations and causes of P-hacking, and propose corresponding improvement strategies. Research Methods: A systematic analysis and synthesis of the existing literature on the P-value is carried out, from which a number of viewpoints and conclusions are drawn; the effects of the improvement strategies are then tested empirically using data from the Chinese General Social Survey (CGSS2015). Research Findings: The combination of Fisher's significance testing and Neyman-Pearson hypothesis testing formed the null hypothesis significance testing (NHST) framework that is widely used today. Limitations inherent in the P-value itself have led to its abuse and misuse, and to P-hacking. Constructing confidence intervals, examining statistical power, estimating effect sizes, calculating the false discovery rate, computing Bayes factors, and conducting replication experiments can serve as useful supplements or alternatives to the P-value. The worked example shows that both P-value-based significance conclusions and statistical power are affected by sample size, whereas effect size is not, and that hypothesis-testing judgments based on the Bayes factor are more reliable. Research Innovations: In view of the practical difficulties surrounding the P-value, the paper clarifies five common misunderstandings of the P-value and the manifestations and causes of P-hacking, proposes corresponding improvement strategies, and tests their effects empirically with an example. Research Value: Judgments based on the P-value should follow the six basic principles proposed by the American Statistical Association and be combined with supplementary and alternative indicators, so as to build a new mode of hypothesis testing, guard against P-hacking, and strengthen the reliability of conclusions.
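The finding that P-value significance and statistical power depend on sample size, while effect size does not, can be illustrated with a small sketch. This is not the paper's CGSS2015 analysis. It assumes a hypothetical two-sample z-test with unit variance, equal group sizes, and an observed standardized mean difference (Cohen's d) held fixed at 0.2; the function names are illustrative only.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def two_sample_z_summary(d: float, n_per_group: int, z_crit: float = 1.959964):
    """Return (z, two-sided p-value, power) for a two-sample z-test.

    Assumptions (illustrative, not taken from the paper):
    - both groups have known unit variance and n_per_group observations;
    - the observed standardized mean difference equals d;
    - power is evaluated at a true effect of d, with z_crit being the
      two-sided 5% critical value.
    """
    z = d * math.sqrt(n_per_group / 2.0)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # 2 * (1 - Phi(|z|))
    power = normal_cdf(z - z_crit) + normal_cdf(-z - z_crit)
    return z, p_value, power


if __name__ == "__main__":
    effect_size = 0.2  # held constant while only the sample size changes
    for n in (20, 100, 500, 2000):
        z, p, power = two_sample_z_summary(effect_size, n)
        print(f"n={n:5d}  d={effect_size:.2f}  z={z:6.3f}  p={p:.4g}  power={power:.3f}")
```

Under these assumptions the same effect size moves from non-significant to significant purely because n grows, which is why reporting the effect size alongside the P-value is informative.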
Authors
Cheng Kaiming (程开明)
Li Sie (李泗娥)
School of Statistics and Mathematics, Zhejiang Gongshang University
Source
Journal of Quantitative & Technological Economics (《数量经济技术经济研究》)
CSSCI
CSCD
PKU Core (北大核心)
2019, Issue 7, pp. 117-136 (20 pages)
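Funding
Ministry of Education Humanities and Social Sciences Planning Fund Project (18YJA630016)
Key Planning Project of the 2017-2018 Zhejiang Provincial Universities Major Humanities and Social Sciences Research Program (2018GH010)
Zhejiang Natural Science Foundation Project (LY18G030009)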
Keywords
P-value
Hypothesis Testing
Limitation
P-hacking
Improvement Strategy