Abstract
For categorical data, the contingency table is undoubtedly one of the best statistical tools, but contingency-table analysis also gives rise to Simpson's paradox, and how to explain that paradox is an important applied problem in social statistical analysis. In theory, the paradox can be eliminated by changing the experimental structure, but most data in social research are observational and cannot be controlled through experiment. Simpson's paradox is therefore less a "paradox" than a reflection of the nonlinear character of categorical data: it is the result of compressing what is "incompressible", and it reflects the difference in statistical information that arises when a contingency table is compressed from a higher dimension to a lower one. In essence, it is a dimension-reduction problem in Euclidean space.
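The compression the abstract describes can be illustrated with a minimal Python sketch. It collapses a 2x2x2 table (stratum x treatment x outcome) into its 2x2 marginal table and compares the within-stratum rates with the collapsed rates. The counts are not taken from the article; they are the widely quoted kidney-stone treatment figures, used here only as an assumed illustration, and the names `counts`, `stratum_rates` and `collapsed_rates` are hypothetical.

```python
# Illustrative 2x2x2 contingency table (NOT data from the article):
# counts[stratum][treatment] = (successes, trials).
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Proportion of successes in one cell."""
    return successes / trials


def stratum_rates(table):
    """Success rate of each treatment within each stratum (the full three-way table)."""
    return {s: {t: rate(*cell) for t, cell in row.items()} for s, row in table.items()}


def collapsed_rates(table):
    """Success rate of each treatment after summing over strata (the two-way marginal table)."""
    totals = {}
    for row in table.values():
        for t, (succ, n) in row.items():
            s0, n0 = totals.get(t, (0, 0))
            totals[t] = (s0 + succ, n0 + n)
    return {t: rate(*cell) for t, cell in totals.items()}


within = stratum_rates(counts)
marginal = collapsed_rates(counts)

# Treatment A has the higher rate in every stratum...
a_wins_every_stratum = all(r["A"] > r["B"] for r in within.values())
# ...yet treatment B has the higher rate once the stratum dimension is summed out.
b_wins_after_collapse = marginal["B"] > marginal["A"]

print(within)
print(marginal)
print("reversal:", a_wins_every_stratum and b_wins_after_collapse)
```

The reversal appears because the two treatments are spread very unevenly across the strata, so the marginal table weights the stratum-level rates differently for each treatment. That weighting is the information lost when the higher-dimensional table is compressed.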
Source
《统计与信息论坛》
CSSCI
2011, No. 2, pp. 9-12 (4 pages)
Journal of Statistics and Information
Funding
Ministry of Education Humanities and Social Sciences Research Youth Fund project "Research on Rumors and Hearsay in Mass Incidents" (10YJC840014)
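China Postdoctoral Science Foundation project "Disease and the Transformation of Work-Unit (Danwei) Society: A Study in Medical Sociology" (20100470620)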