Abstract
The log-transformation is widely used in biomedical and psychosocial research to deal with skewed data. This paper highlights serious problems in this classic approach for dealing with skewed data. Despite the common belief that the log transformation can decrease the variability of data and make data conform more closely to the normal distribution, this is usually not the case. Moreover, the results of standard statistical tests performed on log-transformed data are often not relevant for the original, non-transformed data. We demonstrate these problems by presenting examples that use simulated data. We conclude that if used at all, data transformations must be applied very cautiously. We recommend that in most circumstances researchers abandon these traditional methods of dealing with skewed data and, instead, use newer analytic methods that are not dependent on the distribution of the data, such as generalized estimating equations (GEE).
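The abstract's claim that a log transformation does not necessarily reduce variability or skewness can be illustrated with a small simulation. The sketch below is not the paper's own simulation; the two distributions, the seed, the sample size, and the helper name `summarize` are assumptions chosen only for illustration, and only the Python standard library is used.

```python
import math
import random
import statistics


def summarize(xs):
    """Return (standard deviation, moment-based skewness) of a sample."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    skew = sum((x - mean) ** 3 for x in xs) / (len(xs) * sd ** 3)
    return sd, skew


rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
n = 10_000              # arbitrary sample size (an assumption)

# Case 1: log-normal data, the textbook case where the log works well.
lognormal = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]

# Case 2: roughly symmetric data on (0, 1] with some values close to zero.
# 1 - random() lies in (0, 1], so log() is always defined.
near_zero = [1.0 - rng.random() for _ in range(n)]

for name, xs in [("log-normal", lognormal), ("uniform (0, 1]", near_zero)]:
    raw_sd, raw_skew = summarize(xs)
    log_sd, log_skew = summarize([math.log(x) for x in xs])
    print(f"{name:>15}: raw sd={raw_sd:.2f} skew={raw_skew:+.2f} | "
          f"log sd={log_sd:.2f} skew={log_skew:+.2f}")
```

For the log-normal sample the transformed values are close to symmetric, as expected. For the uniform sample the opposite happens: the logarithm of a uniform variable on (0, 1) is the negative of a standard exponential variable, so in theory its standard deviation rises from about 0.29 to 1 and its skewness moves from 0 to -2. Whether the transformation helps therefore depends on the distribution of the data, which is the point the abstract makes.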
Source
Shanghai Archives of Psychiatry (《上海精神医学》)
2014, Issue 2, pp. 105-109 (5 pages)
Funding
Supported in part by the Novel Bio-statistical and Epidemiologic Methodology grants from the University of Rochester Medical Center Clinical and Translational Science Institute Pilot Awards Program.
Keywords
hypothesis testing
outliers
log-normal distribution
normal distribution
skewness