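Abstract
In statistical prediction, selecting predictors that have a clear physical meaning and high forecast accuracy is close to decisive for success. In practice, a common situation arises: certain predictors show a very high single-correlation rate with the predictand, yet forecast equations built from them often fail. Later analysis shows that these predictors are not related to the predictand for any physical reason; the predictor series merely happens to correlate strongly with the predictand series because the sample is finite (in operational forecasting the sample size is often very small). This paper calls such predictors sham predictors and such correlation accidental single correlation. For convenience, the author converts the predictor and the predictand each into a (0,1) series and proposes an optimal (0,1) conversion algorithm. With predictor X and predictand Y each divided into the two grades (0,1), and assuming that they are independent, we have P_i(x,y)=P_i(x)P_i(y); that is, the positive-correlation rate of X and Y in grade i equals the product of their individual probabilities of falling in grade i. A statistic is introduced, in which ν_i is the frequency with which (X,Y) actually falls in grade i and n is the total sample size. A significance test of the predictors based on this statistic effectively identifies and filters out false predictors. Finally, the theory is applied to forecasting typhoon landfall and recurvature. At a significance level of 0.001, eight predictors pass the test. The five with the largest η values are selected and combined into a prediction model using a simple coded-correlation method; its forecast accuracy is 94.1%.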
In statistical prediction, selecting predictors that are physically meaningful and forecast accurately is almost the key to success. In practice one often encounters cases in which the correlation between certain predictors and the predictand is very high, yet forecasts made with equations built from these predictors frequently fail. Subsequent analysis shows that such correlation has no physical cause; it arises because the sample is limited (the sample size is usually very small in practice), so that the predictor series happens by coincidence to correlate well with the predictand series. Such a predictor is referred to as a sham predictor, and such correlation as accidental correlation.
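The small-sample effect described above can be made concrete with a short calculation. The sketch below is only an illustration, not the paper's method: it assumes a pure-noise 0/1 predictor that agrees with a 0/1 predictand in each case with probability 0.5, and it treats the candidate predictors as independent. The function names and the numbers `n`, `k`, and `m` are hypothetical.

```python
from math import comb


def chance_agreement_tail(n, k, p=0.5):
    """P(a pure-noise 0/1 predictor agrees with the predictand in >= k of n cases).

    Assumption: each case agrees independently with probability p.
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def at_least_one_of(m, p_single):
    """P(at least one of m independent noise predictors reaches that agreement)."""
    return 1.0 - (1.0 - p_single) ** m


if __name__ == "__main__":
    # Hypothetical sample size, agreement count, and size of the candidate pool.
    n, k, m = 20, 16, 100
    p_single = chance_agreement_tail(n, k)
    print(f"one noise predictor:     {p_single:.4f}")
    print(f"any of {m} noise predictors: {at_least_one_of(m, p_single):.4f}")
```

Under these assumptions a single noise predictor rarely looks good, but the chance that some predictor in a large candidate pool does so is far higher, which is why a strict significance level is a natural safeguard.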
For convenience, the author transforms the predictor and the predictand into (0, 1) series using an optimal algorithm proposed in the paper. Assuming that predictor X and predictand Y are independent, we have
P_i(x, y) = P_i(x) P_i(y)
That is, the positive correlation ratio between X and Y within grade i equals the product of their individual probabilities of falling within that grade. A statistic is introduced,
where ν_i is the number of times (x, y) occurs within grade i and n is the total sample size. A significance test based on this statistic distinguishes effective predictors from false ones.
Finally, the theory is applied to predicting typhoon landfall or recurvature. At a significance level of 0.001, eight predictors are retained. The five best are selected to construct a predictive model by a simple coding method, which gives a very good result, with an accuracy of 94.1%.
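The screening idea can be sketched in code, with two substitutions stated plainly. The abstract does not give the formula of the paper's statistic or the details of its optimal (0,1) algorithm, so the sketch below swaps in the standard Pearson chi-square test of independence on a 2x2 table and a fixed-threshold binarization. All function names are hypothetical; the critical value 10.828 is the chi-square quantile for one degree of freedom at the 0.001 level.

```python
from collections import Counter

# Chi-square critical value for 1 degree of freedom at significance level 0.001.
CHI2_CRIT_DF1_0001 = 10.828


def binarize(series, threshold):
    """Map a numeric series to 0/1 by a fixed threshold.

    Stand-in only: the paper's optimal (0,1) algorithm is not described
    in the abstract.
    """
    return [1 if v > threshold else 0 for v in series]


def chi_square_2x2(x, y):
    """Pearson chi-square statistic for two 0/1 series of equal length.

    Expected counts follow the independence assumption:
    n * P(x = i) * P(y = j) for each cell (i, j).
    """
    n = len(x)
    observed = Counter(zip(x, y))
    stat = 0.0
    for i in (0, 1):
        p_x = sum(1 for v in x if v == i) / n
        for j in (0, 1):
            p_y = sum(1 for v in y if v == j) / n
            expected = n * p_x * p_y
            if expected > 0:
                stat += (observed[(i, j)] - expected) ** 2 / expected
    return stat


def passes_screen(x, y, critical=CHI2_CRIT_DF1_0001):
    """Keep a predictor only if independence is rejected at the chosen level."""
    return chi_square_2x2(x, y) > critical
```

For example, with a predictand of ten 0s followed by ten 1s, a predictor identical to it is retained, while a predictor that alternates 0 and 1 is rejected because its observed counts match the independence expectation exactly.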
Source
《海洋预报》
Peking University Core Journal (北大核心)
1993, No. 1, pp. 1-9 (9 pages)
Marine Forecasts
Keywords
Statistical prediction
False predictor
Screening
Weather forecasting
Statistical prediction, False predictor screening, Accidental single correlation.