The Gap statistic is a well-known index of clustering validity, but its implementation is hard to understand and its result is hard to determine accurately. A direct method is presented to improve the performance of the Gap statistic: it uses the second-order difference of the within-cluster dispersion in place of the constructed null reference distribution. The Gap statistic is thereby reformulated, becomes easy to implement, and is less uncertain in application. The limitations of the Gap statistic are also analyzed through two typical examples, which show that it is difficult to apply to datasets containing strongly overlapping or uneven-density clusters. Experiments verify the usefulness of the proposed method.
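The idea of a second-order difference of within-cluster dispersion can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption: the dispersions W_k are taken as already computed for k = 1, 2, ..., the difference is taken on log W_k (as in the usual Gap statistic), and the function names `second_order_difference` and `pick_k` are hypothetical. It is not the authors' implementation.

```python
import math


def second_order_difference(dispersion):
    """Return {k: D(k)} for interior k, where dispersion[i] holds W_k for k = i + 1.

    Assumed form (not taken from the paper):
        D(k) = log W_{k-1} - 2 * log W_k + log W_{k+1}
    A large positive D(k) means the dispersion drops sharply up to k
    and flattens afterwards.
    """
    if len(dispersion) < 3:
        raise ValueError("need W_k for at least three consecutive values of k")
    log_w = [math.log(w) for w in dispersion]
    return {
        k: log_w[k - 2] - 2.0 * log_w[k - 1] + log_w[k]
        for k in range(2, len(log_w))
    }


def pick_k(dispersion):
    """Pick the k with the largest second-order difference (assumed selection rule)."""
    diffs = second_order_difference(dispersion)
    return max(diffs, key=diffs.get)


# Illustrative dispersions for k = 1..5 (made-up numbers, not from the paper).
example_w = [100.0, 40.0, 10.0, 9.0, 8.5]
best_k = pick_k(example_w)
```

Because no reference datasets are sampled, a rule of this kind is deterministic for a given clustering, which is consistent with the reduced uncertainty the abstract describes.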
Various random models with balanced data that are relevant for analyzing practical test data are described, along with several hypothesis testing and interval estimation problems concerning variance components. This paper considers these problems mainly in the general random effects model with balanced data. Exact tests and confidence intervals for a single variance component corresponding to a random effect are developed using generalized p-values and generalized confidence intervals. The resulting procedures are easy to compute and are applicable to small samples. Exact tests and confidence intervals are also established for comparing the random-effects variance components, and the sums of random-effects variance components, in two independent general random effects models with balanced data. The statistical properties of the resulting tests are then investigated. Finally, simulation results on the Type I error probability and power of the proposed tests are reported. They indicate that the exact test controls the Type I error probability very satisfactorily.
Funding: National Natural Science Foundation of China (Nos. 60572065, 60772080, 60532020)