This study aims to establish a rationale for the Rice University rule in determining the number of bins in a histogram. It is grounded in the Scott and Freedman-Diaconis rules. Additionally, the accuracy of the empiri...This study aims to establish a rationale for the Rice University rule in determining the number of bins in a histogram. It is grounded in the Scott and Freedman-Diaconis rules. Additionally, the accuracy of the empirical histogram in reproducing the shape of the distribution is assessed with respect to three factors: the rule for determining the number of bins (square root, Sturges, Doane, Scott, Freedman-Diaconis, and Rice University), sample size, and distribution type. Three measures are utilized: the average distance between empirical and theoretical histograms, the level of recognition by an expert judge, and the accuracy index, which is composed of the two aforementioned measures. Mean comparisons are conducted with aligned rank transformation analysis of variance for three fixed-effects factors: sample size (20, 35, 50, 100, 200, 500, and 1000), distribution type (10 types), and empirical rule to determine the number of bins (6 rules). From the accuracy index, Rice’s rule improves with increasing sample size and is independent of distribution type. It outperforms the Friedman-Diaconis rule but falls short of Scott’s rule, except with the arcsine distribution. Its profile of means resembles the square root rule concerning distributions and Doane’s rule concerning sample sizes. These profiles differ from those of the Scott and Friedman-Diaconis rules, which resemble each other. Among the seven rules, Scott’s rule stands out in terms of accuracy, except for the arcsine distribution, and the square root rule is the least accurate.展开更多
文摘This study aims to establish a rationale for the Rice University rule in determining the number of bins in a histogram. It is grounded in the Scott and Freedman-Diaconis rules. Additionally, the accuracy of the empirical histogram in reproducing the shape of the distribution is assessed with respect to three factors: the rule for determining the number of bins (square root, Sturges, Doane, Scott, Freedman-Diaconis, and Rice University), sample size, and distribution type. Three measures are utilized: the average distance between empirical and theoretical histograms, the level of recognition by an expert judge, and the accuracy index, which is composed of the two aforementioned measures. Mean comparisons are conducted with aligned rank transformation analysis of variance for three fixed-effects factors: sample size (20, 35, 50, 100, 200, 500, and 1000), distribution type (10 types), and empirical rule to determine the number of bins (6 rules). From the accuracy index, Rice’s rule improves with increasing sample size and is independent of distribution type. It outperforms the Friedman-Diaconis rule but falls short of Scott’s rule, except with the arcsine distribution. Its profile of means resembles the square root rule concerning distributions and Doane’s rule concerning sample sizes. These profiles differ from those of the Scott and Friedman-Diaconis rules, which resemble each other. Among the seven rules, Scott’s rule stands out in terms of accuracy, except for the arcsine distribution, and the square root rule is the least accurate.