On the Evaluation of Sample Size Required for a Good Approximation by the Normal Curve for Some Statistics
DOI:
https://doi.org/10.15678/ZNUEK.2017.0965.0502
Keywords:
sample size, central limit theorem, sampling scheme, computer simulation, chi-square test of goodness of fit
Abstract
Testing hypotheses and evaluating confidence intervals require knowledge of the distributions of the statistics involved. It is convenient when the probability distribution of a statistic converges to the normal distribution as the sample size grows. This paper examines how to determine the sample size at which a statistic’s distribution departs from the normal distribution by no more than an assumed amount. Two procedures for evaluating the necessary sample size are proposed. The first is based on the Berry-Esseen inequality, while the second is based on a simulation procedure. In the simulation procedure, the distribution of the sample mean is generated by replicating samples of a fixed size. Next, the normality of the distribution of the generated sample means is tested. The size of the generated samples is gradually increased until the hypothesis of normality of the sample mean distribution is no longer rejected. The procedure is also applied to statistics other than the sample mean.
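The two procedures outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. Everything not stated in the abstract is an assumption made for illustration: the function names, the value 0.4748 used as an upper bound for the Berry-Esseen constant, the use of equiprobable cells in the chi-square goodness-of-fit test, the Wilson-Hilferty approximation for the chi-square critical value, and all default parameters (starting size, step, number of replications, significance level).

```python
import bisect
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()


def berry_esseen_sample_size(sigma, rho, eps, c=0.4748):
    """Smallest n with c * rho / (sigma**3 * sqrt(n)) <= eps.

    sigma is the standard deviation and rho = E|X - mu|^3.
    c is an assumed bound on the Berry-Esseen constant (not from the paper).
    """
    return max(1, math.ceil((c * rho / (sigma ** 3 * eps)) ** 2))


def chi_square_rejects_normality(values, cells=10, alpha=0.05):
    """Chi-square goodness-of-fit test of normality with equiprobable cells."""
    m, s = mean(values), stdev(values)
    # Interior cell boundaries: standard normal quantiles at j / cells.
    edges = [STD_NORMAL.inv_cdf(j / cells) for j in range(1, cells)]
    observed = [0] * cells
    for v in values:
        observed[bisect.bisect(edges, (v - m) / s)] += 1
    expected = len(values) / cells
    stat = sum((o - expected) ** 2 / expected for o in observed)
    # Mean and standard deviation are estimated, so df = cells - 1 - 2.
    df = cells - 3
    # Assumption: Wilson-Hilferty approximation of the critical value,
    # used only to keep the sketch free of third-party dependencies.
    z = STD_NORMAL.inv_cdf(1 - alpha)
    critical = df * (1 - 2 / (9 * df) + z * math.sqrt(2 / (9 * df))) ** 3
    return stat > critical


def simulated_sample_size(draw, statistic=mean, n_start=5, n_step=5,
                          n_max=200, replications=1000, cells=10,
                          alpha=0.05, seed=0):
    """Increase n until normality of the simulated statistic is not rejected.

    draw(rng) returns one observation; returns None if n_max is exhausted.
    """
    rng = random.Random(seed)
    for n in range(n_start, n_max + 1, n_step):
        values = [statistic([draw(rng) for _ in range(n)])
                  for _ in range(replications)]
        if not chi_square_rejects_normality(values, cells, alpha):
            return n
    return None


if __name__ == "__main__":
    # Hypothetical example: exponential population with rate 1,
    # for which sigma = 1 and E|X - 1|^3 = 12/e - 2.
    print(berry_esseen_sample_size(1.0, 12 / math.e - 2, eps=0.05))
    print(simulated_sample_size(lambda rng: rng.expovariate(1.0)))
```

The Berry-Esseen bound holds for every distribution with a finite third absolute moment, so the sample size it yields is usually conservative. The simulation-based answer, by contrast, depends on the number of replications, the number of cells, and the significance level chosen for the test.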