On the Evaluation of Sample Size Required for a Good Approximation by the Normal Curve for Some Statistics

Authors

  • Janusz L. Wywiał, University of Economics in Katowice, Department of Statistics, Econometrics and Mathematics

DOI:

https://doi.org/10.15678/ZNUEK.2017.0965.0502

Keywords:

sample size, central limit theorem, sampling scheme, computer simulation, chi-square test of goodness of fit

Abstract

Testing hypotheses or evaluating confidence intervals requires knowledge of the distributions of the statistics involved. It is convenient when the probability distribution of a statistic converges to the normal distribution as the sample size grows. This paper examines how to determine the sample size at which a statistic’s distribution departs from the normal distribution by no more than an assumed amount. Two procedures for evaluating the necessary sample size are proposed. The first is based on the Berry-Esseen inequality, while the second is based on a simulation procedure. In the simulation procedure, the distribution of the sample mean is generated by replicating samples of a fixed size, and the normality of the resulting sample means is then tested. The size of the generated samples is gradually increased until the hypothesis of normality of the sample mean distribution is no longer rejected. The procedure is also applied to statistics other than the sample mean.
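The two procedures outlined in the abstract can be illustrated with a minimal sketch. It is not the author's implementation, and several details are assumptions made only for illustration: the observations are independent and identically distributed with known mean mu, standard deviation sigma and third absolute central moment rho; the Berry-Esseen constant defaults to 0.4748, a published upper bound; normality is checked with a Pearson chi-square test on k equiprobable cells; the critical value comes from the Wilson-Hilferty approximation; and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)

def berry_esseen_sample_size(rho, sigma, eps, c=0.4748):
    """Smallest n with c * rho / (sigma**3 * sqrt(n)) <= eps.

    c is an assumed upper bound on the Berry-Esseen constant.
    """
    return math.ceil((c * rho / (sigma ** 3 * eps)) ** 2)

def chi_square_stat(z_values, k=10):
    """Pearson statistic for k equiprobable cells under N(0, 1)."""
    counts = [0] * k
    for z in z_values:
        # The normal CDF maps z to (0, 1); scaling by k gives the cell index.
        counts[min(int(STD_NORMAL.cdf(z) * k), k - 1)] += 1
    expected = len(z_values) / k
    return sum((obs - expected) ** 2 / expected for obs in counts)

def chi_square_critical(df, alpha=0.05):
    """Approximate upper-alpha chi-square quantile (Wilson-Hilferty)."""
    z = STD_NORMAL.inv_cdf(1 - alpha)
    a = 2 / (9 * df)
    return df * (1 - a + z * math.sqrt(a)) ** 3

def required_sample_size(draw, mu, sigma, sizes, reps=2000, k=10,
                         alpha=0.05, seed=0):
    """First n in sizes for which normality of the mean is not rejected.

    draw(rng) returns one observation; mu and sigma are assumed known,
    so the test has k - 1 degrees of freedom. Returns None if every
    candidate size is rejected.
    """
    rng = random.Random(seed)
    crit = chi_square_critical(k - 1, alpha)
    for n in sizes:
        se = sigma / math.sqrt(n)
        # Replicate samples of size n and standardize each sample mean.
        z = [(sum(draw(rng) for _ in range(n)) / n - mu) / se
             for _ in range(reps)]
        if chi_square_stat(z, k) <= crit:
            return n
    return None

# Berry-Esseen bound for a Bernoulli(0.5) variable: rho / sigma**3 = 1.
n_bound = berry_esseen_sample_size(rho=0.125, sigma=0.5, eps=0.05)

# Simulation for an exponential population with mean 1 and variance 1.
n_sim = required_sample_size(lambda rng: rng.expovariate(1.0),
                             mu=1.0, sigma=1.0, sizes=[5, 10, 20, 40, 80])
```

The sample size returned by the simulation depends on the number of replications, the number of cells and the significance level, so non-rejection should be read as "no detectable departure at this resolution" rather than as a guarantee of normality. The Berry-Esseen bound, by contrast, is a worst-case guarantee and typically calls for a larger sample.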

Published

30-11-2017

Section

Articles

How to Cite

Wywiał, J. L. (2017). On the Evaluation of Sample Size Required for a Good Approximation by the Normal Curve for Some Statistics. Krakow Review of Economics and Management / Zeszyty Naukowe Uniwersytetu Ekonomicznego w Krakowie, 5(965), 17-29. https://doi.org/10.15678/ZNUEK.2017.0965.0502