On Bootstrap Algorithms in Survey Sampling
DOI:
https://doi.org/10.15678/KREM.2024.1004.0207
Keywords:
survey sampling, small area estimation, bootstrap, estimation and prediction accuracy
Abstract
Objective: The aim of this paper is to present bootstrap algorithms for measuring the accuracy of estimation and prediction in design-based and model-based approaches in survey sampling and small area estimation. Three proposals of prediction-mean squared error estimators are also examined.
Research Design & Methods: Various bootstrap procedures are shown and used to estimate the design- and prediction-mean squared errors based on real data. Computations are supported by two R packages.
Findings: Three prediction-mean squared error estimators are proposed.
Implications / Recommendations: For the data considered, the bootstrap algorithms used in the design-based approach give similar variance estimates for the estimator studied, so when the algorithms have similar properties, their computational speed may be the deciding factor for practitioners. In the model-based approach, the proposed estimators of the prediction mean squared error produce higher estimates than the other estimators, indicating a positive bias that can be interpreted as a pessimistic assessment of accuracy.
Contribution: All the presented bootstrap algorithms can be applied easily using two R packages, available on CRAN and GitHub. Three double-bootstrap prediction-MSE estimators are proposed and analysed in the real-data application.
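Illustration: The general idea behind bootstrap variance estimation can be sketched as follows. This is a minimal sketch of the classical with-replacement bootstrap for a sample mean, not one of the design-specific or model-based algorithms studied in the paper. The function name bootstrap_variance, the data values and the number of replicates are assumptions made only for this example.

```python
import random
import statistics


def bootstrap_variance(sample, estimator, n_replicates=1000, seed=0):
    """Naive with-replacement bootstrap variance estimate of `estimator`.

    Hypothetical helper for illustration only; it ignores the sampling
    design, unlike the survey-specific algorithms the paper discusses.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicates = []
    for _ in range(n_replicates):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        # Recompute the estimator on the bootstrap sample.
        replicates.append(estimator(resample))
    # The variance across bootstrap replicates estimates the estimator's variance.
    return statistics.variance(replicates)


# Made-up data, used only to exercise the function.
sample = [12.0, 15.5, 9.8, 20.1, 14.3, 11.7, 18.2, 13.4]
var_hat = bootstrap_variance(sample, statistics.mean)
print(var_hat)
```

For the sample mean, the result should be close to the plug-in value statistics.pvariance(sample) / len(sample). Bootstrap procedures for complex sampling designs modify the resampling step so that the replicates reflect the design.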
Published
Versions
- 09-09-2024 (3)
- 27-08-2024 (2)
- 01-07-2024 (1)
License
Copyright (c) 2024 Krakow Review of Economics and Management/Zeszyty Naukowe Uniwersytetu Ekonomicznego w Krakowie
This work is licensed under a Creative Commons Attribution 4.0 International License.