Multiple Imputation by Fully Conditional Specification for Dealing with Missing Data in a Large Epidemiologic Study
DOI:
https://doi.org/10.6000/1929-6029.2015.04.03.7
Keywords:
Missing data, multiple imputation, fully conditional specification, complete case analysis, blood utilization
Abstract
Missing data commonly occur in large epidemiologic studies. Ignoring incompleteness or handling the data inappropriately may bias study results, reduce power and efficiency, and alter important risk/benefit relationships. Standard ways of dealing with missing values, such as complete case analysis (CCA), are generally inappropriate because they lose precision and risk bias. Multiple imputation by fully conditional specification (FCS MI) is a powerful and statistically valid method for creating imputations in large data sets that include both categorical and continuous variables. It specifies the multivariate imputation model on a variable-by-variable basis and offers a principled yet flexible method of addressing missing data, which is particularly useful for large data sets with complex data structures. However, FCS MI is still rarely used in epidemiology, and few practical resources exist to guide researchers in implementing it. We demonstrate the application of FCS MI in support of a large epidemiologic study evaluating national blood utilization patterns in a sub-Saharan African country, and we describe practical tips and guidelines for implementing FCS MI based on this experience.
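To make the variable-by-variable idea concrete, the following is a minimal Python sketch of the general FCS pattern, not the procedure used in the study. It assumes just two continuous variables, each imputed from the other by a simple linear regression plus random noise. The function names (`fit_line`, `fcs_impute`, `multiply_impute`) and the toy data are hypothetical. A full implementation would also draw the regression parameters from their posterior distribution and would use suitable models (for example, logistic regression) for categorical variables.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for y ~ x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return intercept, slope, sd


def fcs_impute(rows, n_iter=10, rng=None):
    """Return one completed copy of rows ([a, b] pairs, None = missing)."""
    rng = rng or random.Random()
    data = [list(r) for r in rows]
    missing = [[v is None for v in r] for r in rows]
    # Start by filling each missing cell with its column's observed mean.
    for j in (0, 1):
        fill = statistics.fmean(r[j] for r in rows if r[j] is not None)
        for r in data:
            if r[j] is None:
                r[j] = fill
    # Cycle through the variables, re-imputing each one from the other.
    for _ in range(n_iter):
        for j in (0, 1):
            k = 1 - j  # index of the predictor variable
            # Fit only on rows where the target variable was observed.
            train = [(r[k], r[j]) for r, m in zip(data, missing) if not m[j]]
            a, b, sd = fit_line([t[0] for t in train], [t[1] for t in train])
            for r, m in zip(data, missing):
                if m[j]:
                    # Simplification: noise is added, but a and b are not
                    # redrawn from their posterior as proper MI requires.
                    r[j] = a + b * r[k] + rng.gauss(0.0, sd)
    return data


def multiply_impute(rows, m=5, seed=0):
    """Create m completed data sets from one shared random stream."""
    rng = random.Random(seed)
    return [fcs_impute(rows, rng=rng) for _ in range(m)]


# Toy data (made up for illustration): b is roughly 2 * a.
rows = [
    [1.0, 2.1], [2.0, None], [3.0, 6.2], [None, 8.1],
    [5.0, 9.8], [6.0, None], [7.0, 14.1], [None, 16.0],
]
completed = multiply_impute(rows, m=5, seed=42)

# Pooled point estimate of the mean of b: average over the m data sets.
pooled_mean_b = statistics.fmean(
    statistics.fmean(r[1] for r in d) for d in completed
)
print(pooled_mean_b)
```

Each completed data set would then be analyzed separately and the results combined; the pooled mean above shows only the point-estimate part of that combination step, not the variance calculation.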
License
Copyright (c) 2015 Yang Liu, Anindya De
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.