dc.contributor.author | Nuwasiima, Afra | |
dc.date.accessioned | 2018-10-25T07:18:55Z | |
dc.date.available | 2018-10-25T07:18:55Z | |
dc.date.issued | 2018 | |
dc.identifier.citation | Master of Science in Biometry | en_US |
dc.identifier.uri | http://hdl.handle.net/11295/104395 | |
dc.description | Master of Science in Biometry | en_US |
dc.description.abstract | Background: Demographic and Health Surveys (DHS) provide data on a wide range
of risk factors for under-five child survival. Missing covariate data is inevitable in the
DHS under-five survival data, since data is collected retrospectively and on a large
number of covariates. We studied the missing data problem in the risk factors for
under-five child survival in DHS data sets.
Methods: A random survival forests model was first used to select a subset of
50 highly predictive risk factors from a pool of over 400 covariates.
Multiple imputation by chained equations (MICE) and
random forests were applied to handle missing covariate data. The imputed data was then
analyzed using random survival forests and Cox regression models.
Results: Missingness in covariates was more strongly related to the
time-to-event response variable (52%) than to the event status response variable (19%). The ranking
of under-five risk factors from imputed data sets was closely related to the ranking
from the observed values, although multiple imputation led to an increase in the variable
importance scores. The unadjusted estimates from the Cox regression model based
on imputed values were close to the estimates from the observed values.
However, minor discrepancies in estimates were observed in covariates with over
30% missing data. The random forests approach showed potential for producing estimates
much closer to the true estimates than MICE at high levels of missingness.
Conclusion: Multiple imputation showed potential to produce estimates close
to the true estimates even with high missingness. Random forests imputation showed
potential to perform better than MICE imputation strategies. The results of the current
study may need to be validated using a larger simulation study and other non-response models before decisive conclusions can be drawn. | en_US |
dc.language.iso | en | en_US |
dc.publisher | School of Mathematics, University of Nairobi | en_US |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
dc.subject | Multiple Imputation | en_US |
dc.subject | Random Survival | en_US |
dc.subject | Health Survey Child Survival | en_US |
dc.title | Multiple Imputation and Random Survival Forests: Application to the Demographic and Health Survey Child Survival Data | en_US |
dc.type | Thesis | en_US |