dc.contributor.author: Mburu, Rachael
dc.date.accessioned: 2021-01-25T13:01:22Z
dc.date.available: 2021-01-25T13:01:22Z
dc.date.issued: 2020
dc.identifier.uri: http://erepository.uonbi.ac.ke/handle/11295/154085
dc.description.abstract: Background: Children with a Height-for-Age Z-score (HAZ) below -2 Standard Deviations (SDs) from the World Health Organization (WHO) child growth standards median are said to be stunted. Most stunted children are too short for their age. Stunting prevalence is determined by dividing the number of under-five children whose z-score is below -2 SDs from the median HAZ of the WHO child growth standards by the total number of under-five children who are measured. According to the Kenya Demographic and Health Survey (KDHS, 2014), the national prevalence of stunting among under-five children was 26%, which was higher than the average prevalence in developing countries, which is 25%. Objective: This work compares Random Forest and Elastic Net in identifying determinants of under-five childhood stunting, with Variable Importance as the key outcome. Methods: The Kenya Demographic and Health Survey (KDHS) women and children data were used for analysis. The data were cleaned using STATA and analyzed with R software. Because the classes of the response variable were imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was employed to obtain class-balanced data. Missing observations were imputed using the rfImpute function from the randomForest library in R. Random Forest and Elastic Net algorithms were used to obtain determinants of stunting, while the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) Curve was used to compare the models. Results: The top 5 factors in terms of importance according to Random Forest are: underweight status, region, child’s age, ethnicity, and mother’s current age. According to the Elastic Net algorithm, the top 5 important coefficient variables are: underweight children, Nairobi region, a preceding birth interval of 60+ months, children aged 12-23 months, and children of Luhya ethnicity. In terms of the ROC values, Random Forest had an AUC of 0.92 while Elastic Net had an AUC of 0.86.
Conclusion: Based on our findings, most of the top-ranked important variables selected by Random Forest and Elastic Net are similar. Nevertheless, Random Forest performed better than the Elastic Net algorithm in determining the factors of under-five childhood stunting. Keywords: Stunting, Random Forest, Elastic Net, Variable Importance, Gini Index, Area Under the Curve (AUC), Receiver Operating Characteristic Curve (ROC), Missing values [en_US]
dc.language.iso: en [en_US]
dc.publisher: University of Nairobi [en_US]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/ [*]
dc.subject: Stunting, Random Forest, Elastic Net, Variable Importance, Gini Index, Area Under the Curve (AUC), Receiver Operating Characteristic Curve (ROC), Missing values [en_US]
dc.title: Comparison of elastic net and random forest in identifying risk factors of stunting in children under five years of age in Kenya [en_US]
dc.type: Thesis [en_US]


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States