Estimate the prevalence of wasting based on z-scores of weight-for-height (WFHZ)
Source: R/prev_wasting_wfhz.R
mw_estimate_prevalence_wfhz.Rd
Calculate prevalence estimates of wasting based on weight-for-height z-scores (WFHZ) and/or bilateral edema. The estimates account for the complex sample design, including the application of survey weights when these are supplied.
Before estimating, the function evaluates the quality of the data by calculating and rating the standard deviation of WFHZ. If the standard deviation is rated as problematic, the prevalence is estimated using the PROBIT method.
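The fallback described above can be illustrated with a minimal base R sketch. It assumes that the PROBIT approach derives prevalence from the normal distribution using the observed mean WFHZ and a standard deviation fixed at 1, and that a standard deviation above 1.2 is rated as problematic. Both the cut-off and the wfhz values below are illustrative assumptions, not taken from the package.

```r
# Hypothetical WFHZ values; real input would come from the wrangled data set.
wfhz <- c(-3.1, -2.4, -1.8, -1.2, -0.9, -0.5, -0.1, 0.3, 0.8, 1.6)

# Rate the spread of the z-scores (the 1.2 cut-off is an assumption).
sd_wfhz <- sd(wfhz)
is_problematic <- sd_wfhz > 1.2

# Direct approach: share of children with WFHZ below -2 (GAM by WFHZ only).
gam_direct <- mean(wfhz < -2)

# Probit-style approach: area of a normal curve below the cut-off,
# centred on the observed mean with the SD fixed at 1 (assumption).
gam_probit <- pnorm(-2, mean = mean(wfhz), sd = 1)
sam_probit <- pnorm(-3, mean = mean(wfhz), sd = 1)

# Choose the estimate according to the rating of the standard deviation.
gam_estimate <- if (is_problematic) gam_probit else gam_direct
```

In this sketch the probit-style estimate depends only on the mean, so an inflated standard deviation caused by measurement error does not widen the tails that are counted as wasted.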
Outliers are detected based on SMART flags and are excluded before the data enter the prevalence analysis workflow.
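As a rough illustration of the flagging step, the base R sketch below assumes the SMART rule of flagging any WFHZ value that lies more than 3 z-score units away from the observed sample mean. The wfhz values are hypothetical, and the package's own flagging code may differ in detail.

```r
# Hypothetical WFHZ values, including two implausible measurements.
wfhz <- c(-1.2, -0.4, 0.1, -0.9, 0.6, -1.5, 4.8, -0.7, 0.2, -5.9)

# SMART-style flag (assumption): more than 3 units from the observed mean.
centre <- mean(wfhz)
flagged <- abs(wfhz - centre) > 3

# Keep only the unflagged values for the prevalence analysis.
clean <- wfhz[!flagged]
```

Here the values 4.8 and -5.9 are flagged, and the remaining eight observations would go on to the prevalence calculation.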
Arguments
- df
A data set object of class data.frame to use. This must have been wrangled using this package's wrangling function for WFHZ data. The function expects a variable named cluster in which the primary sampling unit IDs are stored. Rename your cluster ID variable to cluster; otherwise the function will throw an error and stop.
- wt
A vector of class double of the final survey weights. Default is NULL, which assumes a self-weighted survey, as in the ENA for SMART software; when a vector of weights is supplied, a weighted analysis is done.
- edema
A vector of class character of edema. Code should be "y" for presence and "n" for absence of bilateral edema. Default is NULL.
- .by
A vector of class character or numeric of the geographical areas (or their IDs) where the data were collected and at which the analysis should be summarised.
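The requirement that the primary sampling unit IDs sit in a variable named cluster can be met with a one-line rename before calling the function. The sketch below uses base R on a hypothetical data frame whose cluster IDs are stored in a column called psu; the data frame and that column name are assumptions for illustration only.

```r
# Hypothetical survey data with cluster IDs stored under a different name.
survey <- data.frame(
  psu    = c(1, 1, 2, 2),
  sex    = c("m", "f", "f", "m"),
  weight = c(10.2, 9.8, 11.5, 12.1),
  height = c(78.0, 76.5, 82.3, 85.0)
)

# Rename the cluster ID column to the name the function expects.
names(survey)[names(survey) == "psu"] <- "cluster"

# Guard against a missing column before running the analysis.
stopifnot("cluster" %in% names(survey))
```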
Examples
## When .by = NULL ----
### Start off by wrangling the data ----
data <- mw_wrangle_wfhz(
  df = anthro.03,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE
)
#> ================================================================================
### Now run the prevalence function ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = NULL
)
#> # A tibble: 1 × 16
#> gam_n gam_p gam_p_low gam_p_upp gam_p_deff sam_n sam_p sam_p_low sam_p_upp
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 82 0.0768 0.0571 0.0964 Inf 20 0.00973 0.00351 0.0160
#> # ℹ 7 more variables: sam_p_deff <dbl>, mam_n <dbl>, mam_p <dbl>,
#> # mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>, wt_pop <dbl>
## Now when .by is not set to NULL ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = district
)
#> # A tibble: 4 × 17
#> district gam_n gam_p gam_p_low gam_p_upp gam_p_deff sam_n sam_p sam_p_low
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Metuge NA 0.0251 NA NA NA NA 0.00155 NA
#> 2 Cahora-Ba… 25 0.0738 0.0348 0.113 Inf 4 0.00336 -0.00348
#> 3 Chiuta 11 0.0444 0.0129 0.0759 Inf 2 0.00444 -0.00466
#> 4 Maravia NA 0.0450 NA NA NA NA 0.00351 NA
#> # ℹ 8 more variables: sam_p_upp <dbl>, sam_p_deff <dbl>, mam_n <dbl>,
#> # mam_p <dbl>, mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>,
#> # wt_pop <dbl>
## When a weighted analysis is needed ----
mw_estimate_prevalence_wfhz(
  df = anthro.02,
  wt = wtfactor,
  edema = edema,
  .by = province
)
#> # A tibble: 2 × 17
#> province gam_n gam_p gam_p_low gam_p_upp gam_p_deff sam_n sam_p sam_p_low
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Zambezia 41 0.0261 0.0161 0.0361 1.16 10 0.00236 -0.000255
#> 2 Nampula 80 0.0595 0.0410 0.0779 1.52 33 0.0129 0.00272
#> # ℹ 8 more variables: sam_p_upp <dbl>, sam_p_deff <dbl>, mam_n <dbl>,
#> # mam_p <dbl>, mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>,
#> # wt_pop <dbl>
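The estimates are returned as proportions, and lower confidence bounds can fall slightly below zero for rare outcomes, as in the sam_p_low column above. The base R sketch below shows one possible way to prepare such output for reporting. It rebuilds a few of the printed values by hand, and the truncation at zero and the label format are presentation choices assumed here, not something the function does.

```r
# A few columns copied by hand from the weighted example output above.
res <- data.frame(
  province  = c("Zambezia", "Nampula"),
  gam_p     = c(0.0261, 0.0595),
  gam_p_low = c(0.0161, 0.0410),
  gam_p_upp = c(0.0361, 0.0779),
  sam_p_low = c(-0.000255, 0.00272)
)

# Truncate negative lower bounds at zero (a reporting choice, an assumption).
res$sam_p_low <- pmax(res$sam_p_low, 0)

# Express GAM prevalence as a percentage with its confidence interval.
res$gam_label <- sprintf(
  "%.1f%% (%.1f-%.1f)",
  100 * res$gam_p, 100 * res$gam_p_low, 100 * res$gam_p_upp
)
```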