
Calculate the prevalence estimates of wasting based on z-scores of weight-for-height (WFHZ) and/or bilateral edema. The estimates account for the complex sample design, including the application of survey weights where applicable.

Before estimating, the function evaluates the quality of the data by calculating and rating the standard deviation of WFHZ. If the standard deviation is rated as problematic, the prevalence is estimated using the PROBIT method.
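The quality check and fallback described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper name `sketch_probit_gam`, the 1.2 cut-off for a "problematic" standard deviation, and the PROBIT step (share of a normal distribution with the observed mean and a standard deviation fixed at 1 that falls below -2) are assumptions based on common SMART conventions. The sketch ignores edema, survey weights and the cluster design.

```r
# Hypothetical helper, not part of the package.
# Assumptions: SD > 1.2 is treated as "problematic"; PROBIT is taken as the
# area below -2 of a normal distribution with the observed mean and SD = 1.
sketch_probit_gam <- function(wfhz, sd_cutoff = 1.2) {
  wfhz <- wfhz[!is.na(wfhz)]
  problematic <- stats::sd(wfhz) > sd_cutoff
  estimate <- if (problematic) {
    # PROBIT-style estimate: depends only on the observed mean
    stats::pnorm(q = -2, mean = mean(wfhz), sd = 1)
  } else {
    # Plain unweighted proportion of children with WFHZ below -2
    mean(wfhz < -2)
  }
  list(problematic = problematic, gam = estimate)
}

# Widely spread z-scores trigger the PROBIT branch
wide <- sketch_probit_gam(seq(-4, 2, by = 0.5))

# Tightly spread z-scores use the observed proportion
narrow <- sketch_probit_gam(c(-2.5, -1, -0.5, 0, 0.5))
```

In the real function the non-PROBIT branch is a design-based estimate rather than the simple proportion used here.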

Outliers are detected based on SMART flags and are excluded before the data are passed into the prevalence analysis workflow.
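As a rough sketch of what SMART flagging amounts to, the snippet below marks WFHZ values that lie more than 3 z-score units from the observed sample mean. The helper name `sketch_smart_flags` is hypothetical, and the ±3 rule is stated here as an assumption about the SMART convention rather than taken from this page.

```r
# Hypothetical helper, not part of the package.
# Assumption: a value is flagged when it is more than 3 units away from the
# observed sample mean of WFHZ.
sketch_smart_flags <- function(wfhz) {
  centre <- mean(wfhz, na.rm = TRUE)
  as.integer(!is.na(wfhz) & abs(wfhz - centre) > 3)
}

wfhz <- c(-1.2, -0.4, 0.3, -0.9, 0.1, -0.6, 0.4, -0.2, -1.0, 0.0, 6.5)
flags <- sketch_smart_flags(wfhz)

# Values that would go on to the prevalence analysis
kept <- wfhz[flags == 0]
```

Only the last value (6.5) is flagged in this toy vector, so `kept` holds the first ten observations.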

Usage

mw_estimate_prevalence_wfhz(df, wt = NULL, edema = NULL, .by = NULL)

Arguments

df

A data set object of class data.frame to use. This must have been wrangled using this package's wrangling function for WFHZ data. The function expects a variable named cluster that stores the primary sampling unit IDs. Rename your cluster ID variable to cluster beforehand; otherwise the function will throw an error and stop.

wt

A vector of class double of the final survey weights. Default is NULL, assuming a self-weighted survey, as in the ENA for SMART software; when a vector of weights is supplied, a weighted analysis is done.

edema

A vector of class character indicating bilateral edema, coded "y" for presence and "n" for absence. Default is NULL.

.by

A vector of class character or numeric of the geographical areas, or their respective IDs, where the data were collected and by which the analysis should be summarised. Default is NULL.
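The input requirements above (a variable named cluster, and edema coded "y"/"n") could be met with a little base R before wrangling. The data set, the original column name `psu_id` and the 1/0 edema coding below are made up for illustration.

```r
# Made-up data: `psu_id` and the 1/0 edema coding are illustrative assumptions.
survey_data <- data.frame(
  psu_id = c(1, 1, 2, 2),
  edema  = c(0, 1, 0, NA)
)

# Store the primary sampling unit IDs under the expected name `cluster`
names(survey_data)[names(survey_data) == "psu_id"] <- "cluster"

# Recode edema to "y" (present) / "n" (absent); missing values stay NA
survey_data$edema <- ifelse(survey_data$edema == 1, "y", "n")

survey_data
```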

Value

A summarised table of class data.frame of the descriptive statistics about wasting.

Examples

## When .by = NULL ----
### Start off by wrangling the data ----
data <- mw_wrangle_wfhz(
  df = anthro.03,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE
)
#> ================================================================================

### Now run the prevalence function ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = NULL
)
#> # A tibble: 1 × 16
#>   gam_n  gam_p gam_p_low gam_p_upp gam_p_deff sam_n   sam_p sam_p_low sam_p_upp
#>   <dbl>  <dbl>     <dbl>     <dbl>      <dbl> <dbl>   <dbl>     <dbl>     <dbl>
#> 1    82 0.0768    0.0571    0.0964        Inf    20 0.00973   0.00351    0.0160
#> # ℹ 7 more variables: sam_p_deff <dbl>, mam_n <dbl>, mam_p <dbl>,
#> #   mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>, wt_pop <dbl>

## Now when .by is not set to NULL ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = district
)
#> # A tibble: 4 × 17
#>   district   gam_n  gam_p gam_p_low gam_p_upp gam_p_deff sam_n   sam_p sam_p_low
#>   <chr>      <dbl>  <dbl>     <dbl>     <dbl>      <dbl> <dbl>   <dbl>     <dbl>
#> 1 Metuge        NA 0.0251   NA        NA              NA    NA 0.00155  NA      
#> 2 Cahora-Ba…    25 0.0738    0.0348    0.113         Inf     4 0.00336  -0.00348
#> 3 Chiuta        11 0.0444    0.0129    0.0759        Inf     2 0.00444  -0.00466
#> 4 Maravia       NA 0.0450   NA        NA              NA    NA 0.00351  NA      
#> # ℹ 8 more variables: sam_p_upp <dbl>, sam_p_deff <dbl>, mam_n <dbl>,
#> #   mam_p <dbl>, mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>,
#> #   wt_pop <dbl>

## When a weighted analysis is needed ----
mw_estimate_prevalence_wfhz(
  df = anthro.02,
  wt = wtfactor,
  edema = edema,
  .by = province
)
#> # A tibble: 2 × 17
#>   province gam_n  gam_p gam_p_low gam_p_upp gam_p_deff sam_n   sam_p sam_p_low
#>   <chr>    <dbl>  <dbl>     <dbl>     <dbl>      <dbl> <dbl>   <dbl>     <dbl>
#> 1 Zambezia    41 0.0261    0.0161    0.0361       1.16    10 0.00236 -0.000255
#> 2 Nampula     80 0.0595    0.0410    0.0779       1.52    33 0.0129   0.00272 
#> # ℹ 8 more variables: sam_p_upp <dbl>, sam_p_deff <dbl>, mam_n <dbl>,
#> #   mam_p <dbl>, mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>,
#> #   wt_pop <dbl>