
Calculate prevalence estimates of wasting based on z-scores of weight-for-height (WFHZ) and/or nutritional edema. The function can account for complex sample design properties, such as survey sampling weights, when needed or applicable. The quality of the data is first evaluated by calculating and rating the standard deviation of WFHZ. The standard approach to prevalence estimation is applied only when the standard deviation of WFHZ is rated as not problematic. If the standard deviation is rated as problematic, prevalence is estimated using the PROBIT estimator. Outliers are detected using the SMART flagging criteria and are excluded before prevalence is estimated.
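The decision flow described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name `sketch_wfhz_prevalence()` is hypothetical, and the flagging rule (z-scores more than 3 units from the observed mean), the cut-off of 1.2 for a problematic standard deviation, and the PROBIT form (normal distribution with the observed mean and a standard deviation of 1) are assumptions taken from common SMART practice.

```r
# Hypothetical helper (not part of the package): shows the decision flow only.
sketch_wfhz_prevalence <- function(wfhz) {
  wfhz <- wfhz[!is.na(wfhz)]

  # Assumed SMART flagging rule: drop z-scores more than 3 units away
  # from the observed sample mean.
  kept <- wfhz[abs(wfhz - mean(wfhz)) <= 3]

  # Assumed rating: a standard deviation of 1.2 or more is "problematic".
  s <- sd(kept)

  if (s < 1.2) {
    # Standard approach: observed proportion of children with WFHZ < -2.
    method <- "standard"
    gam <- mean(kept < -2)
  } else {
    # Assumed PROBIT form: area below -2 under a normal curve with the
    # observed mean and a standard deviation fixed at 1.
    method <- "probit"
    gam <- pnorm(-2, mean = mean(kept), sd = 1)
  }

  list(sd = s, method = method, gam = gam)
}

# Simulated z-scores, for illustration only
set.seed(1)
good <- sketch_wfhz_prevalence(rnorm(500, mean = -0.8, sd = 1))
poor <- sketch_wfhz_prevalence(rnorm(500, mean = -0.8, sd = 2))
good$method
poor$method
```

The sketch ignores edema, survey weights, and the cluster design, all of which the function itself handles.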

Usage

mw_estimate_prevalence_wfhz(df, wt = NULL, edema = NULL, .by = NULL)

Arguments

df

A tibble object produced by the mw_wrangle_wfhz() function. The df should have a variable named cluster that holds the primary sampling unit identifiers.

wt

A vector of class double of the survey sampling weights. Default is NULL, which assumes a self-weighted survey, as is the case for a survey sample selected proportional to population size (i.e., a SMART survey sample). Otherwise, a weighted analysis is implemented.

edema

A character vector for presence of nutritional edema coded as "y" for presence of nutritional edema and "n" for absence of nutritional edema. Default is NULL.

.by

A character or numeric vector of the geographical areas or identifiers where the data was collected and by which the analysis should be summarised.
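To illustrate what the wt argument changes, the sketch below compares an unweighted and a weighted proportion on made-up values. The vectors `gam` and `wt` are hypothetical and are not taken from the package datasets. The function itself also accounts for the cluster design when computing confidence intervals, which this sketch does not attempt.

```r
# Made-up values for illustration only
gam <- c(1, 0, 0, 1, 0, 0)      # 1 = child classified as wasted
wt  <- c(2, 1, 1, 0.5, 1, 0.5)  # hypothetical survey sampling weights

# Self-weighted case (wt = NULL): every child counts equally
p_unweighted <- mean(gam)

# Weighted case: each child contributes in proportion to its weight
p_weighted <- weighted.mean(gam, wt)

c(unweighted = p_unweighted, weighted = p_weighted)
```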

Value

A summary tibble for the descriptive statistics about wasting.

Examples

## When .by = NULL ----
### Start off by wrangling the data ----
data <- mw_wrangle_wfhz(
  df = anthro.03,
  sex = sex,
  weight = weight,
  height = height,
  .recode_sex = TRUE
)
#> ================================================================================

### Now run the prevalence function ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = NULL
)
#> # A tibble: 1 × 16
#>   gam_n  gam_p gam_p_low gam_p_upp gam_p_deff sam_n   sam_p sam_p_low sam_p_upp
#>   <dbl>  <dbl>     <dbl>     <dbl>      <dbl> <dbl>   <dbl>     <dbl>     <dbl>
#> 1    82 0.0768    0.0571    0.0964        Inf    20 0.00973   0.00351    0.0160
#> # ℹ 7 more variables: sam_p_deff <dbl>, mam_n <dbl>, mam_p <dbl>,
#> #   mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>, wt_pop <dbl>

## Now when .by is not set to NULL ----
mw_estimate_prevalence_wfhz(
  df = data,
  wt = NULL,
  edema = edema,
  .by = district
)
#> # A tibble: 4 × 17
#>   district   gam_n  gam_p gam_p_low gam_p_upp gam_p_deff sam_n   sam_p sam_p_low
#>   <chr>      <dbl>  <dbl>     <dbl>     <dbl>      <dbl> <dbl>   <dbl>     <dbl>
#> 1 Metuge        NA 0.0251   NA        NA              NA    NA 0.00155  NA      
#> 2 Cahora-Ba…    25 0.0738    0.0348    0.113         Inf     4 0.00336  -0.00348
#> 3 Chiuta        11 0.0444    0.0129    0.0759        Inf     2 0.00444  -0.00466
#> 4 Maravia       NA 0.0450   NA        NA              NA    NA 0.00351  NA      
#> # ℹ 8 more variables: sam_p_upp <dbl>, sam_p_deff <dbl>, mam_n <dbl>,
#> #   mam_p <dbl>, mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>,
#> #   wt_pop <dbl>

## When a weighted analysis is needed ----
mw_estimate_prevalence_wfhz(
  df = anthro.02,
  wt = wtfactor,
  edema = edema,
  .by = province
)
#> # A tibble: 2 × 17
#>   province gam_n  gam_p gam_p_low gam_p_upp gam_p_deff sam_n   sam_p sam_p_low
#>   <chr>    <dbl>  <dbl>     <dbl>     <dbl>      <dbl> <dbl>   <dbl>     <dbl>
#> 1 Zambezia    41 0.0261    0.0161    0.0361       1.16    10 0.00236 -0.000255
#> 2 Nampula     80 0.0595    0.0410    0.0779       1.52    33 0.0129   0.00272 
#> # ℹ 8 more variables: sam_p_upp <dbl>, sam_p_deff <dbl>, mam_n <dbl>,
#> #   mam_p <dbl>, mam_p_low <dbl>, mam_p_upp <dbl>, mam_p_deff <dbl>,
#> #   wt_pop <dbl>