Estimate the prevalence of combined wasting
Source: R/prev_wasting_combined.R
Estimate the prevalence of wasting based on the combined case-definition of
weight-for-height z-scores (WFHZ), mid-upper arm circumference (MUAC) and/or edema.
The estimates account for the complex sample design properties, including the
application of survey weights where applicable.
Before estimating, the function evaluates the quality of the data by calculating
and rating the standard deviation of WFHZ and of MUAC-for-age z-scores (MFAZ),
as well as the p-value of the age ratio test.
Prevalence is calculated only when all of these tests are concurrently rated as
not problematic. If any of them is rated as problematic, the analysis is
cancelled and NAs are returned.
Outliers detected in both the WFHZ and the MUAC data (the latter through z-scores), based on SMART flags, are excluded before the data is piped into the actual prevalence analysis workflow.
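The all-or-nothing quality gate described above can be sketched in base R. This is a minimal illustration, not the package's internal code: the helper `gate_prevalence()`, the rating labels, and the toy estimator are all hypothetical names and values chosen for the example.

```r
# Hypothetical helper: run the estimator only if no check is rated problematic.
# The rating label "Problematic" is an assumption made for this sketch.
gate_prevalence <- function(ratings, estimate) {
  if (any(ratings == "Problematic")) NA_real_ else estimate()
}

# Illustrative ratings for the three checks named in the description
ratings_ok <- c(wfhz_sd = "Good", mfaz_sd = "Excellent", age_ratio = "Acceptable")
ratings_bad <- c(wfhz_sd = "Good", mfaz_sd = "Problematic", age_ratio = "Acceptable")

# Placeholder estimator: proportion of cases in a toy 0/1 vector
toy_cases <- c(1, 0, 0, 0)
estimate_toy <- function() mean(toy_cases)

res_ok <- gate_prevalence(ratings_ok, estimate_toy)   # estimator runs
res_bad <- gate_prevalence(ratings_bad, estimate_toy) # analysis cancelled, NA
```

A single problematic rating is enough to return NA, which mirrors the "concurrently not problematic" requirement.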
Arguments
- df
A data set object of class data.frame to use. It must have been wrangled using this package's wrangling functions for both WFHZ and MUAC data sequentially; the order does not matter. Note that MUAC values should be converted to millimeters after using the MUAC wrangler. If this is not done, the function will stop execution and return an error message. Moreover, the function uses a variable called cluster, where the primary sampling unit IDs are stored. Make sure to rename your cluster ID variable to cluster, otherwise the function will error and terminate the execution.
- wt
A vector of class double of the final survey weights. Default is NULL, assuming a self-weighted survey, as in the ENA for SMART software; otherwise a weighted analysis is computed.
- edema
A vector of class character of edema. Code "y" for presence and "n" for absence of bilateral edema. Default is NULL.
- .by
A vector of class character or numeric of the geographical areas, or their respective IDs, where the data was collected and at which the analysis should be summarised.
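The two preparation steps required for df (a variable named cluster and MUAC in millimeters) might look like the following base R sketch. The data frame `survey`, the original column name `psu`, and the assumption that MUAC is held in centimeters after wrangling are all illustrative; adapt them to your own data.

```r
# Toy data: `psu` and the centimeter unit are assumptions for this example
survey <- data.frame(
  psu = c(1, 1, 2),
  muac = c(12.3, 14.1, 11.0)
)

# Rename the primary sampling unit ID variable to `cluster`
names(survey)[names(survey) == "psu"] <- "cluster"

# Convert MUAC from centimeters to millimeters
survey$muac <- survey$muac * 10
```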
Details
This function introduces the concept of "combined flags". An observation is
defined as flagged if it is flagged in either the flag_wfhz or the flag_mfaz
vector. A new column, cflags, is created for the combined flags and added to
df. This ensures that all flagged observations from both WFHZ and MFAZ data
are excluded from the prevalence analysis.
A glimpse of how cflags are defined:
flag_wfhz | flag_mfaz | cflags |
1 | 0 | 1 |
0 | 1 | 1 |
0 | 0 | 0 |
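The rule in the table amounts to a logical OR over the two flag vectors. A minimal base R sketch, using a toy data frame named `flags` (hypothetical) and not the package's internal code:

```r
# Toy flag vectors reproducing the three rows of the table above
flags <- data.frame(
  flag_wfhz = c(1, 0, 0),
  flag_mfaz = c(0, 1, 0)
)

# An observation is flagged if either flag is set
flags$cflags <- as.integer(flags$flag_wfhz == 1 | flags$flag_mfaz == 1)

# Only observations with cflags == 0 would be kept for the analysis
kept <- flags[flags$cflags == 0, ]
```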
Examples
## When .by and wt are set to NULL ----
mw_estimate_prevalence_combined(
df = anthro.02,
wt = NULL,
edema = edema,
.by = NULL
)
#> # A tibble: 1 × 16
#> cgam_n cgam_p cgam_p_low cgam_p_upp cgam_p_deff csam_n csam_p csam_p_low
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 199 0.0685 0.0566 0.0804 Inf 68 0.0129 0.00770
#> # ℹ 8 more variables: csam_p_upp <dbl>, csam_p_deff <dbl>, cmam_n <dbl>,
#> # cmam_p <dbl>, cmam_p_low <dbl>, cmam_p_upp <dbl>, cmam_p_deff <dbl>,
#> # wt_pop <dbl>
## When wt is not set to NULL ----
mw_estimate_prevalence_combined(
df = anthro.02,
wt = wtfactor,
edema = edema,
.by = NULL
)
#> # A tibble: 1 × 16
#> cgam_n cgam_p cgam_p_low cgam_p_upp cgam_p_deff csam_n csam_p csam_p_low
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 199 0.0708 0.0563 0.0853 1.72 68 0.0151 0.00750
#> # ℹ 8 more variables: csam_p_upp <dbl>, csam_p_deff <dbl>, cmam_n <dbl>,
#> # cmam_p <dbl>, cmam_p_low <dbl>, cmam_p_upp <dbl>, cmam_p_deff <dbl>,
#> # wt_pop <dbl>