The combined prevalence is calculated in accordance with the complex sample design properties inherent to surveys, including weighting of the survey data where applicable. Prevalence is not calculated when either the standard deviation of WFHZ or the age ratio test is rated as problematic.
Usage
compute_pps_based_combined_prevalence(
df,
.wt = NULL,
.edema = NULL,
.summary_by
)
compute_combined_prevalence(df, .wt = NULL, .edema = NULL, .summary_by = NULL)
Arguments
- df
An already wrangled dataset of class data.frame to use. Both wranglers (of WFHZ and of MUAC) need to be used sequentially, regardless of the order. Note that MUAC values should be converted to millimeters after using the MUAC wrangler.
- .wt
A vector of class double of the final survey weights. Default is NULL, assuming a self-weighted survey, as in the ENA for SMART software; otherwise a weighted analysis is computed.
- .edema
A vector of class character of edema. Code should be "y" for presence and "n" for absence of bilateral edema. Default is NULL.
- .summary_by
A vector of class character of the geographical areas where the data was collected and for which the analysis should be performed.
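The difference between a self-weighted analysis (.wt = NULL) and a weighted one can be sketched in base R. This is only an illustration of the point estimate, not the package's implementation: the vectors case and wt are hypothetical toy data, and the sketch does not reproduce the design-based confidence intervals or design effects that the function returns.

```r
# Hypothetical toy data: 1 = case of acute malnutrition, 0 = not a case
case <- c(1, 0, 0, 1, 0, 0)

# Hypothetical final survey weights, one per child (class double)
wt <- c(2.0, 1.0, 1.0, 0.5, 1.5, 1.0)

# Self-weighted survey: every child counts equally
p_unweighted <- mean(case)

# Weighted analysis: each child contributes in proportion to its weight
p_weighted <- sum(wt * case) / sum(wt)

# Base R offers the same point estimate through weighted.mean()
p_check <- weighted.mean(case, wt)
```

With these toy values the two estimates differ because the cases carry weights that are not proportional to their share of the sample.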
Details
A concept of "combined flags" is introduced in this function. It consists of defining as a flag any observation that is flagged in either the flag_wfhz or the flag_mfaz vector. A new column, cflags, for combined flags is created and added to df. This ensures that all flagged observations from both WFHZ and MFAZ data are excluded from the combined prevalence analysis.
The table below shows an overview of how cflags are defined:
flag_wfhz | flag_mfaz | cflags |
1 | 0 | 1 |
0 | 1 | 1 |
0 | 0 | 0 |
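The rule in the table can be sketched in base R as follows. This is a minimal illustration, not the function's internal code: the data frame is hypothetical toy data, and the fourth row (flagged in both vectors) is not listed in the table but follows from the "either" rule described above.

```r
# Hypothetical flag vectors: 1 = flagged, 0 = not flagged
df <- data.frame(
  flag_wfhz = c(1, 0, 0, 1),
  flag_mfaz = c(0, 1, 0, 1)
)

# An observation gets a combined flag when either flag is set
df$cflags <- ifelse(df$flag_wfhz == 1 | df$flag_mfaz == 1, 1, 0)

# Only observations without a combined flag would enter the analysis
clean <- df[df$cflags == 0, ]
```

Here only the third observation remains in clean, since it is the only one flagged in neither vector.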
Examples
## When .summary_by and .wt are set to NULL ----
p <- compute_combined_prevalence(
  df = anthro.02,
  .wt = NULL,
  .edema = edema,
  .summary_by = NULL
)
print(p)
#> # A tibble: 1 × 16
#> cgam_n cgam_p cgam_p_low cgam_p_upp cgam_p_deff csam_n csam_p csam_p_low
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 199 0.0685 0.0566 0.0804 Inf 68 0.0129 0.00770
#> # ℹ 8 more variables: csam_p_upp <dbl>, csam_p_deff <dbl>, cmam_n <dbl>,
#> # cmam_p <dbl>, cmam_p_low <dbl>, cmam_p_upp <dbl>, cmam_p_deff <dbl>,
#> # wt_pop <dbl>
## When .wt is not set to NULL ----
x <- compute_combined_prevalence(
  df = anthro.02,
  .wt = "wtfactor",
  .edema = edema,
  .summary_by = NULL
)
print(x)
#> # A tibble: 1 × 16
#> cgam_n cgam_p cgam_p_low cgam_p_upp cgam_p_deff csam_n csam_p csam_p_low
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 199 0.0708 0.0563 0.0853 1.72 68 0.0151 0.00750
#> # ℹ 8 more variables: csam_p_upp <dbl>, csam_p_deff <dbl>, cmam_n <dbl>,
#> # cmam_p <dbl>, cmam_p_low <dbl>, cmam_p_upp <dbl>, cmam_p_deff <dbl>,
#> # wt_pop <dbl>
## When working on data frame with multiple survey areas ----
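s <- anthro.03 |>
  ## Wrangle age ----
  mw_wrangle_age(
    dos = NULL,
    dob = NULL,
    age = age,
    .decimals = 2
  ) |>
  ## Wrangle MUAC (the wrangler works in centimeters) ----
  mw_wrangle_muac(
    sex = sex,
    muac = muac,
    age = "age",
    .recode_sex = TRUE,
    .recode_muac = TRUE,
    .to = "cm"
  ) |>
  ## Convert MUAC back to millimeters, as required by df ----
  dplyr::mutate(muac = recode_muac(muac, .to = "mm")) |>
  ## Wrangle WFHZ ----
  mw_wrangle_wfhz(
    sex = sex,
    weight = weight,
    height = height,
    .recode_sex = TRUE
  ) |>
  ## Compute the combined prevalence by district ----
  compute_combined_prevalence(
    .edema = edema,
    .summary_by = district
  )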
#> ================================================================================
#> ================================================================================
print(s)
#> # A tibble: 4 × 17
#> district cgam_n cgam_p cgam_p_low cgam_p_upp cgam_p_deff csam_n csam_p
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Metuge 15 0.0538 0.0119 0.0956 Inf 6 0.00538
#> 2 Cahora-Bassa 27 0.0836 0.0493 0.118 Inf 2 0
#> 3 Chiuta 11 0.0359 0.00701 0.0647 Inf 4 0.00448
#> 4 Maravia NA NA NA NA NA NA NA
#> # ℹ 9 more variables: csam_p_low <dbl>, csam_p_upp <dbl>, csam_p_deff <dbl>,
#> # cmam_n <dbl>, cmam_p <dbl>, cmam_p_low <dbl>, cmam_p_upp <dbl>,
#> # cmam_p_deff <dbl>, wt_pop <dbl>