The male to female sex ratio test checks whether the ratio of the number of males to the number of females in a survey sample is similar to an expected ratio. The expected male to female sex ratio can be calculated from census or similar data. If there is no expected value then it is usually assumed that there should be equal numbers of males and females in the survey sample. This is usually true for children and young adults but may not be true for older adults.

We will retrieve a survey dataset:

```
svy <- read.table("dp.ex02.csv", header = TRUE, sep = ",")
head(svy)
```

```
#> psu age sex weight height muac oedema
#> 1 1 6 1 7.3 65.0 146 2
#> 2 1 42 2 12.5 89.5 156 2
#> 3 1 23 1 10.6 78.1 149 2
#> 4 1 18 1 12.8 81.5 160 2
#> 5 1 52 1 12.1 87.3 152 2
#> 6 1 36 2 16.9 93.0 190 2
```

The dataset `dp.ex02.csv` is a comma-separated-value (CSV) file containing anthropometric data from a SMART survey in Kabul, Afghanistan.

It is reported that there are about 2.658 million boys and 2.508 million girls aged between zero and four years in Afghanistan (2012 estimates).

The male to female sex ratio is:

`2.658 / 2.508`

which is:

`#> [1] 1.059809`

It is often easier to work with the proportion of the population that is male:

`2.658 / (2.658 + 2.508)`

which is:

`#> [1] 0.514518`
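The conversion between a sex ratio and a proportion male can be written as a pair of small helper functions. This is a minimal sketch in base R; the function names `ratio_to_prop()` and `prop_to_ratio()` are illustrative and are not part of the toolkit.

```r
# Convert a male to female sex ratio to the proportion male, and back.
# These helper names are hypothetical, introduced only for illustration.
ratio_to_prop <- function(ratio) ratio / (1 + ratio)
prop_to_ratio <- function(prop) prop / (1 - prop)

# Population estimates from the text (millions of boys and girls)
sex_ratio <- 2.658 / 2.508
prop_male <- ratio_to_prop(sex_ratio)

print(prop_male)                 # proportion of the population that is male
print(prop_to_ratio(prop_male))  # recovers the original sex ratio
```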

We compare this to the proportion of the sample that is male:

`table(svy$sex)`

this gives:

```
#>
#> 1 2
#> 438 435
```

This table is more useful when the cell counts are expressed as proportions:

`prop.table(table(svy$sex))`

this gives:

```
#>
#> 1 2
#> 0.5017182 0.4982818
```

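A formal test can be made using the `prop.test()` function, with the expected proportion male as the null value:

`prop.test(table(svy$sex), p = 0.514518)`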

This returns:

```
#>
#> 1-sample proportions test with continuity correction
#>
#> data: table(svy$sex), null probability 0.514518
#> X-squared = 0.5225, df = 1, p-value = 0.4698
#> alternative hypothesis: true p is not equal to 0.514518
#> 95 percent confidence interval:
#> 0.4680459 0.5353752
#> sample estimates:
#> p
#> 0.5017182
```

The male to female sex ratio (expressed as the proportion male) in the example data is not significantly different from the expected male to female sex ratio (expressed as the proportion male).
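The same test can be reproduced from the counts alone, which may help readers who do not have the dataset to hand. This is a minimal sketch using base R's `prop.test()` and the counts shown in the table of `svy$sex` above (438 males and 435 females); the variable names are illustrative.

```r
# Counts taken from the table of svy$sex shown earlier
males <- 438
females <- 435

# Expected proportion male from the population estimates (millions)
expected_prop_male <- 2.658 / (2.658 + 2.508)

# One-sample test of a proportion (continuity correction is the default)
result <- prop.test(x = males, n = males + females, p = expected_prop_male)

print(result$statistic)  # chi-squared statistic
print(result$p.value)    # p-value for the two-sided test
print(result$conf.int)   # 95% confidence interval for the proportion male
```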

The NiPN data quality toolkit provides an R language function called `sexRatioTest()` that performs a sex ratio test:

`sexRatioTest(svy$sex, codes = c(1, 2), pop = c(2.658, 2.508))`

which returns:

```
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.5017
#> X-squared = 0.5225, p = 0.4698
```

The codes used in the sex variable for male and female are specified using the `codes` parameter. If (e.g.) `sex` were coded using **M** and **F** then you would specify `codes = c("M", "F")`.
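The role of the codes can be illustrated by counting males and females by hand. This is a sketch in base R using a small made-up vector, not the toolkit's implementation; in particular, dropping missing values here is an assumption made for the illustration.

```r
# Hypothetical sex variable coded with letters rather than 1 and 2
sex <- c("M", "F", "F", "M", "M", NA, "F", "M")

# Male code first, then female code (the same order as the codes parameter)
codes <- c("M", "F")

# Count each sex, ignoring missing values (an assumption for this sketch)
n_male <- sum(sex == codes[1], na.rm = TRUE)
n_female <- sum(sex == codes[2], na.rm = TRUE)

print(c(male = n_male, female = n_female))
print(n_male / (n_male + n_female))  # observed proportion male
```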

Population data are specified using the `pop` parameter (males then females). This can be specified as numbers or as a ratio. The test above could have been specified as:

```
sexRatioTest(svy$sex, codes = c(1, 2), pop = c(1.059809, 1))
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.5017
#> X-squared = 0.5225, p = 0.4698
```

If (e.g.) you want to specify a one to one sex ratio then you would use `pop = c(1, 1)`.
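The different ways of specifying the population all lead to the same expected proportion male, because only the relative sizes of the two numbers matter. This is a minimal sketch; `expected_prop_male()` is a hypothetical helper written for this illustration and is not a toolkit function.

```r
# Hypothetical helper: expected proportion male from a (males, females) pair
expected_prop_male <- function(pop) pop[1] / sum(pop)

p_counts <- expected_prop_male(c(2.658, 2.508))   # population in millions
p_ratio  <- expected_prop_male(c(1.059809, 1))    # male to female ratio
p_equal  <- expected_prop_male(c(1, 1))           # one to one sex ratio

print(c(counts = p_counts, ratio = p_ratio, equal = p_equal))
```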

The observed sex ratio at birth is 1.06:1.00 (males to females). This could be used to assess if selective abortion or female infanticide is taking place although a large sample size (i.e. about n = 6200) is required for such a test to have sufficient power.

The sex ratio test may be performed on each age group separately using the `by()` function:

```
svy$ycag <- recode(svy$age, "6:17=1; 18:29=2; 30:41=3; 42:53=4; 54:59=5")
by(svy$sex, svy$ycag, sexRatioTest, codes = c(1, 2), pop = c(2.658, 2.508))
#> svy$ycag: 1
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.4879
#> X-squared = 0.4845, p = 0.4864
#>
#> ------------------------------------------------------------
#> svy$ycag: 2
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.5152
#> X-squared = 0.0000, p = 1.0000
#>
#> ------------------------------------------------------------
#> svy$ycag: 3
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.5228
#> X-squared = 0.0374, p = 0.8466
#>
#> ------------------------------------------------------------
#> svy$ycag: 4
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.4875
#> X-squared = 0.3657, p = 0.5454
#>
#> ------------------------------------------------------------
#> svy$ycag: 5
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5145
#> Observed proportion male = 0.4627
#> X-squared = 0.5280, p = 0.4674
```

Note that the variable `ycag` created above holds the year-centred age group.
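The year-centred age groups are built around whole years of age in months (12, 24, 36 and 48 months), with a shorter final group for 54 to 59 months. If a `recode()` function is not available, the same grouping can be sketched with base R's `cut()`; the example ages below are made up to show the group boundaries.

```r
# Example ages in months, chosen to sit at the edges of each group
age <- c(6, 17, 18, 29, 30, 41, 42, 53, 54, 59)

# Left-closed intervals: [6,18), [18,30), [30,42), [42,54), [54,60)
ycag <- cut(age,
            breaks = c(6, 18, 30, 42, 54, 60),
            labels = FALSE,   # return integer group codes 1 to 5
            right = FALSE)

print(data.frame(age, ycag))
```

Ages outside the range 6 to 59 months would be returned as `NA` by this sketch.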

This approach assumes that the sex ratio is independent of age.

An approach that does not make this assumption is to use the numbers of male and female children in the same age-ranges in the population taken from census data.

A useful source of census data is the United States Census Bureau’s International Data Base:

https://www.census.gov/data-tools/demo/idb/informationGateway.php

This source gives the following estimates for Afghanistan in 2016:

Age | Number Males | Number Females | Proportion Male | Proportion Female | Male-to-Female Sex Ratio |
---|---|---|---|---|---|
0 | 594602 | 573956 | 0.5088 | 0.4912 | 1.04:1.00 |
1 | 550593 | 533579 | 0.5078 | 0.4922 | 1.03:1.00 |
2 | 526827 | 510479 | 0.5079 | 0.4921 | 1.03:1.00 |
3 | 509048 | 493185 | 0.5079 | 0.4921 | 1.03:1.00 |
4 | 493521 | 478137 | 0.5079 | 0.4921 | 1.03:1.00 |
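The proportions and sex ratios in this table follow directly from the numbers of males and females. This is a short sketch of that arithmetic in base R; the data frame and column names are illustrative.

```r
# Census estimates for Afghanistan (2016), by age in years
census <- data.frame(
  age     = 0:4,
  males   = c(594602, 550593, 526827, 509048, 493521),
  females = c(573956, 533579, 510479, 493185, 478137)
)

# Proportion male and male to female sex ratio for each year of age
census$prop_male <- census$males / (census$males + census$females)
census$sex_ratio <- census$males / census$females

print(round(census$prop_male, 4))
print(round(census$sex_ratio, 2))
```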

We need to ensure we use the same age-ranges as the census:

`svy$ageGroup <- recode(svy$age, "0:11=0; 12:23=1; 24:35=2; 36:47=3; 48:59=4")`

and then test the sex ratio in each age group separately:

```
sexRatioTest(svy$sex[svy$ageGroup == 0], pop = c(594602, 573956))
sexRatioTest(svy$sex[svy$ageGroup == 1], pop = c(550593, 533579))
sexRatioTest(svy$sex[svy$ageGroup == 2], pop = c(526827, 510479))
sexRatioTest(svy$sex[svy$ageGroup == 3], pop = c(509048, 493185))
sexRatioTest(svy$sex[svy$ageGroup == 4], pop = c(493521, 478137))
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5088
#> Observed proportion male = 0.5047
#> X-squared = 0.0000, p = 1.0000
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5078
#> Observed proportion male = 0.4901
#> X-squared = 0.1885, p = 0.6642
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5079
#> Observed proportion male = 0.5374
#> X-squared = 0.6800, p = 0.4096
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5079
#> Observed proportion male = 0.5052
#> X-squared = 0.0000, p = 0.9978
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.5079
#> Observed proportion male = 0.4552
#> X-squared = 1.4098, p = 0.2351
```

All of these tests find no significant differences between the observed and expected sex ratios. It should be noted that some (or all) of the tests might be based on small sample sizes:

```
table(svy$ageGroup)
#>
#> 0 1 2 3 4
#> 107 202 227 192 145
```

and may, therefore, be able to detect only large differences.
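A rough sense of how large a difference each group could detect comes from the approximate 95% margin of error for a proportion at each sample size. This sketch uses the normal approximation and assumes a simple random sample with a proportion male near 0.5; it ignores any design effect from cluster sampling, so the true margins are likely to be somewhat wider.

```r
# Sample sizes in each census age group, from the table above
n <- c(107, 202, 227, 192, 145)

# Approximate 95% margin of error for a proportion close to 0.5
p0 <- 0.5
half_width <- qnorm(0.975) * sqrt(p0 * (1 - p0) / n)

print(data.frame(ageGroup = 0:4, n = n, margin = round(half_width, 3)))
```

Under these assumptions every group has a margin of more than six percentage points, which is much larger than the gap between a one to one sex ratio and the census proportions of about 0.508.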

With data from children we usually expect something like a one to one male to female sex ratio. This will not usually be the case with adults, especially older adults.

We will retrieve a survey dataset:

```
svy <- read.table("ah.ex01.csv", header = TRUE, sep = ",")
head(svy)
```

```
#> psu camp block age sex weight height demispan muac oedema
#> 1 1 IFO A01 90 1 40.8 159.3 77.2 20.0 2
#> 2 1 IFO A01 60 2 69.8 155.3 78.3 35.3 2
#> 3 1 IFO A01 63 2 51.7 156.8 80.5 25.5 2
#> 4 1 IFO A01 74 2 61.1 158.9 83.5 27.0 2
#> 5 1 IFO A01 65 2 55.1 156.9 85.5 24.5 2
#> 6 1 IFO A01 62 2 56.7 158.1 86.3 26.1 2
```

The dataset `ah.ex01.csv` is a comma-separated-value (CSV) file containing anthropometry data from a Rapid Assessment Method for Older People (RAM-OP) survey in the Dadaab refugee camps in Garissa, Kenya. This is a survey of older people, defined as people aged sixty years and older.

With this type of survey it is usually possible to use camp administration data to find the expected male to female sex ratio. This information was not given in the RAM-OP survey report.

The camp population is predominantly Somali. It is reported that there are 188 thousand men and 220 thousand women aged sixty years and older in Somalia (2010 estimates). The sex ratio is:

`188 / 220`

which is:

`#> [1] 0.8545455`

The expected proportion of the population that is male is:

`188 / (188 + 220)`

which is:

`#> [1] 0.4607843`

The proportion of the sample that is male:

`prop.table(table(svy$sex))`

is:

```
#>
#> 1 2
#> 0.381113 0.618887
```

This looks to be much smaller than the expected proportion. The sex ratio test:

`sexRatioTest(svy$sex, codes = c(1, 2), pop = c(188, 220))`

reports:

```
#>
#> Sex Ratio Test
#>
#> Expected proportion male = 0.4608
#> Observed proportion male = 0.3811
#> X-squared = 14.8305, p = 0.0001
```

The proportion of males in the sample is significantly smaller than we expected.

This result could be due to the extraordinary nature of the population (e.g. the camp population could really have very many more older women than older men). It could also be due to a selection bias in the survey. In this example, men were more likely than women to be away from home during the day, so a household sample taken during the day would have systematically excluded the more active members of the male population.

Note that the sex ratio test only applies to population surveys. If surveys focus on (e.g.) carers of small children then the observed male to female sex ratio is likely to be strongly biased towards women. In such cases it is not sensible to apply a sex ratio test.