# How to do sampling


If we are going to do a survey, either quantitative or qualitative, it is necessary to take a sample of the total population we want to assess, unless that population is very small and everyone can be included.

In taking a sample there are a few considerations:

- Define who to include in your survey: this is your survey population. It needs to be defined geographically, such as the area covered by your project; demographically, such as children aged 6 to 24 months; or according to other characteristics, such as ‘people who attended specific training sessions’.
- Decide what you are going to assess, such as dietary diversity or a Revaluation 6 box exercise.
- Decide the timing, for example taking into account seasons, time of day, and when people will be available.
- Decide whether you want to know the difference between groups. This gives you the comparison groups that you want to analyse separately. These could be, for example, people within the project and people outside the project, or people with a garden and people without. Your subgroups will depend on what you are trying to assess.
- To be representative of the population, everyone needs an equal chance of being included, to avoid bias in who is selected. A random sample is the easiest way to achieve this. Simple and stratified sampling are two examples of random sampling:

  **Simple random sampling**. In this type of sampling everyone has an equal chance of being included, and the process is like drawing numbers out of a hat. The disadvantage is that you may have to travel over a wide distance to reach scattered individuals in your sample, for example travelling to many schools for just a few sampled children.

  **Stratified sampling**. In this type of sampling the population is divided into groups, called strata, based on a common characteristic such as attending a particular school, and people are then sampled at random within the groups. You can do two-stage sampling, where, for example, you randomly select a certain number of schools and then randomly select a number of pupils in each selected school. For example, if you want to sample 100 children, you first choose 10 schools and then 10 children from each school (a two-stage sample), rather than having 2 or 3 children in each of 40 schools (as a simple random sample might give).

- A convenience sample is where the easiest-to-reach people are sampled. A voluntary sample is where people volunteer in response to a general invitation. Neither method is representative, and both should be avoided if possible.
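To make the difference between the two random approaches concrete, here is a minimal Python sketch of the schools example described above. The sampling frame (40 schools with 30 pupils each), the function names and the seed are all hypothetical and chosen only for illustration; a real survey would start from an actual list of schools and pupils.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units so that every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def two_stage_sample(groups, n_groups, n_per_group, seed=None):
    """Randomly pick n_groups groups, then n_per_group members within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(groups), n_groups)
    return {name: rng.sample(groups[name], n_per_group) for name in chosen}


# Hypothetical sampling frame: 40 schools with 30 pupils each.
schools = {
    f"school_{s:02d}": [f"school_{s:02d}/pupil_{p:02d}" for p in range(1, 31)]
    for s in range(1, 41)
}
all_pupils = [pupil for pupils in schools.values() for pupil in pupils]

# 100 pupils drawn from the full list versus 10 schools x 10 pupils.
srs = simple_random_sample(all_pupils, 100, seed=1)
two_stage = two_stage_sample(schools, n_groups=10, n_per_group=10, seed=1)

schools_visited_srs = {pupil.split("/")[0] for pupil in srs}
print(len(srs), "pupils from", len(schools_visited_srs), "schools (simple random)")
print(sum(len(v) for v in two_stage.values()), "pupils from", len(two_stage), "schools (two-stage)")
```

Both approaches give 100 pupils, but the simple random sample will usually be spread over many more schools than the 10 visited in the two-stage design. The sketch assumes schools of equal size; with unequal schools, picking the same number of pupils per school no longer gives every pupil the same chance of selection.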

**Quantitative and Qualitative sampling**

The idea of being representative applies both to quantitative and qualitative surveys but the sample size calculations are different.

Sample size calculation for a **quantitative survey** depends on the type of statistical test you hope to perform. For example, do you want to see whether there is a significant change between baseline and a later date? Perhaps you want to know the difference in dietary diversity score between groups, or before and after your project. To do this accurately you need to know:

- What is the variability of the parameter (e.g., dietary diversity) in your population?
- How much change in this would be meaningful as a result of the project (e.g., the project improves dietary diversity by 2 food groups)?

Often you don’t know this information, and you also plan to collect other data whose variability you don’t know either. What do you do? One option is to conduct the survey with the sample size that is feasible; after doing this you will know what the variability is. As a rule of thumb, a survey of nutritional status requires at least 500 children and is probably not feasible for small studies. Dietary diversity can probably be assessed with 50 to 100 individuals per subgroup. In the example calculation below, n1 and n2 are both 72.
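If the sample size is fixed by what is feasible, it can help to ask the question the other way round: how large a difference could a sample of that size detect? The sketch below is one way to estimate this under a normal approximation for two equal-sized groups. The function name is hypothetical, and the standard deviation of 1.2, the one-sided 0.05 significance level and the power of 0.8 are assumptions borrowed from the Stata example in this guide, not measured values.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_difference(n_per_group, sd, alpha=0.05, power=0.8):
    """Smallest difference in means detectable with n_per_group people per group.

    Normal approximation, one-sided test, equal group sizes and equal
    standard deviations in both groups (all assumptions of this sketch).
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    return z_sum * sqrt(2 * sd ** 2 / n_per_group)


# Rule-of-thumb range for dietary diversity: 50 to 100 per subgroup.
for n in (50, 100):
    print(n, round(minimum_detectable_difference(n, sd=1.2), 2))
```

Under these assumptions, 50 people per subgroup could detect a difference of roughly 0.6 food groups and 100 people a difference of roughly 0.4. Once the survey has been done, the assumed standard deviation can be replaced with the observed one.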

**If you want to calculate the sample size more accurately, I would advise consulting a statistician!**

**Formulae for the calculation of sample size**

The sample size calculations are based on a normal distribution, and the convention is to look for significance at the 0.05 level. This means accepting a 5% chance of concluding that there is a difference between the two estimates of your parameter when no true difference exists.
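One common form of the formula for comparing two means is sketched below. It assumes a normal approximation, a one-sided test and two independent groups of equal size; the symbols are introduced here for illustration and do not come from the original text.

$$
n = \frac{\left(z_{1-\alpha} + z_{1-\beta}\right)^{2}\left(\sigma_1^{2} + \sigma_2^{2}\right)}{\left(m_1 - m_2\right)^{2}}
$$

Here $n$ is the number of people needed in each group, $m_1$ and $m_2$ are the expected means, $\sigma_1$ and $\sigma_2$ are the standard deviations, $\alpha$ is the significance level, $1-\beta$ is the power, and $z_p$ is the $p$-th quantile of the standard normal distribution. For a two-sided test, $z_{1-\alpha}$ is replaced by $z_{1-\alpha/2}$. The result is rounded up to the next whole number.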

Example of sample size calculation for the dietary diversity sample:

Sample size calculation from Stata:

    sampsi 2.8 3.3, onesided power(0.8) alpha(0.05) sd1(1.2) sd2(1.2)

    Estimated sample size for two-sample comparison of means

    Test Ho: m1 = m2, where m1 is the mean in population 1
                        and m2 is the mean in population 2

    Assumptions:

             alpha =   0.0500  (one-sided)
             power =   0.8000
                m1 =      2.8
                m2 =      3.3
               sd1 =      1.2
               sd2 =      1.2
             n2/n1 =     1.00

    Estimated required sample sizes:

                n1 =       72
                n2 =       72
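For readers without Stata, the result above can be cross-checked with a short Python sketch that uses only the standard library. It applies the usual normal-approximation formula for comparing two means with equal group sizes; the function name and its parameters are hypothetical, and the inputs are the ones shown in the Stata assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(m1, m2, sd1, sd2, alpha=0.05, power=0.8, one_sided=True):
    """Per-group sample size for comparing two means.

    Normal approximation with equal group sizes (n2/n1 = 1).
    """
    z = NormalDist()
    # One-sided tests use z(1 - alpha); two-sided tests use z(1 - alpha / 2).
    z_alpha = z.inv_cdf(1 - alpha if one_sided else 1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n = (z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / (m1 - m2) ** 2
    return ceil(n)  # round up to a whole number of people


n_per_group = sample_size_two_means(2.8, 3.3, sd1=1.2, sd2=1.2)
print(n_per_group)  # 72, matching n1 and n2 in the Stata output
```

Passing `one_sided=False` gives a larger sample size, because a two-sided test needs more evidence to reach the same significance level.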

A useful online statistics tutorial is available if you want to find out more:

Statistics Tutorial (tutorialspoint.com)

**Resources**

A very statistical resource:

Feed the Future Population-Based Survey Sampling Guide (usaid.gov)