While searching my hard drive for images representative of data collection, I stumbled upon something I had been meaning to look for anyway: guidance on how to determine the minimum required sample size. The formulas are usually variations on this equation:

n ≥ (*z* · σ / MOE)²

where:

n = minimum sample size

*z* = z-score (e.g., 1.96 for a 95% confidence level)

σ = standard deviation (strictly the population value, but in practice estimated by the standard deviation of an initial sample)

MOE = margin of error (e.g., the largest acceptable difference between the sample and population means, in units of whatever you’re measuring)
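As a quick illustration of the arithmetic, here is a minimal Python sketch of the equation above. The function name `min_sample_size` and the example numbers (a standard deviation of 10 and an MOE of 2, in the same units) are made up for illustration; the result is rounded up because a sample size has to be a whole number.

```python
import math


def min_sample_size(z, sigma, moe):
    """Return the smallest whole n satisfying n >= (z * sigma / moe)^2.

    z     -- z-score for the desired confidence level (e.g., 1.96 for 95%)
    sigma -- standard deviation, typically estimated from an initial sample
    moe   -- MOE, in the same units as sigma; must be positive
    """
    if moe <= 0:
        raise ValueError("moe must be positive")
    if sigma < 0:
        raise ValueError("sigma cannot be negative")
    # Round up: a fractional result still requires one more full observation.
    return math.ceil((z * sigma / moe) ** 2)


# Illustrative values only: (1.96 * 10 / 2)^2 = 96.04, which rounds up to 97.
print(min_sample_size(1.96, 10.0, 2.0))  # 97
```

Note how the result scales: halving the MOE quadruples the required sample size, since the ratio is squared.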

The initial sample should contain at least 30 observations. In theory this method only applies to data that are normally distributed, which a lot of data aren’t. Process times for many activities tend to be skewed right, where most of the values cluster at the low end with a long tail of higher values.

Other forms of this calculation are easily located via search.

A form of this calculation will be added to the simulation framework as the data collection capabilities are implemented. The data collection interface will provide ongoing estimates of required sample size as the sample data are collected, and will let the user know when the minimum number of data points is reached.
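To make the idea of an ongoing estimate concrete, here is a rough Python sketch of how such a running check might work. It is not the framework's actual interface: the class name `SampleSizeMonitor`, its methods, and the default of 30 initial observations are all assumptions for illustration. It re-estimates the standard deviation from the data collected so far and reports whether the current count has caught up with the requirement.

```python
import math
import statistics


class SampleSizeMonitor:
    """Hypothetical helper that tracks whether enough data has been collected."""

    def __init__(self, moe, z=1.96, min_initial=30):
        if moe <= 0:
            raise ValueError("moe must be positive")
        if min_initial < 2:
            raise ValueError("min_initial must be at least 2")
        self.moe = moe                  # MOE, in the units of the measurements
        self.z = z                      # z-score for the confidence level
        self.min_initial = min_initial  # observations needed before estimating
        self.values = []

    def add(self, value):
        """Record one new observation."""
        self.values.append(value)

    def required_n(self):
        """Current estimate of the minimum sample size, or None if too early."""
        if len(self.values) < self.min_initial:
            return None
        s = statistics.stdev(self.values)  # sample standard deviation so far
        return math.ceil((self.z * s / self.moe) ** 2)

    def is_sufficient(self):
        """True once the number of observations meets the current estimate."""
        needed = self.required_n()
        return needed is not None and len(self.values) >= needed


# Illustrative use with made-up process times alternating between 9 and 11.
monitor = SampleSizeMonitor(moe=0.5)
for i in range(30):
    monitor.add(9.0 if i % 2 == 0 else 11.0)
print(monitor.required_n(), monitor.is_sufficient())
```

Because the estimate depends on the standard deviation of the data seen so far, the required count can move up or down as new points arrive, which is why the check is repeated rather than computed once.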