Chebyshev's theorem states that for any set of observations, the proportion of values that lie within k standard deviations of the mean is at least 1 - (1/k^2), as long as k is greater than 1. If k is 3, at least 89% of the observations lie within that region (1 - 1/9 ≈ 0.889), and if k is 4, at least 94% do (1 - 1/16 = 0.9375).
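To see the bound in action, here is a minimal Python sketch. The function names and the sample data are made up for illustration, and the sample is deliberately skewed to show that the guarantee does not rely on a bell shape. It treats the sample as the whole population, so it uses the population standard deviation.

```python
import statistics


def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2


def proportion_within(data, k):
    """Observed proportion of values within k population standard deviations."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)


# Hypothetical, deliberately skewed sample (one large outlier).
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]

for k in (2, 3, 4):
    # The observed proportion should never fall below the bound.
    print(k, round(chebyshev_bound(k), 4), proportion_within(sample, k))
```

The observed proportion is usually well above the bound, since Chebyshev's theorem is a worst-case guarantee that has to hold for every possible distribution.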
For a normal probability distribution we use the normal curve, or bell curve. It is symmetric, with a single peak in the centre of the distribution. At this centre point the mean, the median and the mode are all equal. We can now introduce a new concept: z-values. A z-value is the difference between a selected value (Xi) and the population mean, divided by the population standard deviation. Another note on the normal curve is that it has an excess kurtosis of 0 (equivalently, a kurtosis of 3; many tools report the excess value). A higher kurtosis means heavier tails and usually a higher, more pointed peak; a lower kurtosis means lighter tails and a flatter shape.
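To make the z-value definition concrete, here is a small Python sketch. The `z_scores` helper and the height figures are hypothetical, and the list is treated as the full population, so the population standard deviation is used.

```python
import statistics


def z_scores(values):
    """Return (x - mean) / standard deviation for each value in the population."""
    mu = statistics.fmean(values)      # population mean
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]


# Hypothetical heights in cm, treated as the whole population.
heights = [150, 160, 165, 170, 175, 180, 190]
z = z_scores(heights)

for height, score in zip(heights, z):
    # Negative scores lie below the mean, positive scores above it.
    print(height, round(score, 2))
```

A value equal to the mean gets a z-value of 0, and the sign shows which side of the mean an observation falls on.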
Back to the z-scores. They link the theoretical normal distribution to the actual observations. A z-score tells us how many standard deviations away from the mean an observation lies. So, we need to calculate the z-score. Calculating z-scores essentially converts your data to a scale with a mean of 0 and a standard deviation of 1. The shape of the distribution does not change, so the result follows the standard normal curve only if the original data were normally distributed. The formula is as follows: