Measures of dispersal/spread
Characteristics of Distributions

Dispersion (Spread or Scatter)

The Range

The range is the difference between the largest and smallest values in the data. It is easy to calculate but is strongly affected by a single extreme value. For example: 2.3 4.1 5.2 6.9 8.8 9.4 have range = 9.4 - 2.3 = 7.1.

Semi-Interquartile Range

This is defined as:

siqr = ½ (upper quartile - lower quartile)

Again this is fairly easy to calculate, is not so easily affected by freak results and is useful for comparing the dispersion of similarly shaped distributions.

Variance and Standard Deviation

Consider two simple distributions that have the same mean value yet differ in how widely their values are spread about it.

Nothing is gained by considering the mean of the differences between the observations and their mean as a measure of spread, since the positive and negative differences always cancel and that mean is zero for any set of data.

This can be overcome by considering the mean of the numerical deviations of the observations from their mean, i.e. ignoring whether these deviations are negative or positive, defining
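mean deviation $= \dfrac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$

where $x_1, \dots, x_n$ are the observations and $\bar{x}$ is their mean. The mean deviation can be evaluated in this way for each of the above distributions.

However, this quantity is not suitable for algebraic manipulation and the elimination of the negative signs of the deviations is best achieved by squaring and then finding the mean of these squares, i.e. defining: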
variance $= \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$

To obtain a measure of dispersion having the same units as the original variable we define
standard deviation $= \sqrt{\text{variance}} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}$

The standard deviation can be evaluated in this way for each of the above distributions.

Again the statistics facilities of your calculator can be used to find the standard deviation. However, most calculators have two versions of the standard deviation. These are:

(1) $\sigma = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$ (the one we have already seen)

and

(2) $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n-1}}$

Expression (1) is the standard deviation of a set of data values which constitute the totality of those values in which we are interested, i.e. the population. As already mentioned, we are rarely able to study the population exhaustively, so $\sigma$ cannot often be calculated. Calculating expression (1) from all possible samples from a given population and then finding the average produces a value which is smaller than the population standard deviation. Consequently expression (1) is said to produce a biased estimate of the population standard deviation.

It can be shown that changing the divisor n in expression (1) to n-1, to give expression (2), corrects this for the variance: $s^2$ is an unbiased estimate of the variance of a population of which the n data values are a random sample, and $s$ is a less biased estimate of the population standard deviation than expression (1). Consequently $s$ is the value usually calculated. Some texts use a different symbol for this estimate.

[Figure: demonstration of bias, in which 100 samples are taken from a uniform (0,1) distribution.]

Example

Consider again the data on the thickness of the magnetic coating on the flexible disc, i.e.

973 975 976 977 976 980 981 977 979 976

Use your calculator to confirm that $s$, the estimated standard deviation of the population from which this sample is taken, is 2.40 microns.

For grouped data, where the value $x_i$ occurs with frequency $f_i$ and $\bar{x} = \sum f_i x_i / \sum f_i$, the expressions for the standard deviation become

$\sigma = \sqrt{\dfrac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}}$

and

$s = \sqrt{\dfrac{\sum f_i (x_i - \bar{x})^2}{\sum f_i - 1}}$
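For readers working without a calculator, the measures described so far can be checked numerically. The following is a minimal Python sketch, not part of the original notes, applied to the coating-thickness data. The helper names `mean_deviation` and `semi_interquartile_range` are illustrative, the quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, and so are the sample size of 5 and the fixed seed used in the bias demonstration (the notes state only that 100 samples are drawn from a uniform (0,1) distribution).

```python
import math
import random
import statistics

# Coating-thickness sample from the example (microns).
thickness = [973, 975, 976, 977, 976, 980, 981, 977, 979, 976]


def mean_deviation(xs):
    """Mean of the absolute deviations of the values from their mean."""
    m = statistics.fmean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def semi_interquartile_range(xs):
    """Half the gap between the upper and lower quartiles.

    Assumption: quartiles use the 'inclusive' convention; other
    conventions give slightly different values on small samples.
    """
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return (q3 - q1) / 2


data_range = max(thickness) - min(thickness)
siqr = semi_interquartile_range(thickness)
md = mean_deviation(thickness)
sigma_n = statistics.pstdev(thickness)  # divisor n, expression (1)
s = statistics.stdev(thickness)         # divisor n - 1, expression (2)

print(f"range          = {data_range}")
print(f"siqr           = {siqr:.2f}")
print(f"mean deviation = {md:.2f}")
print(f"sd, divisor n  = {sigma_n:.2f}")
print(f"sd, divisor n-1 = {s:.2f}")

# Bias demonstration: 100 samples from a uniform (0,1) distribution.
random.seed(0)                    # assumed seed, only for repeatability
n_samples, sample_size = 100, 5   # sample size is an assumption
true_sd = math.sqrt(1 / 12)       # standard deviation of uniform (0,1)

samples = [[random.random() for _ in range(sample_size)]
           for _ in range(n_samples)]
avg_sigma_n = statistics.fmean([statistics.pstdev(xs) for xs in samples])
avg_s = statistics.fmean([statistics.stdev(xs) for xs in samples])

print(f"true sd            = {true_sd:.4f}")
print(f"average, divisor n   = {avg_sigma_n:.4f}")
print(f"average, divisor n-1 = {avg_s:.4f}")
```

For any single sample the divisor-n value equals the divisor-(n-1) value multiplied by $\sqrt{(n-1)/n}$, so its average is always the smaller of the two; how far each average falls from the true value depends on the sample size chosen.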
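Example

[Table: grouped data for this example, not reproduced.]

Use your calculator to obtain the mean and the standard deviation of the grouped data.

The Coefficient of Variation

As a measure of variability the standard deviation has a magnitude which depends on the magnitude of the data. The coefficient of variation expresses sample variability relative to the mean of the sample:

coefficient of variation $= \dfrac{s}{\bar{x}} \times 100\%$

Since $s$ and $\bar{x}$ have the same units, the coefficient of variation does not depend on the units of measurement.

Example

Summer: 25.1 27.2 24.8 29.5 22.7 28.3 23.2 24.6
Winter: 43.2 37.5 52.8 61.0 41.7 39.8 65.4 38.1

For summer: $\bar{x} = 25.7$, $s = 2.42$, coefficient of variation = 9.4%.
For winter: $\bar{x} = 47.4$, $s = 10.89$, coefficient of variation = 23.0%.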
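The coefficient of variation for the two samples can be computed directly from the data. The sketch below is an illustration rather than part of the original notes; it assumes the usual definition (sample standard deviation with divisor n-1, divided by the sample mean, as a percentage), and the function name `coefficient_of_variation` is hypothetical.

```python
import statistics

# Summer and winter samples from the example.
summer = [25.1, 27.2, 24.8, 29.5, 22.7, 28.3, 23.2, 24.6]
winter = [43.2, 37.5, 52.8, 61.0, 41.7, 39.8, 65.4, 38.1]


def coefficient_of_variation(xs):
    """Sample standard deviation (divisor n - 1) as a percentage of the mean."""
    return 100 * statistics.stdev(xs) / statistics.fmean(xs)


for label, xs in (("summer", summer), ("winter", winter)):
    print(f"{label}: mean = {statistics.fmean(xs):.1f}, "
          f"s = {statistics.stdev(xs):.2f}, "
          f"CV = {coefficient_of_variation(xs):.1f}%")
```

Relative to its mean, the winter sample is more than twice as variable as the summer sample, a comparison that the raw standard deviations alone would overstate because the winter values are also larger.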
from SHU Science & Maths, 1998