Measures of central tendency
Characteristics of Distributions
Central Tendency (or Location)
If the observations are grouped, i.e. $x_1$ occurs $f_1$ times, $x_2$ occurs $f_2$ times, etc., then
$$\bar{x} = \frac{\sum_i f_i x_i}{\sum_i f_i}$$
This situation arises in two cases:
(i) when the classes are individual discrete values as in the radiation data:
The easiest way to perform the calculation is to use the statistical facilities of a scientific calculator. Find the "mean" button on your calculator, usually marked $\bar{x}$, and make sure you know how to use it! Consider the following examples.
Thickness 973 975 976 977 976 980 981 977 979 976
The mean thickness is given by 977 microns
(ii) We have already met the following data on the heights of trees
In this case the mean is 172.7 cm.
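As a cross-check on the calculator, the mean can be sketched in a few lines of Python. The example below uses the ten thickness readings quoted above and computes the mean twice: once directly, and once in grouped form, weighting each distinct value by the number of times it occurs. The variable names are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

# Thickness readings (microns) from the example above.
thickness = [973, 975, 976, 977, 976, 980, 981, 977, 979, 976]

# Direct mean: sum of the observations divided by their count.
mean_raw = sum(thickness) / len(thickness)

# Grouped form: each distinct value x occurs f times,
# so the mean is sum(f * x) / sum(f).
freq = Counter(thickness)
mean_grouped = sum(f * x for x, f in freq.items()) / sum(freq.values())

print(mean_raw, mean_grouped)
```

Both routes give the same answer, since grouping only collects repeated values together.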
(a) Discrete Distribution
2, 5, 6, 8, 13, 15, 19, 22, 38 have median 13
(b) Continuous Distributions
The median divides the area under the frequency distribution diagram (i.e. the histogram) into two equal parts. Although there is a special formula for calculating the median of a grouped frequency distribution, it is probably easiest to estimate it from the ogive: draw a line across at the 50% level on the % cumulative frequency (vertical) scale to the curve, then down to the horizontal scale, and read off the estimate (see the picture above of the ogive).
The main difference between the median and the mean is that the median is insensitive to extreme values since, unlike the arithmetic mean, it does not take into account the actual value of each observation, but only considers the rank of each measurement.
The median is useful in such areas as lifetime testing of components.
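The insensitivity of the median to extreme values can be illustrated with a short Python sketch. It takes the nine discrete values quoted above and replaces the largest one with a much larger figure; the replacement value 380 is a made-up number chosen purely for illustration.

```python
import statistics

# The nine discrete values from the example above.
values = [2, 5, 6, 8, 13, 15, 19, 22, 38]

# Same data with the largest value replaced by a hypothetical extreme one.
with_outlier = values[:-1] + [380]

median_before = statistics.median(values)
median_after = statistics.median(with_outlier)
mean_before = statistics.mean(values)
mean_after = statistics.mean(with_outlier)

print("median:", median_before, "->", median_after)
print("mean:  ", round(mean_before, 2), "->", round(mean_after, 2))
```

The median depends only on which observation sits in the middle of the ranking, so it is unchanged, whereas the mean is pulled strongly towards the extreme value.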
For a distribution of a continuous variable the quartiles are, like the median, easy to estimate from the ogive by drawing lines across at 25% and 75% on the vertical scale and then down to the horizontal scale and reading them off (again see the picture above of the ogive).
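Reading a value off the ogive amounts to interpolating along the cumulative frequency curve. The sketch below does this numerically, assuming the ogive is drawn as straight-line segments between class boundaries. The function name `ogive_value`, the class boundaries and the frequencies are all hypothetical and are not the tree-height data.

```python
def ogive_value(percent, boundaries, frequencies):
    """Estimate the value below which `percent` % of observations fall,
    by linear interpolation along a straight-line ogive."""
    total = sum(frequencies)
    cumulative = 0.0
    for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
        start_pct = 100.0 * cumulative / total   # % cumulative at lower boundary
        cumulative += f
        end_pct = 100.0 * cumulative / total     # % cumulative at upper boundary
        if f > 0 and percent <= end_pct:
            fraction = (percent - start_pct) / (end_pct - start_pct)
            return lower + fraction * (upper - lower)
    return boundaries[-1]

# Hypothetical grouped data: four classes of width 10.
boundaries = [0, 10, 20, 30, 40]
frequencies = [5, 15, 20, 10]

lower_quartile = ogive_value(25, boundaries, frequencies)
median = ogive_value(50, boundaries, frequencies)
upper_quartile = ogive_value(75, boundaries, frequencies)

print(lower_quartile, median, upper_quartile)
```

A hand-drawn ogive is often a smooth curve rather than a polygon, so estimates read from a graph may differ slightly from these interpolated values.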
For example, for the data on the height of nine-year-old trees, the modal class is 169.5 - 199.5 cm, so that the class midpoint, 184.5 cm, would be taken as the mode.
The disadvantage of the mode as a measure of location is that it is not always unique, i.e. a distribution can have more than one mode.
For grouped data the mode is not uniquely defined, since changing the class intervals may give different maximum frequencies.
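The non-uniqueness of the mode is easy to see with ungrouped data. The following Python sketch uses `statistics.multimode`, which returns every value that attains the highest frequency. The thickness readings are those quoted earlier; the second list is a made-up example chosen to have two modes.

```python
import statistics

# Thickness readings (microns) quoted earlier: a single mode.
thickness = [973, 975, 976, 977, 976, 980, 981, 977, 979, 976]
thickness_modes = statistics.multimode(thickness)

# Hypothetical data in which two values tie for the highest frequency.
two_peaks = [1, 1, 2, 2, 3]
two_peaks_modes = statistics.multimode(two_peaks)

print(thickness_modes)   # one mode
print(two_peaks_modes)   # two modes
```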
Dispersion (Spread or Scatter)
For example: 2.3 4.1 5.2 6.9 8.8 9.4 have range = 9.4 - 2.3 = 7.1.
Semi-Interquartile Range
This is defined as: siqr = ½ (upper quartile - lower quartile)
Again this is fairly easy to calculate, is not so easily affected by freak results and is useful for comparing the dispersion of similarly shaped distributions.
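Both measures of spread can be sketched in Python using the six values from the range example. Note that there are several conventions for locating quartiles in a small sample; `statistics.quantiles` with its default settings is only one of them, so the semi-interquartile range it gives here should be treated as an illustration and may differ slightly from a value read off an ogive or found by another rule.

```python
import statistics

# The six values from the range example above.
data = [2.3, 4.1, 5.2, 6.9, 8.8, 9.4]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Quartiles under the library's default convention (an assumption here).
lower_quartile, _, upper_quartile = statistics.quantiles(data, n=4)

# Semi-interquartile range: half the distance between the quartiles.
siqr = 0.5 * (upper_quartile - lower_quartile)

print(round(data_range, 2), round(siqr, 2))
```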
|from SHU Science & Maths, 1998|