|
Average most often refers to the
arithmetic mean, but the term is ambiguous and may also refer to the mode, median, or midrange. |
You should always clarify which average is being used, preferably by using a more specific term. Averages give us information about a typical element of a data set. They are measures of central tendency.
|
Mean most often refers to the
arithmetic mean, but is also ambiguous. Unless specified otherwise, we will assume arithmetic mean whenever the term mean is used. |
| The
Arithmetic Mean is
obtained by summing all elements of the data set and dividing by the number of elements. |
Symbolically, the arithmetic mean is expressed as
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
where $\bar{x}$ (pronounced "x-bar") is the arithmetic mean for a sample and $\Sigma$ is the capital Greek letter sigma and indicates summation.
$x_i$ refers to each element of the data set as $i$ ranges from 1 to $n$. $n$ is the number of elements in the data set.
The equation is essentially the same for finding a population mean;
however, the symbol for the population mean is the
small Greek letter µ (mu).
As we will also see in lesson 5,
Roman letters usually represent sample statistics,
whereas Greek letters usually represent population parameters.
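The definition above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `arithmetic_mean` is made up for this example, and the data are the quiz scores used later in this lesson.

```python
def arithmetic_mean(data):
    """Sum all elements of the data set and divide by the number of elements."""
    n = len(data)  # n is the number of elements in the data set
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(data) / n


scores = [1, 7, 8, 9, 10]
print(arithmetic_mean(scores))  # (1+7+8+9+10)/5 = 35/5 = 7.0
```

The same computation applies to a population mean; only the symbol changes.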
| Sample Size is the number of elements in a sample. It is referred to by the symbol n. |
Be sure to use a lower case n for sample size. An upper case N refers to Population Size, unless being used in the context of a normally distributed population.
| Mode is the data element which occurs most frequently. |
A useful mnemonic is to alliterate the words mode and most. Alliterations start with the same sound like: "seven slippery slimy snakes...".
Some data sets contain no repeated elements. In this case, there is no mode (or the mode is the empty set). It is also possible for two or more elements to be repeated with the same frequency. In these cases, there are two or more modes and the data set is said to be bimodal or multimodal. In the rare instance of a uniform or nearly uniform distribution, one where each element is repeated the same or nearly the same number of times, one could term it multimodal, but some authors invoke subjectivity by specifying multimodality only when separate, distinct, and fairly high peaks (ignoring fluctuations due to randomness) occur.
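One way to sketch these cases in Python is shown below. The function name `modes` is hypothetical, and the code assumes the convention of this lesson: a data set with no repeated elements has no mode, reported here as an empty list.

```python
from collections import Counter


def modes(data):
    """Return a sorted list of the most frequent elements.

    An empty list means "no mode" (no element is repeated).
    """
    counts = Counter(data)  # maps each element to how often it occurs
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no repeated elements, so no mode under this convention
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 1, 2, 4, 7]))   # one mode
print(modes([1, 7, 8, 9, 10]))  # no repeated elements
print(modes([1, 1, 2, 2, 3]))   # two modes (bimodal)
```

Note that Python's standard `statistics.multimode` follows a different convention and returns every element when none is repeated. This sketch also makes no attempt at the more subjective "distinct peaks" notion of multimodality.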
| The Median is the middle element when the data set is arranged in order of magnitude. |
A useful mnemonic is to remember that the median is the grassy strip (in the rural area of the Midwest where I come from) that divides opposing lanes on a highway. It is in the middle.
If there are an odd number of data elements, the median is a member of the data set. If there are an even number of data elements, the median is computed as the arithmetic mean of the middle two.
The median has other names which will be studied in
lesson 7.
The symbol $\tilde{x}$ (pronounced "x-tilde") is sometimes used for the median,
but will not be used here.
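The odd and even cases described above can be sketched as follows. The function name `median` is chosen for illustration only; the second data set is made up to show the even case.

```python
def median(data):
    """Return the middle element of the data once arranged in order of magnitude."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of elements: the median is a member of the data set.
        return ordered[middle]
    # Even number of elements: arithmetic mean of the middle two.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([10, 1, 8, 7, 9]))  # odd count: the middle element, 8
print(median([1, 1, 2, 4]))      # even count: (1 + 2) / 2 = 1.5
```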
| The Midrange is the arithmetic mean of the highest and lowest data elements. |
Midrange is a type of average. Range is a measure of dispersion and will be studied in lesson 5. A common mistake is to confuse the two.
Symbolically, midrange is computed as $(x_{\max} + x_{\min})/2$.
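Since midrange and range are easily confused, a short sketch placing them side by side may help. The function names `midrange` and `data_range` are hypothetical, and the range is taken here as the highest element minus the lowest.

```python
def midrange(data):
    """Arithmetic mean of the highest and lowest elements (a type of average)."""
    return (max(data) + min(data)) / 2


def data_range(data):
    """Highest element minus lowest element (a measure of dispersion)."""
    return max(data) - min(data)


scores = [1, 7, 8, 9, 10]
print(midrange(scores))    # (10 + 1) / 2 = 5.5
print(data_range(scores))  # 10 - 1 = 9
```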
Some basic facts regarding averages are as follows.
The midrange and possibly the median are the arithmetic mean of two data set elements. One additional significant digit may be necessary to accurately convey this information.
The number of significant digits for the mean should conform to one of the following rules.
| Presenting more than five significant digits is probably a joke and points will be deducted! |
In 1894 the physicist Michelson, apparently quoting Kelvin, said: "it seems probable that most of the grand underlying principles have now been firmly established and that further advances are to be sought chiefly in the rigorous application of these principles to all the phenomena which come under our notice....future truths of physical science are to be looked for in the sixth place of decimals." Relativity and quantum mechanics soon revolutionized physics, and we were soon looking at details in the ninth place! My dissertation reported results for the cesium D1 transition centroid frequency as: 335 116 048 748.2(2.4) kHz.
17. What is the average of: 1, 1, 2, 4, 7?
As we have seen in this lecture, this is a rather ambiguous question and the answers 1 (mode), 2 (median), 3.0 (mean), and 4.0 (midrange) are all possible and correct!
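The four answers can be checked with Python's standard `statistics` module. This is a sketch rather than a prescribed method: the variable names are arbitrary, and the midrange is computed by hand from `max` and `min`.

```python
import statistics

data = [1, 1, 2, 4, 7]

mode_values = statistics.multimode(data)      # list of the most frequent elements
median_value = statistics.median(data)        # middle element of the sorted data
mean_value = statistics.fmean(data)           # arithmetic mean as a float
midrange_value = (max(data) + min(data)) / 2  # mean of the highest and lowest

print(f"mode={mode_values}, median={median_value}, "
      f"mean={mean_value:.1f}, midrange={midrange_value:.1f}")
```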
Example: A sample of size 5 (n=5) is taken of student quiz scores with the following results: 1, 7, 8, 9, 10.
Answer: The mean is (1+7+8+9+10)/5 = 35/5 = 7.0 (note one more decimal place is given).
All scores occur only once, hence there is no mode. The median score is 8 (not 8.0). The midrange is (10+1)/2 = 5.5 (note the extra decimal place is required).
An extreme score (1) distorts the mean, so perhaps the median is a better measure of central tendency. For a larger data set, this could be further described in terms of the symmetry and skewness of the data set: the median, and generally the mean, lies to the left of the mode (negatively skewed), to the right of the mode (positively skewed), or at the mode (zero skewness). Positive skew is more common, since lower limits cut off small values while exceptionally large values remain possible. A case in point is annual earnings: the left tail is cut off at zero, whereas the right tail is stretched far out by the likes of Bill Gates and Warren Buffett.
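To see how much the extreme score pulls on each measure, one can recompute both with that score left out. The sketch below does this for the quiz scores; dropping the value is purely for illustration and is not a recommendation to discard data.

```python
import statistics

scores = [1, 7, 8, 9, 10]                # quiz scores from the example
trimmed = [s for s in scores if s != 1]  # the same scores without the extreme value

# How far each measure moves when the extreme score is left out.
mean_shift = statistics.fmean(trimmed) - statistics.fmean(scores)
median_shift = statistics.median(trimmed) - statistics.median(scores)

print(f"mean moves by {mean_shift:.1f}, median moves by {median_shift:.1f}")
```

The mean moves from 7.0 to 8.5, while the median moves only from 8 to 8.5, which is the sense in which the median is less sensitive to the extreme score.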
Further examples involving the TI-83+ graphing calculator will be given with the data presented as Stem-and-Leaf Diagrams and Frequency Distribution Tables.