An Introduction to Statistics
Review of Statistics Lessons 5 and 6
Lesson 5: Measures of Dispersion
- Dispersion describes how widely the elements of a data set are spread out.
- Common measures of dispersion are range, standard deviation, and variance.
- Range is the difference between the highest and lowest data element.
- Range is easily distorted because it uses only two elements, the highest and the lowest.
- Standard deviation is by far the most important measure of dispersion.
- Standard deviation is, loosely, the average distance of each data element from the mean (more precisely, the root-mean-square distance).
- The formula for standard deviation varies depending on whether it is for a sample or a population.
- Sample standard deviation is denoted by s, whereas population standard deviation is denoted by σ.
- This use of Roman characters for sample statistics and Greek characters for population parameters is standard.
- The sample standard deviation is slightly larger because of its dependence on the sample mean (its formula divides by n - 1 rather than n).
- Degrees of freedom is an important statistic in any statistical study.
- Standard deviation is the square root of the variance.
- Standard deviation has the same units as the data, so it can be easier to understand.
- In general, the range of a sample is about four times its standard deviation (range rule of thumb).
- Three is the smallest sample size where standard deviation is meaningful.
- Variance is the primary statistic and standard deviation is derived from it, so be careful with precision and accuracy.
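
The measures reviewed in Lesson 5 can be checked with a short calculation. The following is a minimal sketch in Python using the standard library's statistics module. The data values are hypothetical and chosen only so the arithmetic is easy to follow; they do not come from the lessons.

```python
import math
import statistics

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest element minus lowest element.
data_range = max(data) - min(data)

# Population formulas divide by n; sample formulas divide by n - 1.
pop_variance = statistics.pvariance(data)
pop_std = statistics.pstdev(data)
sample_variance = statistics.variance(data)
sample_std = statistics.stdev(data)

# Standard deviation is the square root of the variance.
std_from_variance = math.sqrt(sample_variance)

# Range rule of thumb: the range is about four standard deviations,
# so range / 4 gives a rough estimate of the standard deviation.
rough_std = data_range / 4

print("range:", data_range)
print("population variance and standard deviation:", pop_variance, pop_std)
print("sample variance and standard deviation:", sample_variance, sample_std)
print("range rule of thumb estimate:", rough_std)
```

For this data the sample standard deviation comes out slightly larger than the population value, as the review notes, and the range rule of thumb gives only a rough estimate.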
Lesson 6: The Normal, Bell-shaped, Gaussian Distribution
- The Normal Distribution has two other names: Gaussian, Bell-shaped.
- Error distributions and many other phenomena tend toward a normal distribution.
- The normal distribution is symmetric.
- A standard normal distribution has an area of 1, mean of 0, and standard deviation of 1.
- The empirical rule states that, for a normal distribution, about 68%, 95%, and 99.7% of
a data set lie within 1, 2, and 3 standard deviations of the mean, respectively.
- IQ scores, with a mean of 100 and a standard deviation of 15, are a common example of a nonstandard normal distribution.
- The thin parts of a distribution are called tails.
- A statistical study may be interested in one tail (the left or the right) or in both.
- The Math & Science Center draws students from the upper tail of the IQ curve.
- Blossomland, in contrast, draws students from the lower tail of the IQ curve.
- In theory, the tails are of infinite extent.
- In practice, the tails are especially difficult to measure.
- Chebyshev's Theorem applies to any distribution.
- Chebyshev's Theorem guarantees that at least 1 - 1/K^2 of the data lie within K
standard deviations of the mean, for K > 1.
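
The empirical rule and Chebyshev's Theorem can be compared numerically. The sketch below uses Python's statistics.NormalDist (available from Python 3.8). The helper names normal_fraction_within and chebyshev_lower_bound are made up for this illustration, and the IQ cutoff of 130 is an assumed value picked because it sits two standard deviations above the mean.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def chebyshev_lower_bound(k):
    """Minimum fraction within k standard deviations for any distribution (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2


# Empirical rule: roughly 68%, 95%, and 99.7% for k = 1, 2, 3.
for k in (1, 2, 3):
    print(k, round(normal_fraction_within(k), 4))

# Chebyshev's guarantee is weaker, but it holds for any distribution.
for k in (2, 3):
    print(k, chebyshev_lower_bound(k))

# Nonstandard example: IQ scores with mean 100 and standard deviation 15.
iq = NormalDist(mu=100, sigma=15)
share_above_130 = 1 - iq.cdf(130)  # upper tail beyond two standard deviations
print("share of IQ scores above 130:", round(share_above_130, 4))
```

For K = 2, Chebyshev's Theorem guarantees at least 75% of the data, while the normal distribution places about 95% within the same interval. The gap reflects that Chebyshev's bound makes no assumption about the shape of the distribution.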