
Statistics: Probabilities and Distributions - Lesson 8

Distributions in General and Expected Value

Lesson Overview

Discrete vs Continuous

In our lessons on Descriptive Statistics we noted various measures of how a data set is distributed. Special emphasis was given to measures of central tendency (averages and/or means) and measures of dispersion, that is, how a data set is spread. Assumptions we can make about these important measures are useful in determining how correct the inferences we might try to make about a population are. A study of the common distributions is then in order and will occur over the next several lessons.

In general, distributions often have an overall shape, center, and spread. There may be outliers, or not. Tails (wings) may be thick or thin. The distribution may be skewed to the right or left. The purpose of descriptive statistics and exploratory data analysis was to quantify and/or get a feel for these distribution shapes. The normal distribution is often the "gold standard" by which data sets are compared.

Specifically, whether or not an observation is an outlier is, to some extent, a matter of judgment. An outlier is an individual observation that deviates from or falls outside the overall pattern. Outliers, like the old Supreme Court definition of pornography ("I know it when I see it."), can be hard to define.

Distributions are commonly symmetric. That is, the right and left sides are approximately mirror images of each other. Even uniform or multimodal distributions can be symmetric. If they are not symmetric they are typically heap-shaped or mound-shaped. We term a distribution skewed to the right if the right side extends much further out than the left side (usually the mean would then be to the right of the median) and skewed to the left if the left side extends much further out than the right side (usually the mean would then be to the left of the median). We do not wish to get involved with the technical definition of skewness in terms of the third moment, or to catalog exceptions to the mean/median heuristic above, but instead refer you to this site for details on both.

In Statistics lesson 1 we also noted that data can be discrete or continuous. Again, it can be hard to differentiate between the two due to quantum mechanics and uncertainties about measurement accuracy. Hence discrete distributions are commonly encountered and continuous distributions are at least possible mathematically. The normal distribution is the most important continuous distribution, and whenever n is sufficiently large (generally over 30), we often treat a discrete distribution using assumptions derived from the normal distribution, which have been shown to be accurate enough.

We note two fundamental rules regarding distributions.

First, all probabilities are between 0 and 1     (0 ≤ P(x) ≤ 1).
Second, all probabilities in a distribution sum to 1     (Σ P(x) = 1)
(i.e. it is certain your outcome is in the sample space).

Example: Test the following function to determine whether or not it is a probability distribution.
P(x) = (5 - x)/10 when x = 1, 2, 3, 4.
Solution:
x     P(x)
1     2/5  = 0.40
2     3/10 = 0.30
3     1/5  = 0.20
4     1/10 = 0.10

It works! All probabilities are between zero and one and summing the last column gives 10/10 = 1.00
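
The same check can be sketched in a few lines of Python. This is only an illustration of the two rules, not part of the original lesson; the function name is_probability_distribution is our own, and exact fractions are used so that the sum comes out to exactly 1.

```python
from fractions import Fraction

def is_probability_distribution(probabilities):
    """Return True when every value lies in [0, 1] and the values sum to 1."""
    values = list(probabilities)
    in_range = all(0 <= p <= 1 for p in values)
    return in_range and sum(values) == 1

# P(x) = (5 - x)/10 for x = 1, 2, 3, 4, kept as exact fractions
dist = {x: Fraction(5 - x, 10) for x in range(1, 5)}

# Both rules hold for this function, so is_valid is True
is_valid = is_probability_distribution(dist.values())
```

With floating-point probabilities the sum may miss 1 by a tiny rounding error, so a tolerance (for example math.isclose) would be the safer comparison there.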

Probability Distributions of a Discrete Random Variable

Consider the probability distribution of two tossed (fair) coins (where x is the number of heads). Note the mound shape.
x     P(x)
0     ¼
1     ½
2     ¼
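
One way to see where the two-coin table comes from is to list the four equally likely outcomes and count heads in each. The sketch below assumes fair, independent coins; the variable names are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coins: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

# Count the heads in each outcome, then turn counts into probabilities
head_counts = Counter(outcome.count("H") for outcome in outcomes)
coin_dist = {x: Fraction(n, len(outcomes)) for x, n in sorted(head_counts.items())}
```

Here coin_dist maps 0, 1, and 2 heads to ¼, ½, and ¼. Changing repeat=2 to a larger number gives the distribution for more coins.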

Consider further the pips displayed on a (fair) die:
x     P(x)
1     1/6
2     1/6
3     1/6
4     1/6
5     1/6
6     1/6
This is a constant function or uniform probability distribution.

Normal Distributions

We will continue by assuming the student remembers several facts about the normal distribution which were reviewed before this series of lessons. Specifically, the normal distribution is also known as the bell-shaped or Gaussian distribution. It is symmetric. If the mean is 0 and the standard deviation is 1, we have a standard normal distribution. If the mean is not 0 or the standard deviation is not 1, we have a non-standard normal distribution. IQ values with a mean of 100 and standard deviation of 15 are a typical example of a non-standard, approximately normal distribution which we often treat as if it were normally distributed. We use z-scores, z = (x - mean)/(standard deviation), to convert non-standard normal distributions to the standard normal distribution. The empirical rule states that about 68% of normally distributed data falls within 1 standard deviation of the mean, about 95% falls within 2 standard deviations of the mean, and about 99.7% falls within 3 standard deviations of the mean. In fact, your TI-83+ graphing calculator has the normal cumulative distribution (closely related to the "error function", erf) programmed in under DISTR (2nd VARS) as normalcdf(lower,upper), where lower and upper are the limits of the region of interest. Tables of values are also commonly available and the ability to read and interpret them is important as well.
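
For readers without the calculator at hand, the area between two limits can be sketched in Python from the error function in the standard math module. The function names below are our own, and normal_cdf_between is only meant to play a role similar to the calculator's normalcdf; it is not the calculator's implementation.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf_between(lower, upper, mean=0.0, std=1.0):
    """Area under a normal curve between lower and upper."""
    z_lower = (lower - mean) / std  # convert each limit to a z-score
    z_upper = (upper - mean) / std
    return standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)

# Empirical rule: area within 1, 2, and 3 standard deviations of the mean
within = {k: normal_cdf_between(-k, k) for k in (1, 2, 3)}
```

The three areas come out near 0.6827, 0.9545, and 0.9973, which is where the rounded 68%, 95%, and 99.7% figures come from.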

The table below gives the area under the standard normal curve between z=0 and a given z. Read down the left column to find the ones and tenths digits of z, then across to the column whose heading supplies the hundredths digit. Such tables may clarify why z scores are so typically reported to two decimal places! Warning: Although every effort has been made to verify these numbers (on a TI-83 graphing calculator), errors may still be present. Also, the table is somewhat incomplete due to lack of space.

z      x.x0   x.x1   x.x2   x.x3   x.x4   x.x5   x.x6   x.x7   x.x8   x.x9
0.0x  .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
0.1x  .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
0.2x  .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
0.3x  .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
0.4x  .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879
0.5x  .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
0.6x  .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2517  .2549
0.7x  .2580  .2611  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
0.8x  .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
0.9x  .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389

1.0x  .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
1.1x  .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
1.2x  .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
1.3x  .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
1.4x  .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319
1.5x  .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
1.6x  .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
1.7x  .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
1.8x  .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
1.9x  .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767

2.0x  .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
3.0x  .4987  .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990
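
Any row of such a table, including the rows omitted here, can be regenerated with a short Python sketch. It assumes the entries are areas between 0 and z rounded to four decimal places; the function names are our own.

```python
import math

def area_from_zero(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def table_row(first_digits):
    """Ten entries for z = first_digits + 0.00, 0.01, ..., 0.09, to 4 places."""
    return [round(area_from_zero(first_digits + k / 100), 4) for k in range(10)]

# Rebuild the 1.0x row; table_row(2.1) would give one of the omitted rows
row_1_0 = table_row(1.0)
```

The first entry of row_1_0 is the area for z=1.00 and the last is the area for z=1.09, matching the left-to-right order of the columns.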

Example: Find the probability for a data value to fall between the mean (z=0.00) and one standard deviation (z=1.00) above the mean, assuming the population is normally distributed.
Solution: The table above gives the value 0.3413 or 34.13%. This agrees with the empirical rule, which gives about 68% ÷ 2 = 34%.

Example: Find the probability for IQ values between 75 and 130, assuming a normal distribution, mean = 100 and std = 15.
Solution: An IQ of 75 corresponds with a z score of (75 - 100)/15 = -1.67 and an IQ of 130 corresponds with a z score of (130 - 100)/15 = 2.00. Because the normal distribution is symmetric, the area between z=-1.67 and z=0 equals the area between z=0 and z=1.67, which the table gives as .4525 (in IQ terms, the probability of an IQ between 75 and 100 equals the probability of an IQ between 100 and 125). For 2.00 we find .4772. The probability of an IQ between 75 and 130 is the probability of an IQ between 75 and 100 plus the probability of an IQ between 100 and 130, or .4525+.4772=.9297. Including a sketch like in Statistics lesson 6 would be appropriate.
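
The IQ calculation can be cross-checked with a short Python sketch. It assumes the same mean of 100 and standard deviation of 15, and computes the area once with z-scores rounded to two places (as when reading a table) and once with unrounded z-scores. The helper name standard_normal_cdf is our own.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean, std = 100.0, 15.0
z_low = (75 - mean) / std    # about -1.67
z_high = (130 - mean) / std  # exactly 2.00

# z-scores rounded to two places, as when reading a printed table
p_table = standard_normal_cdf(round(z_high, 2)) - standard_normal_cdf(round(z_low, 2))

# Unrounded z-scores, as a calculator or computer would use
p_exact = standard_normal_cdf(z_high) - standard_normal_cdf(z_low)
```

Both results are close to 0.93. The small difference between them, and between p_table and the hand-summed table value, comes only from rounding.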

Expected Value

Let's look at a specific distribution so we can introduce the topic of expected value. Consider the discrete random variable of the sum of pips on two rolled dice. If a random sample is taken, as our sample becomes larger, it becomes clear that the random variable x takes on any integer value between two and twelve, inclusive. The distribution is likely to become mound-shaped, especially if n is larger than, say, 100. In theory, we would expect our distribution to approach the distribution of 1/36 for two, 2/36 for three, up to 6/36 for seven, 5/36 for eight, and down to 1/36 for twelve. See lesson 1 for the complete sample space.

In practice, however, the dice could be weighted on one side, out of square, or even slightly rounded to skew the results. It would be easier, of course, to roll each die separately and verify that its distribution is uniform, with each value occurring with a probability close to 1/6 (see the example above). However, such separation of variables is not always possible in the real world. This is often referred to as confounding variables. The question might then arise as to how big a sample must be taken before we can be "sure" something is amiss. Of course, random variables being what they are, in theory one could roll a million sixes in a row. But also in theory, the probability of this occurring is vanishingly small. Hence we typically set a threshold as to how often we would like to be right. 95% is a typical threshold for non-life-threatening situations, whereas 99% or higher is a typical threshold if more confidence is needed. We refer to these as a 95% confidence level or a 99% confidence level. These correspond with alpha=.05 and alpha=.01. More on this topic later. As important as these concepts are, we have wandered away from our goal.
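
The theoretical two-dice distribution described above, and the way a sample drifts toward it as n grows, can be sketched in Python. The simulation assumes fair dice and uses Python's pseudo-random generator with a fixed seed so the run is repeatable; the function name and the choice of 10,000 rolls are ours.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product

# Theoretical distribution: enumerate all 36 equally likely (die1, die2) pairs
pair_sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
theoretical = {total: Fraction(count, 36) for total, count in sorted(pair_sums.items())}

def simulate_two_dice(n_rolls, seed=0):
    """Relative frequency of each sum over n_rolls simulated rolls of two fair dice."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls))
    return {total: counts[total] / n_rolls for total in range(2, 13)}

observed = simulate_two_dice(10_000)
```

Comparing observed with theoretical for each total from 2 to 12 shows the mound shape emerging. A weighted die would show up as a persistent gap between the two.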

We can calculate the expected value for total pips by summing the products of each value with its probability. Thus 2•1/36 + 3•2/36 + ... + 12•1/36 = 252/36 = 7.00. The value we obtain is the expected value. In this case, it is also the mode.
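
As a sketch of this calculation, the sum of value times probability can be written as a small Python function. The name expected_value and the shortcut 6 - |t - 7| for the number of ways to roll a total t are our own; exact fractions avoid any rounding.

```python
from fractions import Fraction

def expected_value(dist):
    """Sum of value * probability over a {value: probability} mapping."""
    return sum(x * p for x, p in dist.items())

# Sum of two fair dice: a total t can be rolled in 6 - |t - 7| ways out of 36
two_dice = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}

ev_two_dice = expected_value(two_dice)      # 252/36, which reduces to 7
mode = max(two_dice, key=two_dice.get)      # the most probable total
```

The same function applies to any discrete distribution given as a mapping from values to probabilities.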

Example: Find the expected value given the two coin distribution discussed above.
Solution: x takes on the values 0, 1, or 2 with probabilities ¼, ½, and ¼. E = 0•¼ + 1•½ + 2•¼ = 0 + ½ + ½ = 1.00. Thus we expect one head when throwing two coins.

Example: Find the expected value given the one die distribution discussed above.
Solution: x takes on the values one through six with equal probability of 1/6. (1+2+3+4+5+6)•1/6=21/6=3.5. Thus we expect 3.5 pips when throwing a fair, six-sided die. Obviously, since pips are discrete, we can't expect 3.5 pips on any one roll!
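
Although no single roll shows 3.5 pips, the average over many rolls tends toward 3.5. The simulation below is a sketch of that long-run reading of expected value, assuming a fair die and a seeded pseudo-random generator; the function name and sample sizes are ours.

```python
import random

def average_pips(n_rolls, seed=0):
    """Mean number of pips over n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# Averages for increasingly large samples
averages = {n: average_pips(n) for n in (10, 1_000, 100_000)}
```

The average for the small sample can land well away from 3.5, while the largest sample should sit very close to it.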
