
Statistics: Probabilities and Distributions - Lesson 9

The Binomial Distribution and Experiments

Lesson Overview

Probability distributions may be either discrete or continuous. The normal (Gaussian) and Lorentzian distributions are good examples of continuous distributions—the random variable can take on any value. Examples of discrete distributions include the Binomial, the Hypergeometric, and the Poisson. We will concentrate on the Binomial and its cousin the Hypergeometric today, and defer discussion on its distant relative, the Poisson, until later.

What Makes a Binomial Experiment?

The requirements for a binomial experiment are as follows:
  1. There must be a fixed number of trials.
  2. Trials must be independent. One trial's outcome cannot affect the probabilities of other trials.
  3. All outcomes of trials must be in one of two categories.
  4. Probabilities must remain constant for each trial.

The prefix bi- has the usual meaning of two in this context, just like bicycle, bifocal, and bigamist. This distribution is related to what happens when you study the expansion of the binomial $(1+x)^n$. Here it means there are two and only two distinct categories. For instance, students either pass or they fail a test. In dining out at fast food restaurants, people either have or haven't eaten at McDonald's. Requirement 2 specifically implies sampling with replacement if we are selecting something, unless the effect of not replacing it is slight.

Some notation has become very standard when working with binomial distributions. S (success) and F (failure) denote possible categories for all outcomes; whereas, p and q=1-p denote the probabilities P(S) and P(F), respectively. The term success may not necessarily be what you would call a desirable result. For example, you may want to find the probability of finding a defective chip, given the probability 0.2 that a chip is defective. Here the term success might actually represent the process of selecting a defective chip. The important thing here is to correlate P(S) with p. Some authors avoid q, but the formulae seem clearer using it rather than the awkward expression 1-p.

  • P(S) = p.
  • P(F) = q = 1-p.
  • n indicates the fixed number of trials.
  • x indicates the number of successes (any whole number from 0 to n, inclusive).
  • p indicates the probability of success for any one trial.
  • q indicates the probability of failure (not success) for any one trial.
  • P(x) indicates the probability of getting exactly x successes in n trials.

The Binomial Formula.

The formula for calculating P(x) is as follows:
$P(x) = {}_nC_x \, p^x \, q^{n-x}$ where x = 0, 1, 2, ..., n

Here ${}_nC_x$ has the usual definition as an entry from Pascal's Triangle and can be defined as ${}_nC_x = \frac{n!}{x!\,(n-x)!}$. The symbol !, the factorial symbol, has already been introduced as shorthand for the product of all the natural numbers up to that number. Thus, 4! = 4 · 3 · 2 · 1 = 24. By definition and convention, 0! = 1. Note that if p=q=½, the distribution will be symmetric due to the symmetry in Pascal's Triangle.
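The formula translates almost directly into code. The following is a minimal Python sketch, not part of the original lesson; the function name binomial_pmf is an illustrative choice, and math.comb (Python 3.8 or later) supplies the Pascal's Triangle entry.

```python
from math import comb, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials.

    binomial_pmf is a hypothetical helper name chosen for this sketch.
    """
    q = 1.0 - p                          # probability of failure
    return comb(n, x) * p**x * q**(n - x)

print(factorial(4))                      # 24
print(comb(4, 2))                        # 6, an entry from Pascal's Triangle
print(binomial_pmf(2, 4, 0.5))           # 0.375

# With p = q = 1/2 the probabilities are symmetric about n/2.
print([binomial_pmf(x, 4, 0.5) for x in range(5)])
# [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

The symmetric list in the last line mirrors row 4 of Pascal's Triangle (1, 4, 6, 4, 1) divided by 16.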

Example: Find the probability of having five left-handed students in a class of twenty-five, given p=0.1 (n = 25, x = 5, p = 0.1).
Solution: $P(5) = \frac{25!}{20!\,5!} \cdot (0.1)^5 \cdot (0.9)^{20} \approx 0.06459$.

Thus, the probability that 5 of the 25 students will be left-handed is about 6%. You should all already have the program BINOMIAL on your calculator and be able to recognize and calculate such probabilities. As usual, it is important to set up your solution logically. Carefully identify the important values (n, x, p, etc.) before cranking out the numbers and presenting your answer. The TI-83/84 series calculators also have BINOMPDF which, if given the two arguments n and p, in that order, will output a list of n+1 probabilities, one for each value of x, with the first one being for x=0. BINOMCDF is similar but gives cumulative probabilities. Both are under the 2nd VARS or DISTR button (entries 0 and A, so you may need to scroll down).
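For readers without the calculator at hand, the behavior described for BINOMPDF and BINOMCDF can be imitated in a few lines of Python. This is only a sketch; the names binom_pdf_list and binom_cdf_list are made up for illustration and are not calculator or library functions.

```python
from itertools import accumulate
from math import comb

def binom_pdf_list(n: int, p: float) -> list[float]:
    """List of n+1 probabilities P(0), P(1), ..., P(n) (hypothetical helper)."""
    q = 1.0 - p
    return [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

def binom_cdf_list(n: int, p: float) -> list[float]:
    """Running totals of the list above: P(X <= 0), P(X <= 1), ..., P(X <= n)."""
    return list(accumulate(binom_pdf_list(n, p)))

# The left-handed example: n = 25, p = 0.1.
pdf = binom_pdf_list(25, 0.1)
cdf = binom_cdf_list(25, 0.1)

print(len(pdf))              # 26, one entry for each x from 0 to 25
print(round(pdf[5], 4))      # 0.0646, exactly five left-handers
print(round(cdf[5], 4))      # probability of five or fewer left-handers
```

Index 0 of each list corresponds to x=0, matching the ordering described for the calculator output.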

More Formulas for the Binomial Distribution.

It can be shown that the mean, variance, and standard deviation of a binomial distribution can be expressed in simple formulae as follows:
  • mean: $\mu = np$
  • variance: $\sigma^2 = npq$
  • std. dev.: $\sigma = \sqrt{npq}$

Example: 20 coins are flipped and each coin has a probability of 50% of coming up heads. Find the mean and standard deviation for this binomial experiment.
Solution: n=20, p=½, so q=½. $\mu = n \cdot p = 20 \cdot ½ = 10$. This is as expected, since heads should come up about half the time. $\sigma = \sqrt{n \cdot p \cdot q} = \sqrt{20 \cdot ½ \cdot ½} = \sqrt{5} \approx 2.236$.

Example: Again assume 20 coins are flipped and each coin has a probability of 50% of coming up heads. This time calculate the probability of getting exactly 10 heads.
Solution: n=20, p=½, so q=½, and x=10. $P(10) = {}_{20}C_{10} / 2^{20} = 184756/1048576 \approx 0.1762$.
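The two coin examples can be checked numerically. The sketch below (an illustration, not part of the original notes) computes the mean and variance directly from their definitions as weighted sums over the distribution, so the shortcut formulas np and npq can be compared against them.

```python
from math import comb, sqrt

n, p = 20, 0.5
q = 1.0 - p

# Full distribution: P(0), P(1), ..., P(20).
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Mean and variance from their definitions (sums weighted by P(x)).
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(round(mean, 4), n * p)                  # 10.0 10.0
print(round(variance, 4), n * p * q)          # 5.0 5.0
print(round(sqrt(variance), 4))               # 2.2361
print(comb(20, 10), 2**20)                    # 184756 1048576
print(round(pmf[10], 4))                      # 0.1762
```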

The Hypergeometric Distribution.

Often sampling will be done without replacement from a small finite population. A classic example might be a lottery where 6 different numbers from 54 are selected. Because of the lack of replacement we no longer have independence, thus our probabilities are not constant for each trial. However, the other conditions of the binomial are met. This is a classic application of the hypergeometric distribution.

If a population has A objects of one type and B objects of the other type, and if n objects are sampled without replacement, then the probability of getting x objects of type A and n-x objects of type B is:
$P(x) = \dfrac{{}_AC_x \cdot {}_BC_{n-x}}{{}_{A+B}C_n}$

We already encountered this formula when we found the probability for left-handers and soup cans in lesson 3!
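As a rough illustration of the hypergeometric formula, here is a short Python sketch. The function name hypergeometric_pmf and the urn of 4 red and 6 blue marbles are hypothetical, chosen only to keep the arithmetic easy to check by hand.

```python
from math import comb

def hypergeometric_pmf(x: int, A: int, B: int, n: int) -> float:
    """P(x objects of type A) when n objects are drawn without replacement
    from A objects of one type and B of the other (hypothetical helper)."""
    return comb(A, x) * comb(B, n - x) / comb(A + B, n)

# Hypothetical urn: 4 red (type A), 6 blue (type B); draw 3 without replacement.
# P(exactly 2 red) = (6 * 6) / 120
print(hypergeometric_pmf(2, 4, 6, 3))          # 0.3

# The probabilities over all possible x add up to 1.
total = sum(hypergeometric_pmf(x, 4, 6, 3) for x in range(4))
print(round(total, 10))                        # 1.0
```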

Example: A typical state lottery allows a person to select 6 different numbers from 1 to 54 inclusive. Later, a 6-number combination is selected as winning. Various similar results are also awarded prizes. To get the probability of matching all 6 winning numbers, set A=6; B=48; n=6; and x=6. To find the probability of matching exactly 5 winning numbers, leave A and B unchanged and set x=5. The probability of not matching any numbers would be similar with x=0.
Solution: left as homework.

If the population is large compared to the sample size (maybe more than 10 times, that is to say, the sample is less than 10% of the population), the hypergeometric is usually approximated by the binomial and approximated well.
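The quality of this approximation is easy to inspect numerically. The sketch below uses made-up numbers (a population of 1000 with 100 objects of type A and a sample of 20, so the sample is 2% of the population) and reports the largest gap between the hypergeometric and binomial probabilities; it is an illustration under those assumptions, not a general proof.

```python
from math import comb

# Hypothetical population: A = 100 of one type, B = 900 of the other.
A, B, n = 100, 900, 20
p = A / (A + B)          # matching binomial success probability, 0.1
q = 1.0 - p

hyper = [comb(A, x) * comb(B, n - x) / comb(A + B, n) for x in range(n + 1)]
binom = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Largest absolute difference between the two distributions.
max_gap = max(abs(h - b) for h, b in zip(hyper, binom))
print(max_gap)

# Side-by-side look at the first few values of x.
for x in range(5):
    print(x, round(hyper[x], 4), round(binom[x], 4))
```

Shrinking the population toward the sample size (say A = 10, B = 90) makes the gap visibly larger, which is the situation where the hypergeometric should be used directly.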

Normal Approximation for the Binomial Distribution.

For n > 69, one quickly finds that 70! exceeds $10^{99}$, the limit on many calculators. On most mainframe computers of the 1960s and 1970s the limit was $16^{63}$, which 57! already exceeds. Thus alternatives were often used when calculating probabilities for such large values of n. In the precomputer age, large tables were constructed to look up probabilities.
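Python integers have no size limit, so the factorial thresholds can be verified directly. The second half of the sketch shows one possible workaround for limited-range arithmetic, working with logarithms of factorials via math.lgamma; this is an assumption of the sketch, not a method from these notes, and binomial_pmf_log is a made-up name. It requires 0 < p < 1.

```python
from math import exp, factorial, lgamma, log

# 70! is the first factorial past 10^99; 57! is the first past 16^63.
print(factorial(69) < 10**99 < factorial(70))    # True
print(factorial(56) < 16**63 < factorial(57))    # True

def binomial_pmf_log(x: int, n: int, p: float) -> float:
    """Binomial probability computed through logarithms (hypothetical helper).

    lgamma(k + 1) equals ln(k!), so no huge intermediate factorial is formed.
    Assumes 0 < p < 1.
    """
    log_coeff = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_coeff + x * log(p) + (n - x) * log(1.0 - p))

print(round(binomial_pmf_log(10, 20, 0.5), 4))   # 0.1762
print(binomial_pmf_log(500, 1000, 0.5))          # n = 1000 poses no problem
```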

It is instructive to examine the binomial distribution for large n and note how it compares with the normal distribution, especially when p=q=½. As n increases, the probability distribution looks approximately normal even for values of p and q further away from ½. The common rule is that you can approximate the binomial with the normal when np and nq both exceed some magic number. That magic number is variously stated as 5, 10, or 15: the higher the magic number, the more conservative the statistician. For these notes we will adopt the value 10.

Approximate a binomial distribution by the normal when both np > 10 and nq > 10.

Since the normal distribution is continuous and the binomial distribution is discrete, we often must apply a continuity correction. That is to say, x is no longer represented by a single value, but takes on a range of values from x-0.5 to x+0.5.

Example: This first example is on the edge of our magic number. Calculate the probability of getting 10 heads when 20 fair coins are flipped, but using the normal approximation to the binomial.
Solution: n=20, x=10, p=q=½ and, as noted above, np=nq=20×½=10. Using the continuity correction (x-0.5 to x+0.5) and the values for the mean (10) and standard deviation (2.236) calculated above, we find z-scores of -0.2236 and +0.2236. Using either normalcdf(-0.2236,0.2236) under DISTR on your TI-83+ graphing calculator or a table of values, we obtain an answer of 0.1769 or 0.1742, respectively, which compare favorably with the 0.1762 we obtained before. (normalcdf gives the cumulative area under the normal distribution function between the two z-values given.)
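The same calculation can be sketched in Python. The helper normal_cdf below is a hypothetical name; it builds the standard normal cumulative area from math.erf, which plays the role of the calculator function or the printed table.

```python
from math import comb, erf, sqrt

def normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z (hypothetical helper)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p, x = 20, 0.5, 10
q = 1.0 - p
mu = n * p                      # 10
sigma = sqrt(n * p * q)         # about 2.236

# Continuity correction: x = 10 becomes the interval 9.5 to 10.5.
z_low = (x - 0.5 - mu) / sigma
z_high = (x + 0.5 - mu) / sigma

approx = normal_cdf(z_high) - normal_cdf(z_low)
exact = comb(n, x) * p**x * q**(n - x)

print(round(z_high, 4))         # 0.2236
print(round(approx, 4))         # 0.1769
print(round(exact, 4))          # 0.1762
```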

Example: Based on U.S. Census data, 12% of U.S. men have earned bachelor's degrees. If 150 U.S. men are randomly selected, find the probability that at least 25 of them have a bachelor's degree.
Solution: n=150; p=0.12; x > 24.5. Thus the mean is np = 150 · 0.12 = 18, and the standard deviation is $\sqrt{150 \cdot 0.12 \cdot 0.88} \approx 3.98$. We quickly note that nq is bigger than np, since q is bigger than p, and that both are larger than 10. We can thus calculate a z-score of (24.5 - 18) ÷ 3.98 = 1.63. It is because of the continuity correction that 24.5 is used. We can thus calculate the area under the normal curve by normalcdf(1.63,9E99) as 0.05. It can be accessed via DISTR (2nd VARS) on the TI-83+ calculator.
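As a cross-check on this example, the sketch below computes the normal-approximation tail area and, for comparison, the exact binomial tail obtained by summing P(x) for x = 25 through 150. The exact sum is an addition of this sketch rather than part of the worked solution; variable names are illustrative.

```python
from math import comb, erf, sqrt

n, p = 150, 0.12
q = 1.0 - p
mu = n * p                          # 18
sigma = sqrt(n * p * q)             # about 3.98

# "At least 25" with the continuity correction starts at 24.5.
z = (24.5 - mu) / sigma
normal_tail = 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Exact binomial tail: P(25) + P(26) + ... + P(150).
exact_tail = sum(comb(n, x) * p**x * q**(n - x) for x in range(25, n + 1))

print(round(sigma, 2))              # 3.98
print(round(z, 2))                  # 1.63
print(round(normal_tail, 2))        # 0.05
print(round(exact_tail, 4))         # exact value, for comparison
```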

A JAVA applet to run further examples (and read someone else's notes) can be found here.
