Statistics: Probabilities and Distributions - Lesson 1

Fundamental Definitions for Probability

Lesson Overview

[Random] Experiment

An experiment is a method by which observations are made.

A famous example of an experiment is the one in which Benjamin Franklin, the American statesman and scientist, set out to show that lightning is electrical. The experiment involved flying a kite in a thunder (and lightning) storm with a wire from the kite to a key in a bottle. (Don't try this at home!) (Also, questions have arisen as to whether or not he actually performed this experiment. It seems others did it earlier, only his son may have been present, and his journals do not clearly support that the event occurred.) The experimental method is now the basis of the scientific method. In statistics we often refer to a random experiment, one for which there is no way of telling beforehand what the outcome will be.

Rolling a fair die, flipping an honest coin, and randomly selecting a card from a deck
are all considered random experiments.

An interesting part of mathematics is the use of common language to describe mathematical concepts. One such example is the word event. Normally, event conjures up images of special moments: the prom, banquets, fairs, weddings, births, .... In dealing with probability, event has a very precise meaning.

An event is a set of outcomes from a random experiment.
A simple event is an outcome which cannot be broken down.
The sample space is the set of all possible outcomes for a given experiment.

Two flips of a coin (columns: first flip; rows: second flip)

 \     T     H
 T    TT    HT
 H    TH    HH

As indicated above, flipping an honest coin is a random experiment: one has no way of predicting the outcome beforehand. The sample space is the set which contains all possible outcomes. For one flip the possible outcomes are heads (H) or tails (T), so the sample space contains only these two outcomes. For two flips the four possible outcomes are HH, HT, TH, and TT. Thus the sample space is {HH, HT, TH, TT}, containing four elements. Notice the difference between the outcomes HT (heads first) and TH (tails first). The outcome of a single flip is a simple event, whereas the outcome from more than one flip is a compound event.
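
The sample spaces described above can also be listed mechanically. The following is a minimal Python sketch, not part of the original lesson; the names `FLIP_OUTCOMES` and `coin_sample_space` are made up for illustration.

```python
from itertools import product

# Each flip has two possible outcomes: heads (H) or tails (T).
FLIP_OUTCOMES = ("H", "T")

def coin_sample_space(num_flips):
    """Return every ordered sequence of H/T for the given number of flips."""
    return ["".join(seq) for seq in product(FLIP_OUTCOMES, repeat=num_flips)]

one_flip = coin_sample_space(1)
two_flips = coin_sample_space(2)

print(one_flip)        # ['H', 'T']
print(two_flips)       # ['HH', 'HT', 'TH', 'TT']
print(len(two_flips))  # 4
```

Because the sequences are ordered, HT and TH appear as separate entries, which is why two flips give four outcomes rather than three.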

Rolling a standard six-sided (fair) die once would have a sample space with six outcomes: 1, 2, 3, 4, 5, and 6. Rolling a pair of dice would have a sample space of six times six (6^2), or 36, possible outcomes. Let's construct the sample space of rolling a pair of dice in the table below. In each grid location (square) we must place both the indicated outcome of the green die AND the indicated outcome of the red die.

Two dice (columns: green die; rows: red die)

 \      1       2       3       4       5       6
 1
 2
 3                            (4,3)
 4                    (3,4)
 5
 6

Notice that green=3 and red=4 differs from green=4 and red=3. These are like ordered pairs, with the first coordinate the horizontal component (green die) and the second coordinate the vertical component (red die). [Note: this convention is in conflict with the convention of (row,column). Please be sure to fill in the remaining entries consistently with those already in the table.] Your homework will make further use of the outcome of this activity by tying it in with the definition of probability below. You will calculate various probabilities regarding the sum of pips (dots) on the two dice.
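
As a way to check a completed table, the 36 ordered pairs can be generated in code. This is a minimal Python sketch under the lesson's (green, red) convention; `FACES` and `dice_sample_space` are illustrative names, not from the original.

```python
from itertools import product

FACES = range(1, 7)  # faces of a standard six-sided die

# Ordered pairs (green, red): green is the column, red is the row.
dice_sample_space = [(green, red) for green, red in product(FACES, repeat=2)]

print(len(dice_sample_space))  # 36
print((3, 4) == (4, 3))        # False: the order of the pair matters

# Print the grid row by row: each row fixes the red die,
# each column fixes the green die.
for red in FACES:
    print(" ".join(f"({green},{red})" for green in FACES))
```

In the printed grid, the entry in row 3, column 4 is (4,3) and the entry in row 4, column 3 is (3,4), matching the two entries already given in the table.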

For some interactive web sites involving rolling dice and flipping or spinning coins, check out these links. Be forewarned, however, that if cards or a roulette wheel are involved, your internet search is likely to lead you to gambling sites (casinos). The legality of such sites on the web has been and is being challenged because gambling is addictive and has ruined many lives.

Probability

Probability is denoted by P and specific events by A, B, or C.
The shorthand notation used to indicate the probability that event B occurs is P(B).
 
Empirical (Experimental) Definition of Probability:
P(A) = number of times A occurred divided by the number of times the experiment was repeated.

Classical Definition of Probability:
P(A) = number of outcomes in event A divided by the size of the sample space (assuming all outcomes are equally likely).

The probability of something occurring is related to its frequency. Specifically, when a coin is flipped twice in succession, heads appears both times in 1 of the 4 possible outcomes. Thus the probability is ¼ or 0.25. It is important to remember that the probability of A occurring is less than or equal to one. We have tacitly assumed here that the probability of heads is equal to that of tails. Experiments have been conducted to test this. In such a case, the probability would then be an experimental rather than a theoretical result.
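
The two-heads calculation above follows directly from the classical definition. Here is a minimal Python sketch of it, assuming equally likely outcomes; the function name `classical_probability` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """P(A) = (outcomes in A) / (size of sample space).

    Assumes every outcome in the sample space is equally likely.
    """
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

# Sample space for two flips: HH, HT, TH, TT.
sample_space = ["".join(seq) for seq in product("HT", repeat=2)]

both_heads = {"HH"}
p = classical_probability(both_heads, sample_space)
print(p, float(p))  # 1/4 0.25
```

Using `Fraction` keeps the result exact, in line with the lesson's preference for fractions when a probability can be computed exactly.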

An event with a probability of 0 is impossible.
An event with a probability of 1 is certain.
0 ≤ P(A) ≤ 1 for any event A.

Probabilities for random events might be computed exactly. In such cases we express them as fractions. Other probabilities are obtained by experiment and are thus approximations, which are typically expressed to three significant digits unless there are compelling reasons for more or less precision. Probabilities are often given as percentages. In such a case, certainty corresponds with 100% and impossibility with 0%. Be sure to include the percentage (%) symbol.
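
The reporting conventions above can be illustrated with a short Python sketch. The counts used for the experimental value (4973 heads in 10000 flips) are made-up numbers for illustration only, not data from the lesson.

```python
from fractions import Fraction

# An exact (classical) probability is reported as a fraction.
exact = Fraction(1, 6)
print(exact)  # 1/6

# A hypothetical experimental result: made-up counts, for illustration.
heads_count = 4973
trials = 10000
approx = heads_count / trials

# Three significant digits, as a decimal and as a percentage.
print(format(approx, ".3g"))  # 0.497
print(f"{approx:.1%}")        # 49.7%
```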

Probability can be approximated by frequency:
P(A) ≈ number of times A occurred divided by the number of times the experiment is repeated.

We used the term fair above to describe coins or dice yielding an equal likelihood for any outcome. Thus a fair coin has a 50% chance of turning up heads and a 50% chance of turning up tails. This is often expressed as 50-50. Each of the two outcomes is equally likely and thus has a probability of ½. On rare occasions a coin might end up on its side, but generally we exclude such events from the set of outcomes we are considering. Similarly, we would expect a fair six-sided die to have a 1/6 probability for any face to be on top. Again, the rare chance of balancing on an edge or corner will generally be excluded, as will outcomes where the result cannot be determined (such as the die falling into a black hole or sewer grate).
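
The empirical definition can be tried out by simulating a fair die. This is a minimal sketch that assumes Python's pseudo-random generator is an acceptable stand-in for a physical die; `empirical_probability` and the seed value are arbitrary choices for illustration.

```python
import random

def empirical_probability(event, trials, rng):
    """Estimate P(event) as (times event occurred) / (number of trials)."""
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) in event)
    return hits / trials

rng = random.Random(2024)  # fixed seed so the run is repeatable
estimate = empirical_probability({6}, 6000, rng)

print(f"empirical estimate of P(6): {estimate:.3f}")
print(f"classical value 1/6:        {1 / 6:.3f}")
```

The estimate will usually be close to, but not exactly equal to, 1/6; a different seed gives a slightly different estimate.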

The Law of Large Numbers

If an experiment is repeated over and over, then the empirical probability approaches the actual probability.

The above statement is often presented as a theorem known as the Law of Large Numbers. Determining sample size is an exercise in optimizing tradeoffs between cost and accuracy. Large samples should be more accurate but will be more costly, whereas smaller samples cost less but provide less accuracy. Those who have not studied statistics tend to scoff at the idea that a survey of only 1000 (0.001%) people in this country of 100 million voters can give a good estimate of how many favor a particular candidate or position. Of course, if your sample is not random, biases will creep in, and accuracy will suffer. Later lessons will explore these concepts in greater detail.
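
The Law of Large Numbers can be observed in a simulation by flipping a simulated fair coin more and more times. This is a rough sketch, assuming a pseudo-random generator in place of a real coin; `heads_frequency` and the seed are illustrative choices.

```python
import random

def heads_frequency(num_flips, rng):
    """Flip a simulated fair coin num_flips times; return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

rng = random.Random(1)  # fixed seed so the run is repeatable
for n in (10, 100, 1_000, 10_000, 100_000):
    freq = heads_frequency(n, rng)
    print(f"{n:>7} flips: frequency = {freq:.4f}, error = {abs(freq - 0.5):.4f}")
```

The error does not have to shrink at every step, but it tends to get smaller as the number of flips grows.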

Random Sample

In a random sample each element of the population has an equal chance of being chosen.
The term random sample is also used to denote a collection of outcomes that were selected through a representative process. Random samples and the concept of random selection are very important to inferential statistics. Impartial and unbiased sampling often requires careful and thoughtful planning. Such planning is extremely important: bad sample design has delayed many a degree completion.
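
Drawing a random sample in which each element has an equal chance of being chosen can be sketched in Python as follows. The population of 100 numbered members and the sample size of 10 are hypothetical values chosen for illustration.

```python
import random

# Hypothetical population: 100 members labeled 1 through 100.
population = list(range(1, 101))

rng = random.Random(7)  # fixed seed so the run is repeatable

# random.sample draws without replacement, giving every member
# the same chance of ending up in the sample.
sample = rng.sample(population, k=10)

print(sorted(sample))
```

In practice the hard part is usually not the draw itself but obtaining a complete list of the population to draw from, which is where the careful planning mentioned above comes in.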
