Basic Probability Rules
- Rules of Probability
- Probability Rule One (For any event A, 0 ≤ P(A) ≤ 1)
- Probability Rule Two (The sum of the probabilities of all possible outcomes is 1)
- Probability Rule Three (The Complement Rule)
- Probabilities Involving Multiple Events
- Probability Rule Four (Addition Rule for Disjoint Events)
- Finding P(A and B) using Logic
- Probability Rule Five (The General Addition Rule)
- Rounding Rule of Thumb for Probability
- Let’s Summarize
In the previous section, we introduced probability as a way to quantify the uncertainty that arises from conducting experiments using a random sample from the population of interest.
We saw that the probability of an event (for example, the event that a randomly chosen person has blood type O) can be estimated by the relative frequency with which the event occurs in a long series of trials. So we would collect data from lots of individuals to estimate the probability of someone having blood type O.
In this section, we will establish the basic methods and principles for finding probabilities of events.
We will also cover some of the basic rules of probability which can be used to calculate probabilities.
We will begin with a classical probability example of tossing a fair coin three times.
Since heads and tails are equally likely on each toss in this scenario, each of the possible outcomes of three tosses is also equally likely. We can therefore list all of the possible outcomes and use this list to calculate probabilities.
Since our focus in this course is on data and statistics (not theoretical probability), in most of our future problems we will use a summarized dataset, usually a frequency table or two-way table, to calculate probabilities.
- Note that in event C, “Getting at least one head” there is only one possible outcome which is missing, “Getting NO heads” = TTT. We will address this again when we talk about probability rules, in particular the complement rule. At this point, we just want you to think about how these two events are “opposites” in this scenario.
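The listing approach for the three coin tosses can be sketched in a few lines of Python. This is only an illustration under the equally-likely assumption stated above; the names `sample_space` and `probability` are our own and not part of the course materials.

```python
from fractions import Fraction
from itertools import product

# All 2 * 2 * 2 = 8 equally likely outcomes of three fair-coin tosses.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

def probability(event):
    """P(event) when every outcome in sample_space is equally likely."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

# Event C from the text: getting at least one head.
p_at_least_one_head = probability(lambda outcome: "H" in outcome)
# Its "opposite": getting no heads, which is the single outcome TTT.
p_no_heads = probability(lambda outcome: "H" not in outcome)

print(sample_space)
print(p_at_least_one_head, p_no_heads)  # 7/8 1/8
```

The two probabilities add up to 1, which is exactly the "opposites" relationship noted above.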
It is VERY important to realize that just because we can list out the possible outcomes, this does not imply that each outcome is equally likely.
This is the (funny) message in the Daily Show clip we provided on the previous page. But let’s think about this again. In that clip, Walter is claiming that since there are two possible outcomes, the probability is 0.5. The two possible outcomes are
- The world will be destroyed due to use of the large hadron collider
- The world will NOT be destroyed due to use of the large hadron collider
Hopefully it is clear that these two outcomes are not equally likely!!
Let’s consider a more common example.
Now we move on to learning some of the basic rules of probability.
Fortunately, these rules are very intuitive, and as long as they are applied systematically, they will let us solve more complicated problems; in particular, those problems for which our intuition might be inadequate.
Since most of the probabilities you will be asked to find can be calculated using both
- logic and counting
- the rules we will be learning,
we give the following advice as a principle.
Our first rule simply reminds us of the basic property of probability that we’ve already learned.
The probability of an event, which informs us of the likelihood of it occurring, can range anywhere from 0 (indicating that the event will never occur) to 1 (indicating that the event is certain).
NOTE: One practical use of this rule is that it can be used to identify any probability calculation that comes out to be more than 1 (or less than 0) as incorrect.
Before moving on to the other rules, let’s first look at an example that will provide a context for illustrating the next several rules.
This example illustrates our second rule, which tells us that the probability of all possible outcomes together must be 1.
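Rules one and two can also be checked mechanically. The sketch below uses blood-type probabilities that are hypothetical values chosen only for illustration, and `is_valid_distribution` is a helper name we made up for this purpose.

```python
import math

# Hypothetical probabilities, chosen only for illustration.
blood_type_probs = {"O": 0.44, "A": 0.42, "B": 0.10, "AB": 0.04}

def is_valid_distribution(probs):
    """Check Rule 1 (each value in [0, 1]) and Rule 2 (values sum to 1)."""
    each_in_range = all(0 <= p <= 1 for p in probs.values())
    # isclose guards against tiny floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(probs.values()), 1.0)
    return each_in_range and sums_to_one

print(is_valid_distribution(blood_type_probs))
```

A check like this is a quick way to catch an arithmetic slip before using the probabilities in further calculations.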
This is a good place to compare and contrast what we’re doing here with what we learned in the Exploratory Data Analysis (EDA) section.
- Notice that in this problem we are essentially focusing on a single categorical variable: blood type.
- We summarized this variable above, as we summarized single categorical variables in the EDA section, by listing what values the variable takes and how often it takes them.
- In EDA we used percentages, and here we’re using probabilities, but the two convey the same information.
- In the EDA section, we learned that a pie chart provides an appropriate display when a single categorical variable is involved, and similarly we can use it here (using percentages instead of probabilities):
Probability Rule Three
In probability and in its applications, we are frequently interested in finding out the probability that a certain event will not occur.
- Such a visual display is called a “Venn diagram.” A Venn diagram is a simple way to visualize events and the relationships between them using rectangles and circles.
Rule 3 deals with the relationship between the probability of an event and the probability of its complement event.
Given that event A and event “not A” together make up all possible outcomes, and since rule 2 tells us that the sum of the probabilities of all possible outcomes is 1, the following rule should be quite intuitive:
- Note that the Complement Rule, P(not A) = 1 – P(A), can be re-formulated as P(A) = 1 – P(not A).
- This seemingly trivial algebraic manipulation has an important application, and actually captures the strength of the complement rule.
- In some cases, when finding P(A) directly is very complicated, it might be much easier to find P(not A) and then just subtract it from 1 to get the desired P(A).
- We will come back to this comment soon and provide additional examples.
- The complement rule can be useful whenever it is easier to calculate the probability of the complement of the event rather than the event itself.
- Notice, we again used the phrase “at least one.”
- Now we have seen that the complement of “at least one …” is “none …” or “no …” (as we mentioned previously in terms of the events being “opposites”).
- In the above activity we see that
- P(NONE of these two side effects) = 1 – P(at least one of these two side effects)
- This is a common application of the complement rule which you can often recognize by the phrase “at least one” in the problem.
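To see why the complement is often the easier route for “at least one” questions, here is a small sketch. The setup (four rolls of a fair six-sided die) is our own assumption and does not come from the activity above; it was picked because all outcomes are equally likely and can be counted.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: four rolls of a fair six-sided die, 6**4 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=4))

# Finding "at least one six" directly means handling one, two, three, or four sixes.
# The complement, "no six at all", is a single condition to check.
no_six = [rolls for rolls in outcomes if 6 not in rolls]
p_no_six = Fraction(len(no_six), len(outcomes))

# Complement Rule: P(at least one six) = 1 - P(no six).
p_at_least_one_six = 1 - p_no_six
print(p_no_six, p_at_least_one_six)  # 625/1296 671/1296
```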
We will often be interested in finding probabilities involving multiple events such as
- P(A or B) = P(event A occurs or event B occurs or both occur)
- P(A and B) = P(both event A occurs and event B occurs)
A common issue with terminology relates to how we usually think of “or” in our daily life. For example, when a parent says to his or her child in a toy store “Do you want toy A or toy B?”, this means that the child is going to get only one toy and he or she has to choose between them. Getting both toys is usually not an option.
Having said that, it should be noted that there are some cases where it is simply impossible for the two events to both occur at the same time.
The distinction between events that can happen together and those that cannot is an important one.
Here are two examples:
On the other hand …
The Venn diagrams suggest that another way to think about disjoint versus not disjoint events is that disjoint events do not overlap. They do not share any of the possible outcomes, and therefore cannot happen together.
On the other hand, events that are not disjoint are overlapping in the sense that they share some of the possible outcomes and therefore can occur at the same time.
We now begin with a simple rule for finding P(A or B) for disjoint events.
- When dealing with probabilities, the word “or” will always be associated with the operation of addition; hence the name of this rule, “The Addition Rule.”
- The Addition Rule for Disjoint Events can naturally be extended to more than two disjoint events. Let’s take three, for example. If A, B and C are three disjoint events, then P(A or B or C) = P(A) + P(B) + P(C). The rule is the same for any number of disjoint events.
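The Addition Rule for Disjoint Events can be checked against direct counting. The sketch below assumes a single roll of a fair six-sided die and represents each event as a set of outcomes, much like the circles in a Venn diagram. The events and the helper `prob` are illustrative choices of ours.

```python
from fractions import Fraction

# Assumed setup: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) for equally likely outcomes; an event is a set of outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

a, b, c = {1, 2}, {5, 6}, {3}

# Disjoint events share no outcomes, so every pairwise intersection is empty.
pairwise_disjoint = not (a & b) and not (a & c) and not (b & c)

# Rule four (extended to three events) versus counting the union directly.
p_by_rule = prob(a) + prob(b) + prob(c)
p_by_counting = prob(a | b | c)
print(pairwise_disjoint, p_by_rule, p_by_counting)  # True 5/6 5/6
```

The two answers agree only because the events do not overlap; with overlapping events the simple sum would be too large.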
We are now finished with the first version of the Addition Rule (Rule four) which is the version restricted to disjoint events. Before covering the second version, we must first discuss P(A and B).
We now turn to calculating
- P(A and B) = P(both event A occurs and event B occurs)
Later, we will discuss the rules for calculating P(A and B).
First, we want to illustrate that a rule is not needed whenever you can determine the answer through logic and counting.
There is one special case for which we know what P(A and B) equals without applying any rule.
So, if events A and B are disjoint, then (by definition) P(A and B) = 0. But what if the events are not disjoint?
Recall that rule 4, the Addition Rule, has two versions. One is restricted to disjoint events, which we’ve already covered, and we’ll deal with the more general version later in this module. The same will be true of probabilities involving AND.
However, except in special cases, we will rely on LOGIC to find P(A and B) in this course.
Before covering any formal rules, let’s look at an example where the events are not disjoint.
We like to ask probability questions similar to the previous example (using a two-way table based upon data) as this allows you to make connections between these topics and helps you keep some of what you have learned about data fresh in your mind.
We are now ready to move on to the extended version of the Addition Rule.
In this section, we will learn how to find P(A or B) when A and B are not necessarily disjoint.
- We’ll call this extended version the “General Addition Rule” and state it as Probability Rule Five.
We will begin by stating the rule and providing an example similar to the types of problems we generally ask in this course. Then we will present another example where we do not have the raw data from a sample to work from.
As we witnessed in previous examples, when the two events are not disjoint, there is some overlap between the events.
- If we simply add the two probabilities together, we will get the wrong answer because we have counted some “probability” twice!
- Thus, we must subtract out this “extra” probability to arrive at the correct answer. The Venn diagram and the two-way tables are helpful in visualizing this idea.
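The double counting described above can be made concrete with a small two-way table. The counts below are hypothetical and chosen only for illustration; the sketch compares direct counting of “A or B” with the sum P(A) + P(B) after the overlap P(A and B) is subtracted.

```python
from fractions import Fraction

# Hypothetical counts, keyed by (A occurs?, B occurs?).
counts = {
    (True, True): 30,
    (True, False): 20,
    (False, True): 50,
    (False, False): 100,
}
total = sum(counts.values())

def prob(event):
    """Relative frequency of the cells for which event(a, b) is true."""
    return Fraction(sum(n for (a, b), n in counts.items() if event(a, b)), total)

p_a = prob(lambda a, b: a)
p_b = prob(lambda a, b: b)
p_a_and_b = prob(lambda a, b: a and b)

# Counting every cell where A or B (or both) occurs, each cell once.
p_or_by_counting = prob(lambda a, b: a or b)
# Adding P(A) and P(B) counts the (True, True) cell twice, so subtract it once.
p_or_by_rule = p_a + p_b - p_a_and_b
print(p_or_by_counting, p_or_by_rule)  # 1/2 1/2
```

Without the subtraction, P(A) + P(B) would be 13/20 here, which overstates the correct value of 1/2 by exactly P(A and B) = 3/20.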
This rule is more general since it works for any pair of events (even disjoint events). Our advice is still to try to answer the question using logic and counting whenever possible. Otherwise, we must be extremely careful to choose the correct rule for the problem.
Notice that, if A and B are disjoint, then P(A and B) = 0 and rule 5 reduces to rule 4 for this special case.
Let’s revisit the last example:
Let’s look at one final example to illustrate Probability Rule 5 when the rule is needed – i.e. when we don’t have actual data.
Follow these general guidelines in this course. If in doubt, carry more decimal places. If we specify the precision, give exactly what is requested.
Many computer packages might display extremely small values using scientific notation such as
- 1.58×10⁻⁵ or 1.58 E-5 to represent 0.0000158
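As a quick illustration of how such output can be read, the Python sketch below converts between the two notations. Python is used here only as an example; other packages may format the exponent slightly differently.

```python
# The E notation and the ordinary decimal form denote the same number.
value = float("1.58E-5")
print(value == 0.0000158)  # True

# Scientific notation with two digits after the decimal point.
print(f"{value:.2e}")      # 1.58e-05

# Fixed-point notation with seven decimal places.
print(f"{value:.7f}")      # 0.0000158
```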
So far in our study of probability, you have been introduced to the sometimes counter-intuitive nature of probability and the fundamentals that underlie probability, such as relative frequency.
We also gave you some tools to help you find the probabilities of events — namely the probability rules.
You probably noticed that the probability section was significantly different from the two previous sections; it has a much larger technical/mathematical component, so the results tend to be more of the “right or wrong” nature.
In the Exploratory Data Analysis section, for the most part, the computer took care of the technical aspect of things, and our tasks were to tell it to do the right thing and then interpret the results.
In probability, we do the work from beginning to end, from choosing the right tool (rule) to use, to using it correctly, to interpreting the results.
Here is a summary of the rules we have presented so far.
1. Probability Rule #1 states:
- For any event A, 0 ≤ P(A) ≤ 1
2. Probability Rule #2 states:
- The sum of the probabilities of all possible outcomes is 1
3. The Complement Rule (#3) states that
- P(not A) = 1 – P(A)
or when rearranged
- P(A) = 1 – P(not A)
The latter representation of the Complement Rule is especially useful when we need to find probabilities of events of the sort “at least one of …”
4. The General Addition Rule (#5) states that for any two events,
- P(A or B) = P(A) + P(B) – P(A and B),
where, by P(A or B) we mean P(A occurs or B occurs or both).
In the special case of disjoint events, events that cannot occur together, the General Addition Rule can be reduced to the Addition Rule for Disjoint Events (#4), which is
- P(A or B) = P(A) + P(B). *
*ONLY use when you are CONVINCED the events are disjoint (they do NOT overlap)
5. The restricted version of the addition rule (for disjoint events) can be easily extended to more than two events.
6. So far, we have only found P(A and B) using logic and counting in simple examples.