Review: Unit 1 Case C-C
In the last section, we established some of the basic rules of probability, which included:
In order to complete our set of rules, we still require two Multiplication Rules for finding P(A and B) and the important concepts of independent events and conditional probability.
We’ll first introduce the idea of independent events, then introduce the Multiplication Rule for independent events which gives a way to find P(A and B) in cases when the events A and B are independent.
Next we will define conditional probability and use it to formalize our definition of independent events, which is initially presented only in an intuitive way.
We will then develop the General Multiplication Rule, a rule that will tell us how to find P(A and B) in cases when the events A and B are not necessarily independent.
We’ll conclude with a discussion of probability applications in the health sciences.
We begin with a verbal definition of independent events (later we will use probability notation to define this more precisely).
Independent Events:
Here are a few examples:
A woman’s pocket contains two quarters and two nickels.
She randomly extracts one of the coins and, after looking at it, replaces it before picking a second coin.
Let Q1 be the event that the first coin is a quarter and Q2 be the event that the second coin is a quarter.
Are Q1 and Q2 independent events?
Since the first coin that was selected is replaced, whether or not Q1 occurred (i.e., whether the first coin was a quarter) has no effect on the probability that the second coin will be a quarter, P(Q2).
In either case (whether Q1 occurred or not), when she is selecting the second coin, she has in her pocket:
and therefore the P(Q2) = 2/4 = 1/2 regardless of whether Q1 occurred.
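We can verify this numerically with a short simulation. This is only a sketch; the coin labels, function name, and trial count below are ours for illustration, not part of the example:

```python
import random

def draw_two_with_replacement(trials=100_000, seed=0):
    """Estimate P(Q2) and P(Q2 given Q1) when the first coin is replaced."""
    rng = random.Random(seed)
    pocket = ["Q", "Q", "N", "N"]  # two quarters, two nickels
    q2_count, q1_count, q2_given_q1_count = 0, 0, 0
    for _ in range(trials):
        first = rng.choice(pocket)   # look at it, then replace it
        second = rng.choice(pocket)  # pocket is unchanged for the second pick
        if second == "Q":
            q2_count += 1
        if first == "Q":
            q1_count += 1
            if second == "Q":
                q2_given_q1_count += 1
    return q2_count / trials, q2_given_q1_count / q1_count

p_q2, p_q2_given_q1 = draw_two_with_replacement()
# Both estimates should be close to 1/2, consistent with independence.
```

Running this, the estimated probability of a quarter on the second pick is essentially the same whether or not we restrict attention to the trials where the first pick was a quarter, which is exactly what independence means.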
A woman’s pocket contains two quarters and two nickels.
She randomly extracts one of the coins, and without placing it back into her pocket, she picks a second coin.
As before, let Q1 be the event that the first coin is a quarter, and Q2 be the event that the second coin is a quarter.
Are Q1 and Q2 independent events?
Since the first coin that was selected is not replaced, whether Q1 occurred (i.e., whether the first coin was a quarter) does affect the probability that the second coin is a quarter, P(Q2).
If Q1 occurred (i.e., the first coin was a quarter), then when the woman is selecting the second coin, she has in her pocket:
However, if Q1 has not occurred (i.e., the first coin was not a quarter, but a nickel), then when the woman is selecting the second coin, she has in her pocket:
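We can make the without-replacement case concrete with a short exact calculation. This is a sketch that splits on whether Q1 occurred; it anticipates conditional-probability reasoning that we formalize later in this section:

```python
from fractions import Fraction

# Pocket: two quarters (Q), two nickels (N); the first coin is NOT replaced.
p_q1 = Fraction(2, 4)               # P(Q1): chance the first coin is a quarter
p_q2_given_q1 = Fraction(1, 3)      # one quarter left among three coins
p_q2_given_not_q1 = Fraction(2, 3)  # both quarters left among three coins

# Combining the two cases, weighted by how likely each case is:
p_q2 = p_q1 * p_q2_given_q1 + (1 - p_q1) * p_q2_given_not_q1  # = 1/2
```

Note that P(Q2) overall is still 1/2, but the probability of a quarter on the second pick *changes* depending on what the first pick was (1/3 versus 2/3), so Q1 and Q2 are not independent.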
In the last two examples, we could actually do a calculation to check whether or not the two events are independent.
Sometimes we can just use common sense to guide us as to whether two events are independent. Here is an example.
Two people are selected simultaneously and at random from all people in the United States.
Let B1 be the event that one of the people has blue eyes and B2 be the event that the other person has blue eyes.
In this case, since they were chosen at random, whether one of them has blue eyes has no effect on the likelihood that the other one has blue eyes, and therefore B1 and B2 are independent.
On the other hand …
A family has 4 children, two of whom are selected at random.
Let B1 be the event that one child has blue eyes, and B2 be the event that the other chosen child has blue eyes.
In this case, B1 and B2 are not independent, since we know that eye color is hereditary and the two children share parents.
Thus, knowing that one child is blue-eyed increases the chance that the other child has blue eyes, and knowing that one child is not blue-eyed decreases it.
Comments:
The idea of disjoint events is about whether or not it is possible for the events to occur at the same time (see the examples on the page for Basic Probability Rules).
The idea of independent events is about whether or not the events affect each other in the sense that the occurrence of one event affects the probability of the occurrence of the other (see the examples above).
The following activity deals with the distinction between these concepts.
The purpose of this activity is to help you strengthen your understanding about the concepts of disjoint events and independent events, and the distinction between them.
Let’s summarize the three parts of the activity:
Why did we leave out the case when the events are disjoint and independent?
The reason is that this case DOES NOT EXIST!
|                      | A and B Independent | A and B Not Independent |
|----------------------|---------------------|-------------------------|
| A and B Disjoint     | DOES NOT EXIST      | Example 3               |
| A and B Not Disjoint | Example 1           | Example 2               |
Why is that?
Now that we understand the idea of independent events, we can finally get to rules for finding P(A and B) in the special case in which the events A and B are independent.
Later we will present a more general version for use when the events are not necessarily independent.
We now turn to rules for calculating P(A and B), beginning with the Multiplication Rule for Independent Events.
Using a Venn diagram, we can visualize “A and B,” which is represented by the overlap between events A and B:
Probability Rule Six (The Multiplication Rule for Independent Events):
Comment:
Recall the blood type example:
Two people are selected simultaneously and at random from all people in the United States.
What is the probability that both have blood type O?
We need to find P(O1 and O2), where O1 and O2 denote the events that persons 1 and 2, respectively, have blood type O.
Since they were chosen simultaneously and at random, the blood type of one has no effect on the blood type of the other. Therefore, O1 and O2 are independent, and we may apply Rule 6:
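Assuming P(O) = 0.44, the U.S. figure used in the blood type table elsewhere in this course, the application of Rule 6 can be written as:

```python
p_o = 0.44          # P(O) for one randomly chosen person (assumed U.S. value)

# Rule 6: for independent events, P(O1 and O2) = P(O1) * P(O2)
p_both_o = p_o * p_o   # 0.1936, i.e., about a 19.4% chance
```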
Comments:
P(A or B) = P(A) + P(B) for disjoint events,
and a Multiplication Rule that says
P(A and B) = P(A) * P(B) for independent events.
The purpose of this comment is to point out the magnitude of P(A or B) and of P(A and B) relative to either one of the individual probabilities.
Since probabilities are never negative, the probability that one event or another occurs is always at least as large as either of the individual probabilities.
Since probabilities are never more than 1, finding the probability that one event and another both occur generally involves multiplying numbers that are less than 1, so the result can never be larger than either of the individual probabilities.
Here is an example:
Consider the event A that a randomly chosen person has blood type A.
Modify it to a more general event — that a randomly chosen person has blood type A or B — and the probability increases.
Modify it to a more specific (or restrictive) event — that not just one randomly chosen person has blood type A, but that out of two simultaneously randomly chosen people, person 1 will have type A and person 2 will have type B — and the probability decreases.
It is important to mention this in order to root out a common misconception.
Practically, you can use this comment to check yourself when solving problems.
For example, if you solve a problem that involves “or,” and the resulting probability is smaller than either one of the individual probabilities, then you know you have made a mistake somewhere.
Comment:
As you’ve seen, the last three rules that we’ve introduced (the Complement Rule, the Addition Rules, and the Multiplication Rule for Independent Events) are frequently used in solving problems.
Before we move on to our next rule, here are two comments that will help you apply these rules more effectively and to a broader range of problems.
Comment:
Three people are chosen simultaneously and at random.
What is the probability that all three have blood type B?
We’ll use the usual notation of B1, B2 and B3 for the events that persons 1, 2 and 3 have blood type B, respectively.
We need to find P(B1 and B2 and B3). Let’s solve this one together:
Here is another example that might be quite surprising.
A fair coin is tossed 10 times. Which of the following two outcomes is more likely?
(a) HHHHHHHHHH
(b) HTTHHTHTTH
In fact, they are equally likely. The 10 tosses are independent, so we’ll use the Multiplication Rule for Independent Events:
Here is the idea:
Our random experiment here is tossing a coin 10 times.
Therefore, while each specific sequence of 10 tosses is equally likely, the event "all heads" is much less likely than the event "5 heads and 5 tails,"
since there is only one possible outcome which gives all heads,
but many possible outcomes which give 5 heads and 5 tails.
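The counts behind this comparison can be checked directly by listing all possible sequences of 10 tosses (a sketch):

```python
from itertools import product
from math import comb

# All 2^10 = 1024 equally likely sequences of 10 coin tosses
outcomes = ["".join(t) for t in product("HT", repeat=10)]

p_sequence = (1 / 2) ** 10  # probability of any ONE particular sequence

all_heads = [o for o in outcomes if o.count("H") == 10]
five_heads = [o for o in outcomes if o.count("H") == 5]

len(all_heads)   # 1 sequence gives all heads
len(five_heads)  # comb(10, 5) = 252 sequences give 5 heads and 5 tails
```

So HHHHHHHHHH and HTTHHTHTTH each have probability 1/1024, but the *event* "5 heads and 5 tails" is 252 times more likely than the *event* "all heads."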
IMPORTANT Comments:
Now we will introduce the concept of conditional probability.
The idea here is that the probabilities of certain events may be affected by whether or not other events have occurred.
The term “conditional” refers to the fact that we will have additional conditions, restrictions, or other information when we are asked to calculate this type of probability.
Let’s illustrate this idea with a simple example:
All the students in a certain high school were surveyed, then classified according to gender and whether they had either of their ears pierced:
(Note that this is a two-way table of counts that was first introduced when we talked about the relationship between two categorical variables.
It is not surprising that we are using it again in this example, since we indeed have two categorical variables here:
Suppose a student is selected at random from the school.
Since a student is chosen at random from the group of 500 students, out of which 324 are pierced,
Since a student is chosen at random from the group of 500 students, out of which 180 are male,
Since a student is chosen at random from the group of 500 students out of which 36 are male and have their ear(s) pierced,
Now something new:
At this point, new notation is required, to express the probability of a certain event given that another event holds.
We will write
A word about this new notation:
We call this probability the
Now to find the probability, we observe that choosing from only the males in the school essentially alters the sample space from all students in the school to all male students in the school.
The total number of possible outcomes is no longer 500, but has changed to 180.
Out of those 180 males, 36 have ear(s) pierced, and thus:
A good visual illustration of this conditional probability is provided by the two-way table:
which shows us that conditional probability in this example is the same as the conditional percents we calculated back in section 1. In the above visual illustration, it is clear we are calculating a row percent.
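Using the counts from the table (500 students in total, of whom 180 are male and 36 are males with pierced ears), this row-percent calculation can be written as:

```python
total_students = 500
males = 180          # the restricted sample space once we are told "male"
pierced_males = 36   # males with one or both ears pierced

p_e_given_m = pierced_males / males     # P(E | M) = 36/180 = 0.2
p_e = 324 / total_students              # P(E) overall = 0.648, for comparison
```

Conditioning on "male" shrinks the sample space from 500 students to 180, and the probability is computed within that smaller group.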
Consider the piercing example, where the following two-way table is given,
Recall also that M represents the event of being a male (“not M” represents being a female), and E represents the event of having one or both ears pierced.
Another way to visualize conditional probability is using a Venn diagram:
In both the two-way table and the Venn diagram,
The two-way table illustrates the idea via counts, while the Venn diagram converts the counts to probabilities, which are presented as regions rather than cells.
We may work with counts, as presented in the two-way table, to write
Or we can work with probabilities, as presented in the Venn diagram, by writing
We will want, however, to write our formal expression for conditional probabilities in terms of other, ordinary, probabilities and therefore the definition of conditional probability will grow out of the Venn diagram.
Notice that
Probability Rule Seven (Conditional Probability Rule):
Comments:
Let’s see how we can use this formula in practice:
On the “Information for the Patient” label of a certain antidepressant, it is claimed that based on some clinical trials,
(a) Suppose that the patient experiences insomnia; what is the probability that the patient will also experience headache?
Since we know (or it is given) that the patient experienced insomnia, we are looking for P(H | I).
According to the definition of conditional probability:
(b) Suppose the drug induces headache in a patient; what is the probability that it also induces insomnia?
Here, we are given that the patient experienced headache, so we are looking for P(I | H).
Using the definition
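Both parts (a) and (b) can be computed together. The label's exact figures are given on the page itself; below we assume, purely for illustration, P(H) = 0.26, P(I) = 0.14, and P(H and I) = 0.05, values consistent with the 0.357 figure used later in this section:

```python
# Assumed illustrative values (consistent with the 0.357 quoted later):
p_h = 0.26        # P(H): patient experiences headache
p_i = 0.14        # P(I): patient experiences insomnia
p_h_and_i = 0.05  # P(H and I): patient experiences both

# Conditional Probability Rule: P(A | B) = P(A and B) / P(B)
p_h_given_i = p_h_and_i / p_i   # (a) P(H | I) ~ 0.357
p_i_given_h = p_h_and_i / p_h   # (b) P(I | H) ~ 0.192
```

Notice that P(H | I) and P(I | H) are different quantities; the order of conditioning matters.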
Comment:
Now that we have introduced conditional probability, try the interactive demonstration below which uses a Venn diagram to illustrate the basic probabilities we have been discussing.
Now you can investigate the conditional probabilities as well.
As we saw in the Exploratory Data Analysis section, whenever a situation involves more than one variable, it is generally of interest to determine whether or not the variables are related.
In probability, we talk about independent events, and earlier we said that two events A and B are independent if event A occurring does not affect the probability that event B will occur.
Now that we’ve introduced conditional probability, we can formalize the definition of independence of events and develop four simple ways to check whether two events are independent or not.
We will introduce these “independence checks” using examples, and then summarize.
Consider again the two-way table for all 500 students in a particular high school, classified according to gender and whether or not they have one or both ears pierced.
Would you expect those two variables to be related?
To answer this, we may compare the overall probability of having pierced ears to the conditional probability of having pierced ears, given that a student is male.
Our intuition would tell us that the latter should be lower:
Indeed, for students in general, the probability of having pierced ears (event E) is
But the probability of having pierced ears given that a student is male is only
As we anticipated, P(E | M) is lower than P(E).
The probability of a student having pierced ears changes (in this case, gets lower) when we know that the student is male, and therefore the events E and M are dependent.
Remember, if E and M were independent, knowing or not knowing that the student is male would not have made a difference … but it did.
The previous example illustrates that one method for determining whether two events are independent is to compare P(B | A) and P(B).
Similarly, using the same reasoning, we can compare P(A | B) and P(A).
Recall the side effects activity (from the bottom of the Basic Probability Rules page).
On the “Information for the Patient” label of a certain antidepressant, it is claimed that based on some clinical trials,
Are the two side effects independent of each other?
To check whether the two side effects are independent, let’s compare P(H | I) and P(H).
In the previous part of this section, we found that
Knowing that a patient experienced insomnia increases the likelihood that he/she will also experience headache from 0.26 to 0.357.
The conclusion therefore is that the two side effects are not independent; they are dependent.
Alternatively, we could have compared P(I | H) to P(I).
Again, since the two are not equal, we can conclude that the two side effects I and H are dependent.
Comment:
An alternative method of checking for dependence would be to compare P(E | M) with P(E | not M) [same as P(E | F)].
In our case, P(E | M) = 36/180 = 0.2, while P(E | not M) = 288/320 = 0.9, and since the two are very different, we can say that the events E and M are not independent.
In general, another method for checking the independence of events A and B is to compare P(B | A) and P(B | not A).
In other words, two events are independent if the probability of one event does not change whether we know that the other event has occurred or we know that the other event has not occurred.
It can be shown that P(B | A) and P(B | not A) would differ whenever P(B) and P(B | A) differ, so this is another perfectly legitimate way to establish dependence or independence.
Before we establish a general rule for independence, let’s consider an example that will illustrate another method that we can use to check whether two events are independent:
A group of 100 college students were surveyed about their gender and whether they had decided on a major.
Offhand, we wouldn’t necessarily have any compelling reason to expect that deciding on a major would depend on a student’s gender.
We can check for independence by comparing the overall probability of being decided to the probability of being decided given that a student is female:
The fact that the two are equal tells us that, as we might expect, deciding on a major is independent of gender.
Now let’s approach the issue of independence in a different way: first, we may note that the overall probability of being decided is 45/100 = 0.45.
And the overall probability of being female is 60/100 = 0.60.
If being decided is independent of gender, then 45% of the 60% of the class who are female should have a decided major;
in other words, the probability of being female and decided should equal the probability of being female multiplied by the probability of being decided.
If the events F and D are independent, we should have P(F and D) = P(F) * P(D).
Indeed, P(F and D) = 27/100 = 0.27, and P(F) * P(D) = 0.60 * 0.45 = 0.27, so the two are equal.
This confirms our alternate verification of independence.
In general, another method for checking the independence of events A and B is to
Let’s summarize all the possible methods we’ve seen for checking the independence of events in one rule:
Tests for Independent Events: Two events A and B are independent if any one of the following hold:
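As a sketch, here are all four checks applied to the decided-major example (100 students, 60 female, 45 decided, 27 both); exact fractions avoid rounding issues:

```python
from fractions import Fraction

# Counts from the major/gender example:
total, f, d, f_and_d = 100, 60, 45, 27

p_d = Fraction(d, total)            # P(D)
p_f = Fraction(f, total)            # P(F)
p_f_and_d = Fraction(f_and_d, total)

# The four equivalent independence checks:
check1 = Fraction(f_and_d, f) == p_d                               # P(D | F) = P(D)
check2 = Fraction(f_and_d, d) == p_f                               # P(F | D) = P(F)
check3 = Fraction(f_and_d, f) == Fraction(d - f_and_d, total - f)  # P(D | F) = P(D | not F)
check4 = p_f_and_d == p_f * p_d                                    # P(F and D) = P(F) * P(D)
# All four checks agree: F and D are independent.
```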
Comment:
The purpose of the next activity is to practice checking the independence of two events using the four different possible methods that we’ve provided, and see that all of them will lead us to the same conclusion, regardless of which of the four methods we use.
Now that we have an understanding of conditional probabilities and can express them with concise notation, and have a more formal understanding of what it means for two events to be independent, we can finally establish the General Multiplication Rule, a formal rule for finding P(A and B) that applies to any two events, whether they are independent or dependent.
We begin with an example that contrasts P(A and B) for independent and dependent cases.
Suppose you pick two cards at random from four cards consisting of one of each suit: club, diamond, heart, and spade, where the first card is replaced before the second card is picked.
What is the probability of picking a club and then a diamond?
Because the sampling is done with replacement, whether or not a diamond is picked on the second selection is independent of whether or not a club has been picked on the first selection.
Rule 6, the multiplication rule for independent events, tells us that:
Here we denote the event “club picked on first selection” as C1 and the event “diamond picked on second selection” as D2.
The display below shows that 1/4 of the time we’ll pick a club first, and of these times, 1/4 will result in a diamond on the second pick: 1/4 * 1/4 = 1/16 of the selections will have a club first and then a diamond.
Suppose you pick two cards at random from four cards consisting of one of each suit: club, diamond, heart, and spade, without replacing the first card before the second card is picked.
What is the probability of picking a club and then a diamond?
The probability in this case is not 1/4 * 1/4 = 1/16.
The probability of a club and then a diamond is 1/4 * 1/3 = 1/12: once the club is drawn and not replaced, only three cards remain, one of which is a diamond.
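We can confirm the without-replacement probability by listing every equally likely ordered pair of cards (a sketch):

```python
from itertools import permutations
from fractions import Fraction

cards = ["club", "diamond", "heart", "spade"]

# Without replacement: every ordered pair of DISTINCT cards is equally likely.
pairs = list(permutations(cards, 2))   # 4 * 3 = 12 equally likely outcomes
favorable = [p for p in pairs if p == ("club", "diamond")]

prob = Fraction(len(favorable), len(pairs))  # 1/12
```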
Using the notation of conditional probabilities, we can write
For independent events A and B, we had the rule P(A and B) = P(A) * P(B).
Due to independence, to find the probability of A and B, we could multiply the probability of A by the simple probability of B, because the occurrence of A would have no effect on the probability of B occurring.
Now, for events A and B that may be dependent, to find the probability of A and B, we multiply the probability of A by the conditional probability of B, taking into account that A has occurred.
Thus, our general multiplication rule is stated as follows:
General Multiplication Rule – Probability Rule Eight:
Comments:
Let’s look at another, more realistic example:
In a certain region, one in every thousand people (0.001) is infected by the HIV virus that causes AIDS.
Let H denote the event of having HIV, and T the event of testing positive.
(a) Express the information that is given in the problem in terms of the events H and T.
(b) Use the General Multiplication Rule to find the probability that someone chosen at random from the population has HIV and tests positive.
(c) If someone has HIV, what is the probability of testing negative? Here we need to find P(not T | H).
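The problem's test-accuracy figure is given on the page itself; below we use a hypothetical sensitivity of P(T | H) = 0.95 purely to illustrate how the General Multiplication Rule and the Complement Rule fit together:

```python
p_h = 0.001          # P(H): prevalence given in the problem (1 in 1000)
p_t_given_h = 0.95   # P(T | H): HYPOTHETICAL sensitivity, for illustration only

# (b) General Multiplication Rule: P(H and T) = P(H) * P(T | H)
p_h_and_t = p_h * p_t_given_h      # 0.00095 under the assumed sensitivity

# (c) Complement Rule applied within the condition "has HIV":
p_not_t_given_h = 1 - p_t_given_h  # P(not T | H) = 0.05
```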
The purpose of the next activity is to give you guided practice in expressing information in terms of conditional probabilities, and in using the General Multiplication Rule.
This section introduced you to the fundamental concepts of independent events and conditional probability — the probability of an event given that another event has occurred.
We saw that sometimes the knowledge that another event has occurred has no impact on the probability (when the two events are independent), and sometimes it does (when the two events are not independent).
We further discussed the idea of independence and discussed different ways to check whether two events are independent or not.
Understanding the concept of conditional probability also allowed us to introduce our final probability rule, the General Multiplication Rule.
The General Multiplication Rule tells us how to find P(A and B) when A and B are not necessarily independent.
In the previous section, we introduced probability as a way to quantify the uncertainty that arises from conducting experiments using a random sample from the population of interest.
We saw that the probability of an event (for example, the event that a randomly chosen person has blood type O) can be estimated by the relative frequency with which the event occurs in a long series of trials. So we would collect data from lots of individuals to estimate the probability of someone having blood type O.
In this section, we will establish the basic methods and principles for finding probabilities of events.
We will also cover some of the basic rules of probability which can be used to calculate probabilities.
We will begin with a classical probability example of tossing a fair coin three times.
Since heads and tails are equally likely for each toss in this scenario, each of the possibilities which can result from three tosses will also be equally likely so that we can list all possible values and use this list to calculate probabilities.
Since our focus in this course is on data and statistics (not theoretical probability), in most of our future problems we will use a summarized dataset, usually a frequency table or two-way table, to calculate probabilities.
Let’s list each possible outcome (or possible result):
{HHH, THH, HTH, HHT, HTT, THT, TTH, TTT}
Now let’s define the following events:
Event A: “Getting no H”
Event B: “Getting exactly one H”
Event C: “Getting at least one H”
Note that each event is indeed a statement about the outcome that the experiment is going to produce. In practice, each event corresponds to some collection (subset) of the possible outcomes.
Event A: “Getting no H” → TTT
Event B: “Getting exactly one H” → HTT, THT, TTH
Event C: “Getting at least one H” → HTT, THT, TTH, THH, HTH, HHT, HHH
Here is a visual representation of events A, B and C.
From this visual representation of the events, it is easy to see that event B is totally included in event C, in the sense that every outcome in event B is also an outcome in event C. Also, note that event A stands apart from events B and C, in the sense that they have no outcome in common, or no overlap. At this point these are only noteworthy observations, but as you’ll discover later, they are very important ones.
What if we added the new event:
Event D: “Getting a T on the first toss” → THH, THT, TTH, TTT
How would it look if we added event D to the diagram above? (Link to the answer)
Remember, since H and T are equally likely on each toss, and since there are 8 possible outcomes, the probability of each outcome is 1/8.
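These events and their probabilities can be computed by enumerating the sample space (a sketch; the event labels follow the text):

```python
from itertools import product
from fractions import Fraction

# All 8 equally likely outcomes of three tosses of a fair coin
outcomes = ["".join(t) for t in product("HT", repeat=3)]

def prob(event):
    """Probability of an event: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(outcomes))

A = [o for o in outcomes if o.count("H") == 0]   # "Getting no H"
B = [o for o in outcomes if o.count("H") == 1]   # "Getting exactly one H"
C = [o for o in outcomes if o.count("H") >= 1]   # "Getting at least one H"
D = [o for o in outcomes if o[0] == "T"]         # "T on the first toss"

prob(A), prob(B), prob(C), prob(D)  # 1/8, 3/8, 7/8, and 4/8 (= 1/2)
```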
See if you can answer the following questions using the diagrams and/or the list of outcomes for each event along with what you have learned so far about probability.
If you were able to answer those questions correctly, you likely have a good instinct for calculating probability! Read on to learn how we will apply this knowledge.
If not, we will try to help you develop this skill in this section.
Comment:
It is VERY important to realize that just because we can list out the possible outcomes, this does not imply that each outcome is equally likely.
This is the (funny) message in the Daily Show clip we provided on the previous page. But let’s think about this again. In that clip, Walter is claiming that since there are two possible outcomes, the probability is 0.5. The two possible outcomes are
Hopefully it is clear that these two outcomes are not equally likely!!
Let’s consider a more common example.
Suppose we randomly select three children and we are interested in the probability that none of the children have any birth defects.
We use the notation D to represent a child born with a birth defect and N to represent a child born with no birth defect. We can list the possible outcomes just as we did for the coin toss:
{DDD, NDD, DND, DDN, DNN, NDN, NND, NNN}
Are the events DDD (all three children are born with birth defects) and NNN (none of the children are born with birth defects) equally likely?
It should be reasonable to you that P(NNN) is much larger than P(DDD).
This is because P(N) and P(D) are not equally likely events.
It is rare (certainly not 50%) for a randomly selected child to be born with a birth defect.
Now we move on to learning some of the basic rules of probability.
Fortunately, these rules are very intuitive, and as long as they are applied systematically, they will let us solve more complicated problems; in particular, those problems for which our intuition might be inadequate.
Since most of the probabilities you will be asked to find can be calculated using both
and
we give the following advice as a principle.
PRINCIPLE:
If you can calculate a probability using logic and counting, you do not NEED a probability rule (although the correct rule can always be applied).
Our first rule simply reminds us of the basic property of probability that we’ve already learned.
The probability of an event, which informs us of the likelihood of it occurring, can range anywhere from 0 (indicating that the event will never occur) to 1 (indicating that the event is certain).
Probability Rule One:
NOTE: One practical use of this rule is that it can be used to identify any probability calculation that comes out to be more than 1 (or less than 0) as incorrect.
Before moving on to the other rules, let’s first look at an example that will provide a context for illustrating the next several rules.
As previously discussed, all human blood can be typed as O, A, B or AB.
In addition, the frequency of the occurrence of these blood types varies by ethnic and racial groups.
According to Stanford University’s Blood Center (bloodcenter.stanford.edu), these are the probabilities of human blood types in the United States (the probability for type A has been omitted on purpose):
Motivating question for rule 2: A person in the United States is chosen at random. What is the probability of the person having blood type A?
Answer: Our intuition tells us that since the four blood types O, A, B, and AB exhaust all the possibilities, their probabilities together must sum to 1, which is the probability of a “certain” event (a person has one of these 4 blood types for certain).
Since the probabilities of O, B, and AB together sum to 0.44 + 0.1 + 0.04 = 0.58, the probability of type A must be the remaining 0.42 (1 – 0.58 = 0.42):
This example illustrates our second rule, which tells us that the probability of all possible outcomes together must be 1.
Probability Rule Two:
The sum of the probabilities of all possible outcomes is 1.
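As a quick check of Rule Two with the blood type figures above (a sketch):

```python
p_o, p_b, p_ab = 0.44, 0.10, 0.04  # given blood-type probabilities

# Rule Two: all four probabilities must sum to 1, so the missing one is
p_a = 1 - (p_o + p_b + p_ab)  # 0.42
```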
This is a good place to compare and contrast what we’re doing here with what we learned in the Exploratory Data Analysis (EDA) section.
Even though what we’re doing here is indeed similar to what we’ve done in the EDA section, there is a subtle but important difference between the underlying situations.
In probability and in its applications, we are frequently interested in finding out the probability that a certain event will not occur.
An important point to understand here is that “event A does not occur” is a separate event that consists of all the possible outcomes that are not in A and is called “the complement event of A.”
Notation: we will write “not A” to denote the event that A does not occur. Here is a visual representation of how event A and its complement event “not A” together represent all possible outcomes.
Comment:
Rule 3 deals with the relationship between the probability of an event and the probability of its complement event.
Given that event A and event “not A” together make up all possible outcomes, and since rule 2 tells us that the sum of the probabilities of all possible outcomes is 1, the following rule should be quite intuitive:
Probability Rule Three (The Complement Rule):
Back to the blood type example:
Here is some additional information:
What is the probability that a randomly chosen person cannot donate blood to everyone? In other words, what is the probability that a randomly chosen person does not have blood type O? We need to find P(not O). Using the Complement Rule, P(not O) = 1 – P(O) = 1 – 0.44 = 0.56. In other words, 56% of the U.S. population does not have blood type O:
Clearly, we could also find P(not O) directly by adding the probabilities of B, AB, and A.
Comment:
Comments:
We will often be interested in finding probabilities involving multiple events such as
A common issue with terminology relates to how we usually think of “or” in our daily life. For example, when a parent says to his or her child in a toy store “Do you want toy A or toy B?”, this means that the child is going to get only one toy and he or she has to choose between them. Getting both toys is usually not an option.
In contrast:
In probability, “OR” means either one or the other or both.
and so P(A or B) = P(event A occurs or event B occurs or BOTH occur)
Having said that, it should be noted that there are some cases where it is simply impossible for the two events to both occur at the same time.
The distinction between events that can happen together and those that cannot is an important one.
Disjoint: Two events that cannot occur at the same time are called disjoint or mutually exclusive. (We will use disjoint.)
It should be clear from the picture that
Here are two examples:
Consider the following two events:
A — a randomly chosen person has blood type A, and
B — a randomly chosen person has blood type B.
In rare cases, it is possible for a person to have more than one type of blood flowing through his or her veins, but for our purposes, we are going to assume that each person can have only one blood type. Therefore, it is impossible for the events A and B to occur together.
On the other hand …
Consider the following two events:
A — a randomly chosen person has blood type A
B — a randomly chosen person is a woman.
In this case, it is possible for events A and B to occur together.
The Venn diagrams suggest that another way to think about disjoint versus not disjoint events is that disjoint events do not overlap. They do not share any of the possible outcomes, and therefore cannot happen together.
On the other hand, events that are not disjoint are overlapping in the sense that they share some of the possible outcomes and therefore can occur at the same time.
We now begin with a simple rule for finding P(A or B) for disjoint events.
Probability Rule Four (The Addition Rule for Disjoint Events):
Comment:
Recall the blood type example:
Here is some additional information
What is the probability that a randomly chosen person is a potential donor for a person with blood type A?
From the information given, we know that being a potential donor for a person with blood type A means having blood type A or O.
We therefore need to find P(A or O). Since the events A and O are disjoint, we can use the addition rule for disjoint events to get:
It is easy to see why adding the probabilities makes sense.
If 42% of the population has blood type A and 44% of the population has blood type O, then altogether 86% of the population has blood type A or blood type O.
This reasoning about why the addition rule makes sense can be visualized using the pie chart below:
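In code, the Addition Rule for Disjoint Events is just this addition. Here is a minimal Python sketch using the blood type probabilities from the example (the variable names are our own):

```python
# Probabilities of the disjoint blood types A and O, as given in the example
p_type_A = 0.42
p_type_O = 0.44

# Addition Rule for Disjoint Events: P(A or O) = P(A) + P(O)
# (rounded to avoid floating-point noise in the last digits)
p_potential_donor = round(p_type_A + p_type_O, 2)
print(p_potential_donor)  # 0.86
```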
Comment:
If A, B, and C are three disjoint events, then P(A or B or C) = P(A) + P(B) + P(C). The rule is the same for any number of disjoint events.
We are now finished with the first version of the Addition Rule (Rule four) which is the version restricted to disjoint events. Before covering the second version, we must first discuss P(A and B).
We now turn to calculating P(A and B), the probability that both events A and B occur.
Later, we will discuss the rules for calculating P(A and B).
First, we want to illustrate that a rule is not needed whenever you can determine the answer through logic and counting.
Special Case:
There is one special case for which we know what P(A and B) equals without applying any rule.
So, if events A and B are disjoint, then (by definition) P(A and B)= 0. But what if the events are not disjoint?
Recall that the Addition Rule has two versions: Rule 4, which is restricted to disjoint events and which we've already covered, and a more general version that we'll deal with later in this module. The same will be true of probabilities involving AND.
However, except in special cases, we will rely on LOGIC to find P(A and B) in this course.
Before covering any formal rules, let’s look at an example where the events are not disjoint.
We like to ask probability questions similar to the previous example (using a two-way table based upon data) as this allows you to make connections between these topics and helps you keep some of what you have learned about data fresh in your mind.
We are now ready to move on to the extended version of the Addition Rule.
In this section, we will learn how to find P(A or B) when A and B are not necessarily disjoint.
We will begin by stating the rule and providing an example similar to the types of problems we generally ask in this course. Then we will present another example where we do not have the raw data from a sample to work from.
Probability Rule Five:
NOTE: It is best to use logic to find P(A and B), not another formula.
A VERY common error is incorrectly applying the multiplication rule for independent events covered on the next page. This will only be correct if A and B are independent (see definitions to follow) which is rarely the case in data presented in two-way tables.
As we witnessed in previous examples, when the two events are not disjoint, there is some overlap between the events.
This rule is more general since it works for any pair of events (even disjoint events). Our advice is still to try to answer the question using logic and counting whenever possible, otherwise, we must be extremely careful to choose the correct rule for the problem.
PRINCIPLE:
If you can calculate a probability using logic and counting, you do not NEED a probability rule (although the correct rule can always be applied).
Notice that, if A and B are disjoint, then P(A and B) = 0 and rule 5 reduces to rule 4 for this special case.
Let’s revisit the last example:
Consider randomly selecting one individual from those represented in the following table regarding the periodontal status of individuals and their gender. Periodontal status refers to gum disease where individuals are classified as either healthy, have gingivitis, or have periodontal disease.
Let’s review what we have learned so far. We can calculate any probability in this scenario if we can determine how many individuals satisfy the event or combination of events.
We also previously found that
Recall rule 5, P(A or B) = P(A) + P(B) – P(A and B). We now use this rule to calculate P(Male OR Healthy)
We solved this question earlier by simply counting how many individuals are either Male or Healthy or both. The picture below illustrates the values we need to combine. We need to count
Using this logical approach we would find
We have a minor difference in our answers in the last decimal place due to the rounding that occurred when we calculated P(Male), P(Healthy), and P(Male and Healthy) and then applied rule 5.
Clearly the answer is effectively the same, about 70%. If we carried our answers to more decimal places or if we used the original fractions, we could eliminate this small discrepancy entirely.
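One way to avoid the rounding discrepancy entirely is to keep exact fractions rather than rounded decimals. The counts below are hypothetical stand-ins (the original table is not reproduced here), but the point carries over: with exact arithmetic, rule 5 and direct counting agree exactly.

```python
from fractions import Fraction

# Hypothetical two-way table counts (gender by periodontal status);
# the actual counts from the example would be used in practice.
counts = {
    ("Male",   "Healthy"):    600,
    ("Male",   "Gingivitis"): 300,
    ("Male",   "Perio"):      100,
    ("Female", "Healthy"):    700,
    ("Female", "Gingivitis"): 200,
    ("Female", "Perio"):      100,
}
total = sum(counts.values())

p_male = Fraction(sum(n for (g, s), n in counts.items() if g == "Male"), total)
p_healthy = Fraction(sum(n for (g, s), n in counts.items() if s == "Healthy"), total)
p_male_and_healthy = Fraction(counts[("Male", "Healthy")], total)

# Rule 5: P(Male or Healthy) = P(Male) + P(Healthy) - P(Male and Healthy)
p_rule5 = p_male + p_healthy - p_male_and_healthy

# Logic and counting: count everyone who is Male or Healthy (or both)
n_male_or_healthy = sum(n for (g, s), n in counts.items()
                        if g == "Male" or s == "Healthy")
p_counting = Fraction(n_male_or_healthy, total)

# With exact fractions there is no rounding, so the two answers match exactly
assert p_rule5 == p_counting
```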
Let’s look at one final example to illustrate Probability Rule 5 when the rule is needed – i.e. when we don’t have actual data.
It is vital that a certain document reach its destination within one day. To maximize the chances of on-time delivery, two copies of the document are sent using two services, service A and service B. It is known that the probabilities of on-time delivery are:
The Venn diagrams below illustrate the probabilities P(A), P(B), and P(A and B) [not drawn to scale]:
In the context of this problem, the obvious question of interest is:
The document will reach its destination on time as long as it is delivered on time by service A or by service B or by both services. In other words, it arrives on time when event A occurs or event B occurs or both occur, so…
P(on time delivery using this strategy) = P(A or B), which is represented by the shaded region in the diagram below:
We can now apply the General Addition Rule to find P(A or B).
This is shown in the following image:
If we apply this to our example, we find that:
So our strategy of using two delivery services increases our probability of on-time delivery to 0.95.
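As a sketch of the calculation: the individual probabilities below are assumed illustrative values chosen to be consistent with the stated final answer of 0.95 (the actual values appear in the diagrams, which are not reproduced here).

```python
# Assumed illustrative on-time probabilities, consistent with the
# stated result P(A or B) = 0.95:
p_A = 0.90          # service A delivers on time
p_B = 0.80          # service B delivers on time
p_A_and_B = 0.75    # both services deliver on time

# General Addition Rule (Rule 5): P(A or B) = P(A) + P(B) - P(A and B)
p_on_time = round(p_A + p_B - p_A_and_B, 2)
print(p_on_time)  # 0.95
```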
While the Venn diagrams were great for visualizing the General Addition Rule, in cases like these it is much easier to display and work with the information in a two-way table of probabilities, much as we examined the relationship between two categorical variables in the Exploratory Data Analysis section.
We will simply show you the table, not derive it, as you won't be asked to do this yourself. You should be able to see that some logic and simple addition/subtraction is all we used to fill in the table below.
When using a two-way table, we must remember to look at the entire row or column to find overall probabilities involving only A or only B.
Comment
Follow these general guidelines in this course: if in doubt, carry more decimal places; if we specify a precision, give exactly what is requested.
Many computer packages might display extremely small values using scientific notation, such as 1.2E-5 (meaning 1.2 × 10^-5).
So far in our study of probability, you have been introduced to the sometimes counter-intuitive nature of probability and the fundamentals that underlie probability, such as relative frequency.
We also gave you some tools to help you find the probabilities of events — namely the probability rules.
You probably noticed that the probability section was significantly different from the two previous sections; it has a much larger technical/mathematical component, so the results tend to be more of the “right or wrong” nature.
In the Exploratory Data Analysis section, for the most part, the computer took care of the technical aspect of things, and our tasks were to tell it to do the right thing and then interpret the results.
In probability, we do the work from beginning to end, from choosing the right tool (rule) to use, to using it correctly, to interpreting the results.
Here is a summary of the rules we have presented so far.
1. Probability Rule #1 states:
2. Probability Rule #2 states:
3. The Complement Rule (#3) states that
or when rearranged
The latter representation of the Complement Rule is especially useful when we need to find probabilities of events of the sort “at least one of …”
4. The General Addition Rule (#5) states that for any two events,
where, by P(A or B) we mean P(A occurs or B occurs or both).
In the special case of disjoint events, events that cannot occur together, the General Addition Rule can be reduced to the Addition Rule for Disjoint Events (#4), which is
*ONLY use when you are CONVINCED the events are disjoint (they do NOT overlap)
5. The restricted version of the addition rule (for disjoint events) can be easily extended to more than two events.
6. So far, we have only found P(A and B) using logic and counting in simple examples.
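As a quick sketch of the "at least one of …" use of the Complement Rule summarized above, consider the probability of getting at least one head in ten tosses of a fair coin (our own example, not one from the text):

```python
# Ten tosses of a fair coin give 2**10 equally likely outcome sequences;
# exactly one of them (all tails) contains no heads.
p_no_heads = 1 / 2**10

# Complement Rule, rearranged: P(at least one head) = 1 - P(no heads)
p_at_least_one_head = 1 - p_no_heads
print(p_at_least_one_head)  # 0.9990234375
```

Working with the complement is much easier than adding up the probabilities of getting exactly one, two, …, or ten heads directly.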
Now that we understand how probability fits into the Big Picture as a key element behind statistical inference, we are ready to learn more about it. Our first goal is to introduce some fundamental terminology (the language) and notation that is used when discussing probability.
Although most of the probability calculations we will conduct will be rather intuitive due to their simplicity, we start with two fun examples that will illustrate the interesting and sometimes complex nature of probability.
Often, relying only on our intuition is not enough to determine probability, so we’ll need some tools to work with, which is exactly what we’ll study in this section.
Here is the first of two motivating examples:
“Let’s Make a Deal” was the name of a popular television game show, which first aired in the 1960s. The “Let’s Make a Deal” Paradox is named after that show. In the show, the contestant had to choose between three doors. One of the doors had a big prize behind it such as a car or a lot of cash, and the other two were empty. (Actually, for entertainment’s sake, each of the other two doors had some stupid gift behind it, like a goat or a chicken, but we’ll refer to them here as empty.)
The contestant had to choose one of the three doors, but instead of revealing the chosen door, the host revealed one of the two unchosen doors to be empty. At this point of the game, there were two unopened doors (one of which had the prize behind it) — the door that the contestant had originally chosen and the remaining unchosen door.
The contestant was given the option either to stay with the door that he or she had initially chosen, or switch to the other door.
What do you think the contestant should do, stay or switch? What do you think is the probability that you will win the big prize if you stay? What about if you switch?
In order for you to gain a feel for this game, you can play it a few times using an applet.
Now, what do you think a contestant should do?
The intuition of most people is that the chance of winning is equal whether we stay or switch — that there is a 50-50 chance of winning with either selection. This, however, is not the case.
Actually, there is a 67% chance — or a probability of 2/3 (2 out of 3) — of winning by switching, and only a 33% chance — or a probability of 1/3 (1 out of 3) — of winning by staying with the door that was originally chosen.
This means that a contestant is twice as likely to win if he/she switches to the unchosen door. Isn’t this a bit counterintuitive and confusing? Most people think so, when they are first faced with this problem.
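Before working through the explanations, a short simulation (our own sketch, not part of the original lesson, and in the same spirit as the applet mentioned above) can verify the claimed probabilities:

```python
import random

def play(switch, rng):
    """Play one round of the game; return True if the contestant wins."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    choice = rng.choice(doors)
    # The host opens an empty door that the contestant did not choose.
    opened = rng.choice([d for d in doors if d != choice and d != prize])
    if switch:
        # Switch to the one remaining unopened, unchosen door.
        choice = [d for d in doors if d != choice and d != opened][0]
    return choice == prize

rng = random.Random(0)  # fixed seed so the run is reproducible
trials = 100_000
wins_switch = sum(play(switch=True, rng=rng) for _ in range(trials))
wins_stay = sum(play(switch=False, rng=rng) for _ in range(trials))

print(wins_switch / trials)  # close to 2/3
print(wins_stay / trials)    # close to 1/3
```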
We will now try to explain this paradox to you in two different ways:
If you are still not convinced (or even if you are), here is a different way of explaining the paradox:
If this example still did not persuade you that probability is not always intuitive, the next example should definitely do the trick.
Suppose that you are at a party with 59 other people (for a total of 60). What are the chances (or, what is the probability) that at least 2 of the 60 guests share the same birthday?
To clarify, by “share the same birthday,” we mean that 2 people were born on the same date, not necessarily in the same year. Also, for the sake of simplicity, ignore leap years, and assume that there are 365 days in each year.
Indeed, there is a 99.4% chance that at least 2 of the 60 guests share the same birthday. In other words, it is almost certain that at least 2 of the guests share the same birthday. This is very counterintuitive.
Unlike the “Let’s Make a Deal” example, for this scenario, we don’t really have a good step-by-step explanation that will give you insight into this surprising answer.
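There is, however, a short direct calculation: find the probability that all 60 birthdays are different, then subtract it from 1. A Python sketch (assuming 365 equally likely birthdays, as stated above):

```python
def p_shared_birthday(n):
    """P(at least two of n people share a birthday), 365 equally likely days."""
    p_all_different = 1.0
    for i in range(n):
        # The next person must avoid the i birthdays already taken.
        p_all_different *= (365 - i) / 365
    return 1 - p_all_different

print(round(p_shared_birthday(60), 3))  # 0.994
```

This reproduces the 99.4% figure stated above, even though the result remains hard to believe intuitively.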
From these two examples, (maybe) you have seen that your original hunches cannot always be counted upon to give you correct predictions of probabilities.
We won't dwell on these examples any further, as they are from the "harder" end of the complexity spectrum, but hopefully they have motivated you to learn more about probability. You do not need to be convinced of their solutions to continue!
In general, probability is not always intuitive.
Watch this (funny) video which has an excellent point about “how probability DOES NOT work”: clip from the Daily Show with Jon Stewart about the Large Hadron Collider (5:58).
It is possible that viewers in other countries may not be able to view the clip from this source. You may or may not be able to find it online through searching. Here is the transcript summary I sometimes use in class to get the point across (it isn't quite as funny, but I think you can still figure out what is wrong here):
And … John Oliver is correct! :-)
Eventually we will need to develop a more formal approach to probability, but we will begin with an informal discussion of what probability is.
Probability is a mathematical description of randomness and uncertainty. It is a way to measure or quantify uncertainty. Another way to think about probability is that it is the official name for “chance.”
One way to think of probability is that it is the likelihood that something will occur.
Probability is used to answer the following types of questions:
Each of these examples has some uncertainty. For some, the chances are quite good, so the probability would be quite high. For others, the chances are not very good, so the probability is quite low (especially winning the lottery).
Certainly, the chance of rain is different each day, and is higher during some seasons. Your chance of having a heart attack, or of living longer than 70 years, depends on things like your current age, your family history, and your lifestyle. However, you could use your intuition to predict some of those probabilities fairly accurately, while others you might have no instinct about at all.
We think you will agree that the word probability is a bit long to include in equations, graphs and charts, so it is customary to use some simplified notation instead of the entire word.
If we wish to indicate “the probability it will rain tomorrow,” we use the notation “P(rain tomorrow).” We can abbreviate the probability of anything. If we let A represent what we wish to find the probability of, then P(A) would represent that probability.
We can think of “A” as an “event.”
NOTATION | MEANING |
P(win lottery) | the probability that a person who has a lottery ticket will win that lottery |
P(A) | the probability that event A will occur |
P(B) | the probability that event B will occur |
What values can the probability of an event take, and what does the value tell us about the likelihood of the event occurring?
Many people prefer to express probability in percentages. Since all probabilities are numbers between 0 and 1, each can be expressed as an equivalent percentage. Thus, the latest principle is equivalent to saying, “The chance that an event will occur is between 0% and 100%.”
Probabilities can be determined in two fundamental ways. Keep reading to find out what they are.
There are 2 fundamental ways in which we can determine probability:
Classical methods are used for games of chance, such as flipping coins, rolling dice, spinning spinners, roulette wheels, or lotteries.
The probabilities in this case are determined by the game (or scenario) itself and are often found relatively easily using logic and/or probability rules.
Although we will not focus on this type of probability in this course, we will mention a few examples to get you thinking about probability and how it works.
A coin has two sides; we usually call them “heads” and “tails.”
For a “fair” coin (one that is not unevenly weighted, and does not have identical images on both sides) the chances that a “flip” will result in either side facing up are equally likely.
Thus, P(heads) = P(tails) = 1/2 or 0.5.
Letting H represent “heads,” we can abbreviate the probability: P(H) = 0.5.
Classical probabilities can also be used for more realistic and useful situations.
A practical use of a coin flip would be for you and your roommate to decide randomly who will go pick up the pizza you ordered for dinner. A common expression is “Let’s flip for it.” This is because a coin can be used to make a random choice with two options. Many sporting events begin with a coin flip to determine which side of the field or court each team will play on, or which team will have control of the ball first.
Each traditional (cube-shaped) die has six sides, marked in dots with the numbers 1 through 6.
On a “fair” die, these numbers are equally likely to end up face-up when the die is rolled.
Thus, P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6 or about 0.167.
Here, again, is a practical use of classical probability.
Suppose six people go out to dinner. You want to randomly decide who will pick up the check and pay for everyone. Again, the P(each person) = 1/6.
This particular spinner has three colors, but the colors are not all equally likely to be the result of a spin, since the portions are not the same size.
Since the blue is half of the spinner, P(blue) = 1/2. The red and yellow make up the other half of the spinner and are the same size. Thus, P(red) = P(yellow) = 1/4.
Suppose there are 2 freshmen, 1 sophomore, and 1 junior in a study group. You want to select one person. The P(F) = 2/4 = 1/2; P(S) = 1/4; and P(J) = 1/4, just like the spinner.
Suppose we had three students and wished to select one of them randomly. To do this you might have each person write his/her name on a (same-sized) piece of paper, then put the three papers in a hat, and select one paper from the hat without looking.
Since we are selecting randomly, each is equally likely to be chosen. Thus, each has a probability of 1/3 of being chosen.
A slightly more complicated, but more interesting, probability question would be to propose selecting 2 of the students pictured above, and ask, “What is the probability that the two students selected will be different genders?”
We will now shift our discussion to empirical ways to determine probabilities.
A single flip of a coin has an uncertain outcome. So, every time a coin is flipped, the outcome of that flip is unknown until the flip occurs.
However, if you flip a fair coin over and over again, would you expect the observed proportion of heads to be exactly 0.5? In other words, would you expect there to be exactly the same number of “heads” results as “tails” results?
The following activity will allow you to discover the answer.
The above Learn by Doing activity was our first example of the second way of determining probability: Empirical (Observational) methods. In the activity, we determined that the probability of getting the result “heads” is 0.5 by flipping a fair coin many, many times.
After doing this experiment, an important question naturally comes to mind. How would we know if the coin was not fair? Certainly, classical probability methods would never be able to answer this question. In addition, classical methods could never tell us the actual P(H). The only way to answer this question is to perform another experiment.
The next activity will allow you to do just that.
So, these types of experiments can verify classical probabilities and they can also determine when games of chance are not following fair practices. However, their real importance is to answer probability questions that arise when we are faced with a situation that does not follow any pattern and cannot be predetermined. In reality, most of the probabilities of interest to us fit the latter description.
If we toss a coin, roll a die, or spin a spinner many times, we hardly ever achieve the exact theoretical probabilities that we know we should get, but we can get pretty close. When we run a simulation or when we use a random sample and record the results, we are using empirical probability. This is often called the Relative Frequency definition of probability.
Here is a realistic example where the relative frequency method was used to find the probabilities:
Researchers discovered at the beginning of the 20th century that human blood comes in various types (A, B, AB, and O), and that some types are more common than others. How could researchers determine the probability of a particular blood type, say O?
Just looking at one or two or a handful of people would not be very helpful in determining the overall chance that a randomly chosen person would have blood type O. But sampling many people at random, and finding the relative frequency of blood type O occurring, provides an adequate estimate.
For example, it is now well known that the probability of blood type O among white people in the United States is 0.45. This was found by sampling many (say, 100,000) white people in the country, finding that roughly 45,000 of them had blood type O, and then using the relative frequency: 45,000 / 100,000 = 0.45 as the estimate for the probability for the event “having blood type O.”
(Comment: Note that there are racial and ethnic differences in the probabilities of blood types. For example, the probability of blood type O among black people in the United States is 0.49, and the probability that a randomly chosen Japanese person has blood type O is only 0.3).
Let’s review the relative frequency method for finding probabilities:
To estimate the probability of event A, written P(A), we may repeat the random experiment many times and count the number of times event A occurs. Then P(A) is estimated by the ratio of the number of times A occurs to the number of repetitions, which is called the relative frequency of event A.
So, we’ve seen how the relative frequency idea works, and hopefully the activities have convinced you that the relative frequency of an event does indeed approach the theoretical probability of that event as the number of repetitions increases. This is called the Law of Large Numbers.
The Law of Large Numbers states that as the number of trials increases, the relative frequency approaches the actual probability. So, using this law, as the number of trials increases, the empirical probability gets closer and closer to the theoretical probability.
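A quick simulation (our own sketch, with a fixed seed so the run is reproducible) illustrates the Law of Large Numbers for a fair coin:

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

# Flip a simulated fair coin 100,000 times, recording the relative
# frequency of heads at a few checkpoints along the way.
heads = 0
checkpoints = {}
for flip in range(1, 100_001):
    heads += rng.random() < 0.5  # True counts as 1, False as 0
    if flip in (100, 10_000, 100_000):
        checkpoints[flip] = heads / flip

# With more and more flips, the relative frequency tends to settle
# near the true probability, 0.5.
print(checkpoints)
```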
Comments:
Probability is a way of quantifying uncertainty. In this section, we defined probability as the likelihood or chance that something will occur and introduced the basic notation of probability such as P(win lottery).
You have seen that all probabilities are values between 0 and 1, where an event with no chance of occurring has a probability of 0 and an event which will always occur has a probability of 1.
We have discussed the two primary methods of calculating probabilities
In our course we will focus on Empirical probability and will often calculate probabilities from a sample using relative frequencies.
This is useful in practice since the Law of Large Numbers allows us to estimate the actual (or true) probability of an event by the relative frequency with which the event occurs in a long series of trials. We can collect this information as data and we can analyze this data using statistics.