This document is linked from Conditional Probability and Independence.
Review: Unit 1 Case C-C
In the last section, we established some of the basic rules of probability, which included:
In order to complete our set of rules, we still require two Multiplication Rules for finding P(A and B) and the important concepts of independent events and conditional probability.
We’ll first introduce the idea of independent events, then introduce the Multiplication Rule for independent events which gives a way to find P(A and B) in cases when the events A and B are independent.
Next we will define conditional probability and use it to formalize our definition of independent events, which is initially presented only in an intuitive way.
We will then develop the General Multiplication Rule, a rule that will tell us how to find P(A and B) in cases when the events A and B are not necessarily independent.
We’ll conclude with a discussion of probability applications in the health sciences.
We begin with a verbal definition of independent events (later we will use probability notation to define this more precisely).
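Independent Events: Two events A and B are said to be independent if the fact that one event has occurred does not affect the probability that the other event will occur.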
Here are a few examples:
A woman’s pocket contains two quarters and two nickels.
She randomly extracts one of the coins and, after looking at it, replaces it before picking a second coin.
Let Q1 be the event that the first coin is a quarter and Q2 be the event that the second coin is a quarter.
Are Q1 and Q2 independent events?
Since the first coin that was selected is replaced, whether or not Q1 occurred (i.e., whether the first coin was a quarter) has no effect on the probability that the second coin will be a quarter, P(Q2).
In either case (whether Q1 occurred or not), when she is selecting the second coin, she has in her pocket two quarters and two nickels,
and therefore P(Q2) = 2/4 = 1/2 regardless of whether Q1 occurred. Q1 and Q2 are therefore independent.
A woman’s pocket contains two quarters and two nickels.
She randomly extracts one of the coins, and without placing it back into her pocket, she picks a second coin.
As before, let Q1 be the event that the first coin is a quarter, and Q2 be the event that the second coin is a quarter.
Are Q1 and Q2 independent events?
Since the first coin that was selected is not replaced, whether Q1 occurred (i.e., whether the first coin was a quarter) does affect the probability that the second coin is a quarter, P(Q2).
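If Q1 occurred (i.e., the first coin was a quarter), then when the woman is selecting the second coin, she has in her pocket one quarter and two nickels, so the probability that the second coin is a quarter is 1/3.
However, if Q1 has not occurred (i.e., the first coin was not a quarter, but a nickel), then when the woman is selecting the second coin, she has in her pocket two quarters and one nickel, so the probability that the second coin is a quarter is 2/3. Since the probability of Q2 depends on whether Q1 occurred, Q1 and Q2 are not independent.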
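The two coin examples can be checked by direct counting. The following is a minimal sketch, not part of the original course material; the function name `prob_second_quarter` and the "Q"/"N" labels are illustrative choices.

```python
from fractions import Fraction

def prob_second_quarter(pocket, first_coin, replace):
    """P(second coin is a quarter) once the first coin drawn is known."""
    remaining = list(pocket)
    if not replace:
        remaining.remove(first_coin)  # the first coin stays out of the pocket
    quarters = sum(1 for coin in remaining if coin == "Q")
    return Fraction(quarters, len(remaining))

pocket = ["Q", "Q", "N", "N"]  # two quarters and two nickels

for replace in (True, False):
    after_quarter = prob_second_quarter(pocket, "Q", replace)
    after_nickel = prob_second_quarter(pocket, "N", replace)
    label = "with replacement" if replace else "without replacement"
    print(f"{label}: {after_quarter} after a quarter, {after_nickel} after a nickel")
```

With replacement the two probabilities agree (the events are independent); without replacement they differ (the events are dependent).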
In these last two examples, we could actually have done some calculation in order to check whether or not the two events are independent.
Sometimes we can just use common sense to guide us as to whether two events are independent. Here is an example.
Two people are selected simultaneously and at random from all people in the United States.
Let B1 be the event that one of the people has blue eyes and B2 be the event that the other person has blue eyes.
In this case, since they were chosen at random, whether one of them has blue eyes has no effect on the likelihood that the other one has blue eyes, and therefore B1 and B2 are independent.
On the other hand …
A family has 4 children, two of whom are selected at random.
Let B1 be the event that one child has blue eyes, and B2 be the event that the other chosen child has blue eyes.
In this case, B1 and B2 are not independent, since we know that eye color is hereditary.
Thus, whether or not one child is blue-eyed will increase or decrease the chances that the other child has blue eyes, respectively.
Comments:
The idea of disjoint events is about whether or not it is possible for the events to occur at the same time (see the examples on the page for Basic Probability Rules).
The idea of independent events is about whether or not the events affect each other in the sense that the occurrence of one event affects the probability of the occurrence of the other (see the examples above).
The following activity deals with the distinction between these concepts.
The purpose of this activity is to help you strengthen your understanding of the concepts of disjoint events and independent events, and the distinction between them.
Let’s summarize the three parts of the activity:
Why did we leave out the case when the events are disjoint and independent?
The reason is that this case DOES NOT EXIST!
| A and B Independent | A and B Not Independent |
A and B Disjoint | DOES NOT EXIST | Example 3 |
A and B Not Disjoint | Example 1 | Example 2 |
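Why is that? If A and B are disjoint, then knowing that A has occurred tells us that B cannot occur, so the probability of B drops to 0. As long as B had some chance of occurring to begin with, the occurrence of A has changed the probability of B, and so disjoint events cannot be independent.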
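A small numeric illustration may help here. The sketch below uses an assumed example that is not from the original text: one roll of a fair six-sided die, with A = "the roll is 1 or 2" and B = "the roll is 3 or 4". These events are disjoint, so P(A and B) = 0, yet the product P(A) * P(B) is not 0, which is what independence would require.

```python
from fractions import Fraction

# Assumed example (not from the text): one roll of a fair six-sided die.
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event, given equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {1, 2}
B = {3, 4}

p_a_and_b = prob(A & B)          # disjoint events share no outcomes
product = prob(A) * prob(B)      # what P(A and B) would equal under independence

print("P(A and B) =", p_a_and_b)
print("P(A) * P(B) =", product)
print("independent:", p_a_and_b == product)
```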
Now that we understand the idea of independent events, we can finally get to rules for finding P(A and B) in the special case in which the events A and B are independent.
Later we will present a more general version for use when the events are not necessarily independent.
We now turn to rules for calculating P(A and B), the probability that both event A and event B occur, beginning with the multiplication rule for independent events.
Using a Venn diagram, we can visualize “A and B,” which is represented by the overlap between events A and B:
Probability Rule Six (The Multiplication Rule for Independent Events): If A and B are two independent events, then P(A and B) = P(A) * P(B).
Comment:
Recall the blood type example:
Two people are selected simultaneously and at random from all people in the United States.
What is the probability that both have blood type O?
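We need to find P(O1 and O2), where O1 and O2 are the events that the first and the second person, respectively, have blood type O.
Since they were chosen simultaneously and at random, the blood type of one has no effect on the blood type of the other. Therefore, O1 and O2 are independent, and we may apply Rule 6: P(O1 and O2) = P(O1) * P(O2).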
Comments:
We now have an Addition Rule that says
P(A or B) = P(A) + P(B) for disjoint events,
and a Multiplication Rule that says
P(A and B) = P(A) * P(B) for independent events.
The purpose of this comment is to point out the magnitude of P(A or B) and of P(A and B) relative to either one of the individual probabilities.
Since probabilities are never negative, the probability of one event or another is always at least as large as either of the individual probabilities.
Since probabilities are never more than 1, the probability of one event and another involves multiplying by a number that is at most 1, and therefore can never be more than either of the individual probabilities.
Here is an example:
Consider the event A that a randomly chosen person has blood type A.
Modify it to a more general event — that a randomly chosen person has blood type A or B — and the probability increases.
Modify it to a more specific (or restrictive) event — that not just one randomly chosen person has blood type A, but that out of two simultaneously randomly chosen people, person 1 will have type A and person 2 will have type B — and the probability decreases.
It is important to mention this in order to root out a common misconception.
Practically, you can use this comment to check yourself when solving problems.
For example, if you solve a problem that involves “or,” and the resulting probability is smaller than either one of the individual probabilities, then you know you have made a mistake somewhere.
Comment:
As you’ve seen, the last three rules that we’ve introduced (the Complement Rule, the Addition Rules, and the Multiplication Rule for Independent Events) are frequently used in solving problems.
Before we move on to our next rule, here are two comments that will help you use these rules more effectively and in a broader range of problems.
Comment:
Three people are chosen simultaneously and at random.
What is the probability that all three have blood type B?
We’ll use the usual notation of B1, B2 and B3 for the events that persons 1, 2 and 3 have blood type B, respectively.
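We need to find P(B1 and B2 and B3). Since the three people were chosen at random, the three events are independent, and the Multiplication Rule for Independent Events extends to more than two events: P(B1 and B2 and B3) = P(B1) * P(B2) * P(B3).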
Here is another example that might be quite surprising.
A fair coin is tossed 10 times. Which of the following two outcomes is more likely?
(a) HHHHHHHHHH
(b) HTTHHTHTTH
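In fact, they are equally likely. The 10 tosses are independent, so we’ll use the Multiplication Rule for Independent Events: each toss lands on its specified face with probability 1/2, so P(HHHHHHHHHH) = (1/2)^10 = 1/1024 and P(HTTHHTHTTH) = (1/2)^10 = 1/1024.
Here is the idea:
Our random experiment here is tossing a coin 10 times, which has 2^10 = 1,024 possible outcomes, all equally likely.
Therefore, while an outcome with 5 heads and 5 tails is more likely than an outcome with all heads, since there is only one possible outcome which gives all heads and many possible outcomes which give 5 heads and 5 tails, any two specific outcomes, such as (a) and (b), are equally likely.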
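The distinction between one specific sequence and a whole class of sequences can be made concrete with a short calculation. This is an illustrative sketch; the helper name `sequence_probability` is not from the original text.

```python
from fractions import Fraction
from math import comb

def sequence_probability(sequence):
    """Probability of one specific sequence of fair, independent tosses."""
    return Fraction(1, 2) ** len(sequence)

p_a = sequence_probability("HHHHHHHHHH")
p_b = sequence_probability("HTTHHTHTTH")

# Number of distinct 10-toss sequences with a given number of heads.
n_all_heads = comb(10, 10)
n_five_heads = comb(10, 5)

# Probability of getting exactly 5 heads in any order.
p_exactly_five = n_five_heads * Fraction(1, 2) ** 10

print("P(a) =", p_a, " P(b) =", p_b)
print("sequences with 10 heads:", n_all_heads)
print("sequences with 5 heads:", n_five_heads)
print("P(exactly 5 heads) =", p_exactly_five)
```

Each specific sequence has probability 1/1024, but 252 different sequences contain exactly 5 heads, while only one contains 10 heads.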
IMPORTANT Comments:
Now we will introduce the concept of conditional probability.
The idea here is that the probabilities of certain events may be affected by whether or not other events have occurred.
The term “conditional” refers to the fact that we will have additional conditions, restrictions, or other information when we are asked to calculate this type of probability.
Let’s illustrate this idea with a simple example:
All the students in a certain high school were surveyed, then classified according to gender and whether they had either of their ears pierced:
| Pierced | Not Pierced | Total |
Male | 36 | 144 | 180 |
Female | 288 | 32 | 320 |
Total | 324 | 176 | 500 |
(Note that this is a two-way table of counts, which was first introduced when we talked about the relationship between two categorical variables.
It is not surprising that we are using it again in this example, since we indeed have two categorical variables here: gender and whether the student has pierced ears.)
Suppose a student is selected at random from the school.
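Since a student is chosen at random from the group of 500 students, out of which 324 have pierced ears, the probability that the student has one or both ears pierced (event E) is P(E) = 324/500 = 0.648.
Since a student is chosen at random from the group of 500 students, out of which 180 are male, the probability that the student is male (event M) is P(M) = 180/500 = 0.36.
Since a student is chosen at random from the group of 500 students, out of which 36 are male and have their ear(s) pierced, P(M and E) = 36/500 = 0.072.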
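Now something new: suppose we are given that the chosen student is male, and we want the probability that the student has one or both ears pierced.
At this point, new notation is required, to express the probability of a certain event given that another event holds.
We will write P(E | M), read as “the probability of E (pierced ears) given M (male).”
A word about this new notation: the event whose probability we seek is written first, the vertical bar stands for “given,” and the event that is known to have occurred is written after the bar.
We call this probability the conditional probability of having one or both ears pierced, given that a student is male.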
Now to find the probability, we observe that choosing from only the males in the school essentially alters the sample space from all students in the school to all male students in the school.
The total number of possible outcomes is no longer 500, but has changed to 180.
Out of those 180 males, 36 have ear(s) pierced, and thus P(E | M) = 36/180 = 0.20.
A good visual illustration of this conditional probability is provided by the two-way table:
which shows us that conditional probability in this example is the same as the conditional percents we calculated back in section 1. In the above visual illustration, it is clear we are calculating a row percent.
Consider the piercing example, where the following two-way table is given,
Recall also that M represents the event of being a male (“not M” represents being a female), and E represents the event of having one or both ears pierced.
Another way to visualize conditional probability is using a Venn diagram:
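In both the two-way table and the Venn diagram, the sample space is reduced to the male students only, and within this reduced sample space we look at the event of interest, having pierced ears.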
The two-way table illustrates the idea via counts, while the Venn diagram converts the counts to probabilities, which are presented as regions rather than cells.
We may work with counts, as presented in the two-way table, to write P(E | M) = 36/180 = 0.20.
Or we can work with probabilities, as presented in the Venn diagram, by writing P(E | M) = (36/500) / (180/500) = 0.072/0.36 = 0.20.
We will want, however, to write our formal expression for conditional probabilities in terms of other, ordinary, probabilities and therefore the definition of conditional probability will grow out of the Venn diagram.
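Notice that P(E | M) = P(M and E) / P(M).
Probability Rule Seven (Conditional Probability Rule): The conditional probability of event B, given event A, is P(B | A) = P(A and B) / P(A), provided P(A) > 0.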
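As a quick check that the count-based and probability-based routes agree, here is a minimal sketch using the piercing counts quoted in this section (500 students, 180 males, 36 males with pierced ears). The variable names are illustrative.

```python
from fractions import Fraction

total = 500          # all students surveyed
males = 180          # students who are male (event M)
males_pierced = 36   # students who are male and have pierced ears (M and E)

# Route 1: restrict the sample space to males and count.
via_counts = Fraction(males_pierced, males)

# Route 2: conditional probability as P(M and E) / P(M).
p_m = Fraction(males, total)
p_m_and_e = Fraction(males_pierced, total)
via_probabilities = p_m_and_e / p_m

print("P(E | M) from counts:", float(via_counts))
print("P(E | M) from probabilities:", float(via_probabilities))
```

The division by the total (500) cancels, which is why both routes give the same answer.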
Comments:
Let’s see how we can use this formula in practice:
On the “Information for the Patient” label of a certain antidepressant, it is claimed that based on some clinical trials,
(a) Suppose that the patient experiences insomnia; what is the probability that the patient will also experience headache?
Since we know (or it is given) that the patient experienced insomnia, we are looking for P(H | I).
According to the definition of conditional probability, P(H | I) = P(H and I) / P(I).
(b) Suppose the drug induces headache in a patient; what is the probability that it also induces insomnia?
Here, we are given that the patient experienced headache, so we are looking for P(I | H).
Using the definition of conditional probability, P(I | H) = P(I and H) / P(H).
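The calculation for both parts can be sketched as follows. Note the assumptions: the label's figures are not reproduced in this text, so P(I) = 0.14 and P(H and I) = 0.05 below are assumed values, chosen only because they are consistent with P(H) = 0.26 and P(H | I) = 0.357 quoted later in this section.

```python
def conditional(p_joint, p_given):
    """P(event | given) = P(event and given) / P(given)."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_h = 0.26         # P(headache), quoted later in this section
p_i = 0.14         # ASSUMED P(insomnia); not stated in this text
p_h_and_i = 0.05   # ASSUMED P(headache and insomnia); not stated in this text

p_h_given_i = conditional(p_h_and_i, p_i)   # part (a)
p_i_given_h = conditional(p_h_and_i, p_h)   # part (b)

print("P(H | I) =", round(p_h_given_i, 3))
print("P(I | H) =", round(p_i_given_h, 3))
```

The two conditional probabilities share the same numerator but differ in the denominator, which is why P(H | I) and P(I | H) are generally not equal.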
Comment:
Now that we have introduced conditional probability, try the interactive demonstration below which uses a Venn diagram to illustrate the basic probabilities we have been discussing.
Now you can investigate the conditional probabilities as well.
As we saw in the Exploratory Data Analysis section, whenever a situation involves more than one variable, it is generally of interest to determine whether or not the variables are related.
In probability, we talk about independent events, and earlier we said that two events A and B are independent if event A occurring does not affect the probability that event B will occur.
Now that we’ve introduced conditional probability, we can formalize the definition of independence of events and develop four simple ways to check whether two events are independent or not.
We will introduce these “independence checks” using examples, and then summarize.
Consider again the two-way table for all 500 students in a particular high school, classified according to gender and whether or not they have one or both ears pierced.
Would you expect those two variables to be related?
To answer this, we may compare the overall probability of having pierced ears to the conditional probability of having pierced ears, given that a student is male.
Our intuition would tell us that the latter should be lower.
Indeed, for students in general, the probability of having pierced ears (event E) is P(E) = 324/500 = 0.648.
But the probability of having pierced ears given that a student is male is only P(E | M) = 36/180 = 0.20.
As we anticipated, P(E | M) is lower than P(E).
The probability of a student having pierced ears changes (in this case, gets lower) when we know that the student is male, and therefore the events E and M are dependent.
Remember, if E and M were independent, knowing or not knowing that the student is male would not have made a difference … but it did.
The previous example illustrates that one method for determining whether two events are independent is to compare P(B | A) and P(B).
Similarly, using the same reasoning, we can compare P(A | B) and P(A).
Recall the side effects activity (from the bottom of the page Basic Probability Rules).
On the “Information for the Patient” label of a certain antidepressant, it is claimed that based on some clinical trials,
Are the two side effects independent of each other?
To check whether the two side effects are independent, let’s compare P(H | I) and P(H).
In the previous part of this section, we found that P(H | I) = 0.357, while P(H) = 0.26.
Knowing that a patient experienced insomnia increases the likelihood that he/she will also experience headache from 0.26 to 0.357.
The conclusion therefore is that the two side effects are not independent; they are dependent.
Alternatively, we could have compared P(I | H) to P(I).
Again, since the two are not equal, we can conclude that the two side effects I and H are dependent.
Comment:
An alternative method of checking for dependence would be to compare P(E | M) with P(E | not M) [same as P(E | F)].
In our case, P(E | M) = 36/180 = 0.2, while P(E | not M) = 288/320 = 0.9, and since the two are very different, we can say that the events E and M are not independent.
In general, another method for checking the independence of events A and B is to compare P(B | A) and P(B | not A).
In other words, two events are independent if the probability of one event does not change whether we know that the other event has occurred or we know that the other event has not occurred.
It can be shown that P(B | A) and P(B | not A) would differ whenever P(B) and P(B | A) differ, so this is another perfectly legitimate way to establish dependence or independence.
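The comparison of P(E | M), P(E | not M), and P(E) can be sketched directly from the counts. The cell counts 144 and 32 are derived from the totals given in the text (180 - 36 and 320 - 288); the dictionary layout and function name are illustrative.

```python
from fractions import Fraction

# Counts from the piercing example; 144 and 32 are derived from the row totals.
counts = {
    ("male", "pierced"): 36,
    ("male", "not pierced"): 144,
    ("female", "pierced"): 288,
    ("female", "not pierced"): 32,
}

def prob_pierced(gender=None):
    """P(E) when gender is None, otherwise P(E | gender)."""
    rows = {k: v for k, v in counts.items() if gender is None or k[0] == gender}
    pierced = sum(v for (_, ears), v in rows.items() if ears == "pierced")
    return Fraction(pierced, sum(rows.values()))

p_e = prob_pierced()
p_e_given_m = prob_pierced("male")
p_e_given_not_m = prob_pierced("female")

print(float(p_e), float(p_e_given_m), float(p_e_given_not_m))
print("independent:", p_e_given_m == p_e_given_not_m == p_e)
```

P(E) lies between the two conditional probabilities, so when the conditional probabilities differ from each other, each also differs from P(E).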
Before we establish a general rule for independence, let’s consider an example that will illustrate another method that we can use to check whether two events are independent:
A group of 100 college students were surveyed about their gender and whether they had decided on a major.
Offhand, we wouldn’t necessarily have any compelling reason to expect that deciding on a major would depend on a student’s gender.
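We can check for independence by comparing the overall probability of being decided to the probability of being decided given that a student is female: P(D) = 45/100 = 0.45 and P(D | F) = 27/60 = 0.45, where D is the event of having decided on a major and F is the event of being female.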
The fact that the two are equal tells us that, as we might expect, deciding on a major is independent of gender.
Now let’s approach the issue of independence in a different way: first, we may note that the overall probability of being decided is 45/100 = 0.45.
And the overall probability of being female is 60/100 = 0.60.
If being decided is independent of gender, then 45% of the 60% of the class who are female should have a decided major;
in other words, the probability of being female and decided should equal the probability of being female multiplied by the probability of being decided.
If the events F and D are independent, we should have P(F and D) = P(F) * P(D).
In fact, P(F and D) = 27/100 = 0.27, and P(F) * P(D) = 0.60 * 0.45 = 0.27.
This confirms our alternate verification of independence.
In general, another method for checking the independence of events A and B is to compare P(A and B) to P(A) * P(B).
Let’s summarize all the possible methods we’ve seen for checking the independence of events in one rule:
Tests for Independent Events: Two events A and B are independent if any one of the following holds:
(1) P(B | A) = P(B)
(2) P(A | B) = P(A)
(3) P(B | A) = P(B | not A)
(4) P(A and B) = P(A) * P(B)
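The four checks described in this section (comparing P(B | A) with P(B), P(A | B) with P(A), P(B | A) with P(B | not A), and P(A and B) with P(A) * P(B)) can be run side by side on counts. This is a sketch under stated assumptions: the function `independence_checks` is illustrative, it expects counts with 0 < n_a < n_total and n_b > 0, and it uses exact fractions so that equality comparisons are not affected by rounding.

```python
from fractions import Fraction

def independence_checks(n_total, n_a, n_b, n_a_and_b):
    """Run the four independence checks on counts; returns check -> True/False."""
    p_a = Fraction(n_a, n_total)
    p_b = Fraction(n_b, n_total)
    p_a_and_b = Fraction(n_a_and_b, n_total)
    p_b_given_a = Fraction(n_a_and_b, n_a)
    p_a_given_b = Fraction(n_a_and_b, n_b)
    p_b_given_not_a = Fraction(n_b - n_a_and_b, n_total - n_a)
    return {
        "P(B | A) = P(B)": p_b_given_a == p_b,
        "P(A | B) = P(A)": p_a_given_b == p_a,
        "P(B | A) = P(B | not A)": p_b_given_a == p_b_given_not_a,
        "P(A and B) = P(A) * P(B)": p_a_and_b == p_a * p_b,
    }

# Majors example: 100 students, 60 female (A), 45 decided (B), 27 both.
majors = independence_checks(100, 60, 45, 27)

# Piercing example: 500 students, 180 male (A), 324 pierced (B), 36 both.
piercing = independence_checks(500, 180, 324, 36)

for name, result in (("majors", majors), ("piercing", piercing)):
    print(name, result)
```

In each example the four checks agree with one another: all hold for the majors data and none hold for the piercing data.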
Comment:
The purpose of the next activity is to practice checking the independence of two events using the four methods that we’ve provided, and to see that all four lead to the same conclusion.
Now that we have an understanding of conditional probabilities and can express them with concise notation, and have a more formal understanding of what it means for two events to be independent, we can finally establish the General Multiplication Rule, a formal rule for finding P(A and B) that applies to any two events, whether they are independent or dependent.
We begin with an example that contrasts P(A and B) for independent and dependent cases.
Suppose you pick two cards at random from four cards consisting of one of each suit: club, diamond, heart, and spade, where the first card is replaced before the second card is picked.
What is the probability of picking a club and then a diamond?
Because the sampling is done with replacement, whether or not a diamond is picked on the second selection is independent of whether or not a club has been picked on the first selection.
Rule 6, the multiplication rule for independent events, tells us that: P(C1 and D2) = P(C1) * P(D2) = 1/4 * 1/4 = 1/16.
Here we denote the event “club picked on first selection” as C1 and the event “diamond picked on second selection” as D2.
The display below shows that 1/4 of the time we’ll pick a club first, and of these times, 1/4 will result in a diamond on the second pick: 1/4 * 1/4 = 1/16 of the selections will have a club first and then a diamond.
Suppose you pick two cards at random from four cards consisting of one of each suit: club, diamond, heart, and spade, without replacing the first card before the second card is picked.
What is the probability of picking a club and then a diamond?
The probability in this case is not 1/4 * 1/4 = 1/16.
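Once a club has been removed, only three cards remain, one of which is the diamond, so the probability of a club and then a diamond is 1/4 * 1/3 = 1/12.
Using the notation of conditional probabilities, we can write P(C1 and D2) = P(C1) * P(D2 | C1) = 1/4 * 1/3 = 1/12.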
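Both card examples can be verified by listing every equally likely pair of draws. The sketch below is illustrative (the function name is not from the original text); it uses ordered pairs with repetition for sampling with replacement and ordered pairs without repetition for sampling without replacement.

```python
from fractions import Fraction
from itertools import permutations, product

suits = ["club", "diamond", "heart", "spade"]

def prob_club_then_diamond(replace):
    """P(club first, then diamond), by enumerating all ordered pairs of draws."""
    if replace:
        draws = list(product(suits, repeat=2))   # 16 ordered pairs
    else:
        draws = list(permutations(suits, 2))     # 12 ordered pairs
    hits = sum(1 for first, second in draws
               if first == "club" and second == "diamond")
    return Fraction(hits, len(draws))

print("with replacement:", prob_club_then_diamond(True))
print("without replacement:", prob_club_then_diamond(False))
```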
For independent events A and B, we had the rule P(A and B) = P(A) * P(B).
Due to independence, to find the probability of A and B, we could multiply the probability of A by the simple probability of B, because the occurrence of A would have no effect on the probability of B occurring.
Now, for events A and B that may be dependent, to find the probability of A and B, we multiply the probability of A by the conditional probability of B, taking into account that A has occurred.
Thus, our general multiplication rule is stated as follows:
General Multiplication Rule – Probability Rule Eight: For any two events A and B, P(A and B) = P(A) * P(B | A).
Comments:
Let’s look at another, more realistic example:
In a certain region, one in every thousand people (0.001) is infected with HIV, the virus that causes AIDS.
Let H denote the event of having HIV, and T the event of testing positive.
(a) Express the information that is given in the problem in terms of the events H and T.
(b) Use the General Multiplication Rule to find the probability that someone chosen at random from the population has HIV and tests positive.
(c) If someone has HIV, what is the probability of testing negative? Here we need to find P(not T | H).
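Parts (b) and (c) can be sketched numerically. Note the assumption: the test's accuracy is not stated in this text, so the value P(T | H) = 0.95 below is a hypothetical figure used only for illustration; P(H) = 0.001 is the value given in the problem.

```python
from fractions import Fraction

p_h = Fraction(1, 1000)          # P(H): given in the problem
p_t_given_h = Fraction(95, 100)  # ASSUMED P(T | H); hypothetical, not from the text

# (b) General Multiplication Rule: P(H and T) = P(H) * P(T | H)
p_h_and_t = p_h * p_t_given_h

# (c) Complement Rule applied within the condition H:
#     P(not T | H) = 1 - P(T | H)
p_not_t_given_h = 1 - p_t_given_h

print("P(H and T) =", float(p_h_and_t))
print("P(not T | H) =", float(p_not_t_given_h))
```

The Complement Rule applies in part (c) because, once H is given, "T" and "not T" are the only two possibilities.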
The purpose of the next activity is to give you guided practice in expressing information in terms of conditional probabilities, and in using the General Multiplication Rule.
This section introduced you to the fundamental concepts of independent events and conditional probability — the probability of an event given that another event has occurred.
We saw that sometimes the knowledge that another event has occurred has no impact on the probability (when the two events are independent), and sometimes it does (when the two events are not independent).
We further discussed the idea of independence and discussed different ways to check whether two events are independent or not.
Understanding the concept of conditional probability also allowed us to introduce our final probability rule, the General Multiplication Rule.
The General Multiplication Rule tells us how to find P(A and B) when A and B are not necessarily independent.