Proportions (Introduction & Step 1)
- Review: Types of Variables
- One Sample Z-Test for a Population Proportion
- Step 1. Stating the Hypotheses
Now that we understand the process of hypothesis testing and the logic behind it, we are ready to start learning about specific statistical tests (also known as significance tests).
The first test we are going to learn is the test about the population proportion (p).
We will see later where the “z-test” part of the name comes from.
This will be the only type of problem you will complete entirely by hand in this course. Our goal is to use this example to give you the tools you need to understand how the process works. After working a few problems, you should review the earlier material again. You will likely need to review the terminology and concepts a few times before you fully understand the process.
In practice, you will often conduct more complex statistical tests and let software provide the p-value. In these settings it is important to know which test to apply in a given situation and to be able to explain the results in context.
When we conduct a test about a population proportion, we are working with a categorical variable. Later in the course, after we have learned a variety of hypothesis tests, we will need to be able to identify which test is appropriate for which situation. Identifying the variable as categorical or quantitative is an important component of choosing an appropriate hypothesis test.
In this part of our discussion on hypothesis testing, we will cover details that we skipped earlier. More specifically, we will use this test to introduce the idea of a test statistic and to explain how p-values are calculated.
Let’s start by introducing the three examples, which will be the leading examples in our discussion. Each example is followed by a figure illustrating the information provided, as well as the question of interest.
Recall that there are four steps in the process of hypothesis testing:
- STEP 1: State the appropriate null and alternative hypotheses, Ho and Ha.
- STEP 2: Obtain a random sample, collect relevant data, and check whether the data meet the conditions under which the test can be used. If the conditions are met, summarize the data using a test statistic.
- STEP 3: Find the p-value of the test.
- STEP 4: Based on the p-value, decide whether or not the results are statistically significant and draw your conclusions in context.
- Note: In practice, we should always consider the practical significance of the results as well as the statistical significance.
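As a preview of how the four steps fit together, here is a minimal Python sketch of a one-sample z-test for a proportion. It assumes the standard z statistic, (p_hat - p0) / sqrt(p0(1 - p0)/n), which has not yet been introduced at this point, and it skips the condition check in Step 2. The function names, the `alternative` labels, the counts (64 defective out of 400), and the 0.05 cutoff are illustrative assumptions; only the null value 0.20 comes from example 1.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def one_sample_proportion_ztest(successes, n, p0, alternative):
    """Sketch of a one-sample z-test for a population proportion.

    alternative is one of "less", "greater", "two-sided" (hypothetical labels).
    Returns the sample proportion, the z statistic, and the p-value.
    """
    # STEP 1: the hypotheses are encoded by p0 (null value) and alternative.
    # STEP 2: summarize the data with a test statistic
    # (checking the test's conditions is omitted in this sketch).
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error

    # STEP 3: find the p-value, using the tail(s) named by the alternative.
    if alternative == "less":
        p_value = normal_cdf(z)
    elif alternative == "greater":
        p_value = 1.0 - normal_cdf(z)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return p_hat, z, p_value


# Hypothetical data: 64 defective products in a sample of 400, testing p0 = 0.20.
p_hat, z, p_value = one_sample_proportion_ztest(64, 400, 0.20, "less")

# STEP 4: compare the p-value with a chosen cutoff (0.05 is an assumption here).
significant = p_value < 0.05
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}, significant: {significant}")
```

With these made-up counts, z is about -2.0 and the p-value is about 0.023. The conclusion in context and the practical significance of the result still have to be judged by the analyst.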
We are now going to go through these steps as they apply to hypothesis testing for the population proportion p. Note that although the details are specific to this particular test, some of the ideas we add apply to hypothesis testing in general.
Here again are the three sets of hypotheses being tested in our three examples:
The null hypothesis always takes the form:
- Ho: p = some value
and the alternative hypothesis takes one of the following three forms:
- Ha: p < that value (as in example 1) or
- Ha: p > that value (as in example 2) or
- Ha: p ≠ that value (as in example 3).
Note that it was quite clear from the context which form of the alternative hypothesis would be appropriate. The value that is specified in the null hypothesis is called the null value, and is generally denoted by p0. We can say, therefore, that in general the null hypothesis about the population proportion (p) would take the form:
- Ho: p = p0
We write Ho: p = p0 to say that we are making the hypothesis that the population proportion has the value of p0. In other words, p is the unknown population proportion and p0 is the number we think p might be for the given situation.
The alternative hypothesis takes one of the following three forms (depending on the context):
- Ha: p < p0 (one-sided)
- Ha: p > p0 (one-sided)
- Ha: p ≠ p0 (two-sided)
The first two possible forms of the alternatives (where the = sign in Ho is challenged by < or >) are called one-sided alternatives, and the third form of alternative (where the = sign in Ho is challenged by ≠) is called a two-sided alternative. To understand the intuition behind these names let’s go back to our examples.
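One way to keep this notation straight is to record a test's hypotheses as a null value plus the direction of the alternative. The short Python sketch below does that and prints the pair in the notation used above. The function name `state_hypotheses` and the labels "less", "greater", and "two-sided" are hypothetical choices made for illustration.

```python
# Map each (hypothetical) label to the symbol that challenges the = sign in Ho.
SYMBOLS = {"less": "<", "greater": ">", "two-sided": "≠"}


def state_hypotheses(p0, alternative):
    """Return the null and alternative hypotheses as strings."""
    if not 0 < p0 < 1:
        raise ValueError("the null value p0 must be a proportion between 0 and 1")
    if alternative not in SYMBOLS:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    sidedness = "two-sided" if alternative == "two-sided" else "one-sided"
    null = f"Ho: p = {p0}"
    alt = f"Ha: p {SYMBOLS[alternative]} {p0} ({sidedness})"
    return null, alt


# Null values taken from the three examples in the text.
for p0, alternative in [(0.20, "less"), (0.157, "greater"), (0.64, "two-sided")]:
    null, alt = state_hypotheses(p0, alternative)
    print(null, "|", alt)
```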
Example 3 (death penalty) is a case where we have a two-sided alternative:
- Ho: p = 0.64 (No change from 2003).
- Ha: p ≠ 0.64 (Some change since 2003).
In this case, in order to reject Ho and accept Ha we will need a sample proportion of death penalty supporters that is very different from 0.64 in either direction: either much larger or much smaller.
In example 2 (marijuana use) we have a one-sided alternative:
- Ho: p = 0.157 (same as among all college students in the country).
- Ha: p > 0.157 (higher than the national figure).
Here, in order to reject Ho and accept Ha we will need a sample proportion of marijuana users that is much higher than 0.157.
Similarly, in example 1 (defective products), where we are testing:
- Ho: p = 0.20 (No change; the repair did not help).
- Ha: p < 0.20 (The repair was effective at reducing the proportion of defective parts).
in order to reject Ho and accept Ha, we will need a sample proportion of defective products that is much smaller than 0.20.
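The intuition behind "one-sided" and "two-sided" can be sketched in a few lines of Python: a sample proportion counts as evidence for Ha only if it falls on the side of the null value that Ha points to. The function name and the sample proportions below are hypothetical. The sketch checks direction only; whether the difference is large enough to reject Ho is decided by the test statistic and the p-value.

```python
def points_toward_alternative(p_hat, p0, alternative):
    """Return True if the sample proportion lies on the side of p0 named by Ha."""
    if alternative == "less":
        return p_hat < p0
    if alternative == "greater":
        return p_hat > p0
    if alternative == "two-sided":
        # Either direction counts for a two-sided alternative.
        return p_hat != p0
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# Example 1 (Ha: p < 0.20): only sample proportions below 0.20 favor Ha.
print(points_toward_alternative(0.16, 0.20, "less"))       # True
print(points_toward_alternative(0.25, 0.20, "less"))       # False

# Example 3 (Ha: p ≠ 0.64): departures in either direction favor Ha.
print(points_toward_alternative(0.60, 0.64, "two-sided"))  # True
print(points_toward_alternative(0.70, 0.64, "two-sided"))  # True
```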