Means (All Steps)
NOTE: Beginning on this page, the Learn By Doing and Did I Get This activities are presented as interactive PDF files. The interactivity may not work on mobile devices or with certain PDF viewers. Use an official Adobe product such as Adobe Reader.
- Tests About μ (mu) When σ (sigma) is Unknown – The t-test for a Population Mean
- Step 1: State the hypotheses
- Step 2: Obtain data, check conditions, and summarize data
- Step 3: Find the p-value of the test by using the test statistic as follows
- Step 4: Conclusion
- The t-Distribution
So far we have talked about the logic behind hypothesis testing and then illustrated how this process proceeds in practice, using the z-test for the population proportion (p).
We are now moving on to discuss testing for the population mean (μ, mu), which is the parameter of interest when the variable of interest is quantitative.
A few comments about the structure of this section:
- The basic groundwork for carrying out hypothesis tests has already been laid in our general discussion and in our presentation of tests about proportions.
Therefore, we can easily modify the four steps to carry out tests about means instead, without going into all of the details again.
We will use this approach for all future tests, so be sure to go back to the general discussion and the discussion of tests for proportions to review the concepts in more detail.
- In our discussion about confidence intervals for the population mean, we made the distinction between whether the population standard deviation, σ (sigma) was known or if we needed to estimate this value using the sample standard deviation, s.
In this section, we will discuss only the second case, since in most realistic settings we do not know the population standard deviation.
In this case we need to use the t-distribution instead of the standard normal distribution for the probability aspects of confidence intervals (choosing table values) and hypothesis tests (finding p-values).
- Although we will discuss some theoretical or conceptual details for some of the analyses we will learn, from this point on we will rely on software to conduct tests and calculate confidence intervals for us, while we focus on understanding which methods are used for which situations and what the results say in context.
If you are interested in more information about the z-test, where we assume the population standard deviation σ (sigma) is known, you can review the Carnegie Mellon Open Learning Statistics Course (you will need to click “ENTER COURSE”).
We will now go through the four steps specifically for the t-test for the population mean and apply them to our two examples.
Only in a few cases is it reasonable to assume that the population standard deviation, σ (sigma), is known, so we will not cover hypothesis tests for that case. We discussed both cases for confidence intervals so that we could still calculate some confidence intervals by hand.
For this and all future tests we will rely on software to obtain our summary statistics, test statistics, and p-values for us.
The case where σ (sigma) is unknown is much more common in practice. What can we use to replace σ (sigma)? If you don’t know the population standard deviation, the best you can do is find the sample standard deviation, s, and use it instead of σ (sigma). (Note that this is exactly what we did when we discussed confidence intervals).
Is that it? Can we just use s instead of σ (sigma), and the rest is the same as the previous case? Unfortunately, it’s not that simple, but not very complicated either.
Here, when we use the sample standard deviation, s, as our estimate of σ (sigma) we can no longer use a normal distribution to find the cutoff for confidence intervals or the p-values for hypothesis tests.
We discussed this issue for confidence intervals. We will talk more about the t-distribution after we discuss the details of this test, for those who are interested in learning more.
It isn't really necessary for us to understand this distribution in depth, but it is important that we use the correct distribution in practice via our software.
We will wait until UNIT 4B to look at how to accomplish this test in the software. For now focus on understanding the process and drawing the correct conclusions from the p-values given.
Now let’s go through the four steps in conducting the t-test for the population mean.
The null and alternative hypotheses for the t-test for the population mean (μ, mu) have exactly the same structure as the hypotheses for the z-test for the population proportion (p):
Now try it yourself. Here are a few exercises on stating the hypotheses for tests for a population mean.
Here are a few more activities for practice.
When setting up hypotheses, be sure to use only the information in the research question. We cannot use our sample data to help us set up our hypotheses.
For this test, it is still important to correctly choose the alternative hypothesis as "less than", "greater than", or "different", although in practice two-sided tests are generally used.
Obtain data from a sample:
- In this step we would obtain data from a sample. This is not something we do much of in courses but it is done very often in practice!
Check the conditions:
- Then we check the conditions under which this test (the t-test for one population mean) can be safely carried out – namely, that the sample is random (or can be treated as such), and that either the population is normal or the sample size is large.
In practice, for small samples, it can be very difficult to determine if the population is normal. Here is a simulation to give you a better understanding of the difficulties.
Now try it yourself with a few activities.
- It is always a good idea to look at the data and get a sense of their pattern regardless of whether you actually need to do it in order to assess whether the conditions are met.
- This idea of looking at the data is relevant to all tests in general. In the next module—inference for relationships—conducting exploratory data analysis before inference will be an integral part of the process.
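The condition-checking step above can be sketched in code. This is a rough illustration only, assuming the standard one-sample t-test conditions (a random sample, and either a roughly normal population or a large sample, with n ≥ 30 as the usual rule of thumb). The skew check via a mean/median comparison is a hypothetical heuristic for illustration, not a substitute for actually plotting the data.

```python
import statistics

def check_t_test_conditions(data, large_n=30):
    """Rough programmatic check of the usual one-sample t-test conditions:
    the sample should be random (a design issue code cannot verify), and
    the population should be normal OR the sample should be large
    (n >= 30 is a common rule of thumb)."""
    n = len(data)
    if n >= large_n:
        return "large sample: the t-test is reasonable even if the population is not normal"
    # For small samples, look for strong skew or outliers; here we use a
    # crude symmetry heuristic comparing the mean and the median.
    mean, median = statistics.mean(data), statistics.median(data)
    s = statistics.stdev(data)
    if s > 0 and abs(mean - median) > 0.3 * s:
        return "small sample with possible skew: examine a histogram before trusting the t-test"
    return "small sample: no obvious skew, but examine a histogram to assess normality"
```

A symmetric small sample passes the heuristic, while a small sample with one extreme value is flagged for closer inspection.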
Here are a few more problems for extra practice.
Calculate Test Statistic
Assuming that the conditions are met, we calculate the sample mean x-bar and the sample standard deviation, s (which estimates σ (sigma)), and summarize the data with a test statistic.
Recall that such a standardized test statistic represents how many standard deviations above or below μ0 (mu_zero) our sample mean x-bar is.
Therefore our test statistic is a measure of how different our data are from what is claimed in the null hypothesis. This is an idea that we mentioned in the previous test as well.
Again we will rely on the p-value to determine how unusual our data would be if the null hypothesis is true.
As we mentioned, the test statistic in the t-test for a population mean does not follow a standard normal distribution. Rather, it follows another bell-shaped distribution called the t-distribution.
We will present the details of this distribution at the end for those interested but for now we will work on the process of the test.
Here are a few important facts.
- In statistical language we say that the null distribution of our test statistic is the t-distribution with (n-1) degrees of freedom. In other words, when Ho is true (i.e., when μ = μ0 (mu = mu_zero)), our test statistic has a t-distribution with (n-1) d.f., and this is the distribution under which we find p-values.
- For a large sample size (n), the null distribution of the test statistic is approximately Z, so whether we use t(n – 1) or Z to calculate the p-values does not make a big difference. However, software will use the t-distribution regardless of the sample size and so will we.
Although we will not calculate p-values by hand for this test, we can still easily calculate the test statistic.
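For instance, the test statistic needs nothing more than the sample mean, the sample standard deviation, and the sample size. Here is a minimal sketch in Python; the data below are hypothetical, not from the course's examples.

```python
import math
import statistics

def one_sample_t_statistic(data, mu_0):
    """t = (x-bar - mu_0) / (s / sqrt(n)): the number of estimated
    standard errors the sample mean lies above or below the null value."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu_0) / (s / math.sqrt(n))

sample = [10, 12, 9, 11, 13, 10, 12, 11]      # hypothetical data, n = 8
t = one_sample_t_statistic(sample, mu_0=10)   # x-bar = 11.0, so t > 0
print(round(t, 2))                            # about 2.16, with n - 1 = 7 d.f.
```

A positive t says the sample mean sits above the null value μ0; the size of t says by how many estimated standard errors.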
Try it yourself:
From this point in this course and certainly in practice we will allow the software to calculate our test statistics and we will use the p-values provided to draw our conclusions.
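In practice, software wraps all of this into one call. As a sketch, assuming SciPy is available (the data and null value are hypothetical), `scipy.stats.ttest_1samp` returns both the test statistic and the p-value, and its `alternative` argument selects among the three alternative hypotheses:

```python
from scipy import stats

sample = [10, 12, 9, 11, 13, 10, 12, 11]   # hypothetical data, n = 8

# Two-sided test of Ho: mu = 10 versus Ha: mu != 10 (the default alternative)
result = stats.ttest_1samp(sample, popmean=10, alternative='two-sided')
print(result.statistic, result.pvalue)

# One-sided versions use alternative='greater' or alternative='less'
one_sided = stats.ttest_1samp(sample, popmean=10, alternative='greater')
```

Because the statistic here is positive, the `'greater'` one-sided p-value is exactly half the two-sided one.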
We will use software to obtain the p-value for this (and all future) tests but here are the images illustrating how the p-value is calculated in each of the three cases corresponding to the three choices for our alternative hypothesis.
Note that, due to the symmetry of the t-distribution, for a given value of the test statistic t, the p-value for the two-sided test is twice as large as the p-value of either of the one-sided tests; this mirrors how p-values behave under the Z distribution.
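This symmetry can be checked numerically. The sketch below integrates the t density (a standard formula) with a plain midpoint-rule sum, using only the Python standard library; the particular statistic and degrees of freedom are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, upper=60.0, steps=100_000):
    """Approximate P(T >= t) with a midpoint-rule sum."""
    width = (upper - t) / steps
    return sum(t_pdf(t + (i + 0.5) * width, df) for i in range(steps)) * width

df, t = 7, 2.16                            # hypothetical: a statistic from n = 8
p_greater = upper_tail(t, df)              # Ha: mu > mu_0
p_less = 1 - p_greater                     # Ha: mu < mu_0, by symmetry
p_two_sided = 2 * upper_tail(abs(t), df)   # Ha: mu != mu_0
```

For a positive t, the two-sided p-value is exactly twice the "greater than" one-sided p-value, and the two one-sided p-values add to 1.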
We will show some examples of p-values obtained from software in our examples. For now let’s continue our summary of the steps.
As usual, based on the p-value (and some significance level of choice) we assess the statistical significance of results, and draw our conclusions in context.
To review what we have said before:
We are now ready to look at two examples.
In most situations in practice we use TWO-SIDED HYPOTHESIS TESTS, followed by confidence intervals to gain more insight.
For completeness in covering one-sample t-tests for a population mean, we still cover all three possible alternative hypotheses here. However, this will be the last test for which we do so.
Now try a few yourself.
From this point in this course and certainly in practice we will allow the software to calculate our test statistic and p-value and we will use the p-values provided to draw our conclusions.
That concludes our discussion of hypothesis tests in Unit 4A.
In the next unit we will continue to use both confidence intervals and hypothesis tests to investigate the relationship between two variables in the cases we covered in Unit 1 on exploratory data analysis – we will look at Case CQ, Case CC, and Case QQ.
Before moving on, we will discuss the details about the t-distribution as a general object.
We have seen that variables can be visually modeled by many different sorts of shapes, and we call these shapes distributions. Several distributions arise so frequently that they have been given special names, and they have been studied mathematically.
So far in the course, the only one we’ve named, for continuous quantitative variables, is the normal distribution, but there are others. One of them is called the t-distribution.
The t-distribution is another bell-shaped (unimodal and symmetric) distribution, like the normal distribution; and the center of the t-distribution is standardized at zero, like the center of the standard normal distribution.
Like all distributions that are used as probability models, the normal and the t-distribution are both scaled, so the total area under each of them is 1.
The following picture illustrates the fundamental difference between the normal distribution and the t-distribution:
You can see in the picture that the t-distribution has slightly less area near the expected central value than the normal distribution does, and you can see that the t distribution has correspondingly more area in the “tails” than the normal distribution does. (It’s often said that the t-distribution has “fatter tails” or “heavier tails” than the normal distribution.)
This reflects the fact that the t-distribution has a larger spread than the normal distribution. The same total area of 1 is spread out over a slightly wider range on the t-distribution, making it a bit lower near the center compared to the normal distribution, and giving the t-distribution slightly more probability in the ‘tails’ compared to the normal distribution.
Therefore, the t-distribution ends up being the appropriate model in certain cases where there is more variability than would be predicted by the normal distribution. One of these cases is stock values, which have more variability (or “volatility,” to use the economic term) than would be predicted by the normal distribution.
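The "heavier tails" claim can be made concrete with a small numerical comparison. The sketch below compares the area beyond 2 under a t-distribution with 5 d.f. against the same area under the standard normal, using only the standard library; the cutoff 2 and the 5 d.f. are arbitrary choices for illustration.

```python
import math

def t_upper_tail(t, df, upper=60.0, steps=100_000):
    """Approximate P(T >= t) for the t-distribution by a midpoint-rule sum."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    c = math.exp(log_c)
    width = (upper - t) / steps
    return sum(c * (1 + (t + (i + 0.5) * width) ** 2 / df) ** (-(df + 1) / 2)
               for i in range(steps)) * width

def z_upper_tail(z):
    """P(Z >= z) for the standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

tail_t = t_upper_tail(2, df=5)   # roughly 0.05
tail_z = z_upper_tail(2)         # roughly 0.023
# The t tail area is about twice the normal one: heavier tails.
```

So an observation 2 units out is noticeably less surprising under a low-d.f. t-distribution than under the normal, which is exactly why using Z in place of t with a small sample understates p-values.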
There’s actually an entire family of t-distributions. They all have similar formulas (but the math is beyond the scope of this introductory course in statistics), and they all have slightly “fatter tails” than the normal distribution. But some are closer to normal than others.
The t-distributions that have higher “degrees of freedom” are closer to normal (degrees of freedom is a mathematical concept that we won’t study in this course, beyond merely mentioning it here). So, there’s a t-distribution “with one degree of freedom,” another t-distribution “with 2 degrees of freedom” which is slightly closer to normal, another t-distribution “with 3 degrees of freedom” which is a bit closer to normal than the previous ones, and so on.
The following picture illustrates this idea with just a couple of t-distributions (note that “degrees of freedom” is abbreviated “d.f.” on the picture):
The test statistic for our t-test for one population mean is a t-score which follows a t-distribution with (n – 1) degrees of freedom. Recall that each t-distribution is indexed according to “degrees of freedom.” Notice that, in the context of a test for a mean, the degrees of freedom depend on the sample size in the study.
Remember that we said that higher degrees of freedom indicate that the t-distribution is closer to normal. So in the context of a test for the mean, the larger the sample size, the higher the degrees of freedom, and the closer the t-distribution is to a normal z distribution.
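This convergence is easy to see numerically. The sketch below computes the height of the t density at its center for increasing degrees of freedom and compares it with the standard normal height 1/√(2π) ≈ 0.3989; it uses only the standard library, and the chosen d.f. values are arbitrary illustrations.

```python
import math

def t_density_at_zero(df):
    """Height of the t density at 0, via log-gamma to avoid overflow for large df."""
    return math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                    - 0.5 * math.log(df * math.pi))

normal_at_zero = 1 / math.sqrt(2 * math.pi)   # about 0.3989

for df in (1, 5, 30, 1000):
    print(df, round(t_density_at_zero(df), 4))
# Each t density is a bit lower at its center than the normal,
# and the gap shrinks as the degrees of freedom grow.
```

With 1000 degrees of freedom the center heights agree to about three decimal places, which is why t and Z give nearly identical p-values for large samples.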
As a result, in the context of a test for a mean, the effect of the t-distribution is most important for a study with a relatively small sample size.
We are now done introducing the t-distribution. What are the implications of all of this?
- The null distribution of our t-test statistic is the t-distribution with (n-1) d.f. In other words, when Ho is true (i.e., when μ = μ0 (mu = mu_zero)), our test statistic has a t-distribution with (n-1) d.f., and this is the distribution under which we find p-values.
- For a large sample size (n), the null distribution of the test statistic is approximately Z, so whether we use t(n – 1) or Z to calculate the p-values does not make a big difference.