Population Means (Part 2)

CO-4: Distinguish among different measurement scales, choose the appropriate descriptive and inferential statistical methods based on these distinctions, and interpret the results.
LO 4.30: Interpret confidence intervals for population parameters in context.
LO 4.31: Find confidence intervals for the population mean using the normal distribution (Z) based confidence interval formula (when required conditions are met) and perform sample size calculations.
CO-6: Apply basic concepts of probability, random variation, and commonly used statistical probability distributions.
LO 6.24: Explain the connection between the sampling distribution of a statistic, and its properties as a point estimator.
LO 6.25: Explain what a confidence interval represents and determine how changes in sample size and confidence level affect the precision of the confidence interval.

Other Levels of Confidence

95% is the most commonly used level of confidence. However, we may wish to increase our level of confidence and produce an interval that’s almost certain to contain μ (mu). Specifically, we may want to report an interval for which we are 99% confident that it contains the unknown population mean, rather than only 95%.

Using the same reasoning as in the last comment, in order to create a 99% confidence interval for μ (mu), we should ask: There is a probability of 0.99 that any normal random variable takes values within how many standard deviations of its mean? The precise answer is 2.576, and therefore, a 99% confidence interval for μ (mu) is:

x-bar ± 2.576 × σ/√n

Another commonly used level of confidence is a 90% level of confidence. Since there is a probability of 0.90 that any normal random variable takes values within 1.645 standard deviations of its mean, the 90% confidence interval for μ (mu) is:

x-bar ± 1.645 × σ/√n
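The multipliers quoted above can be checked numerically. The following is a minimal Python sketch, assuming the standard library's `statistics.NormalDist`; the helper name `z_star` is our own choice for illustration and is not part of the course materials. For a confidence level C, the multiplier is the point that leaves (1 − C)/2 of the probability in each tail of the standard normal distribution.

```python
from statistics import NormalDist

def z_star(confidence):
    """Return z* such that a normal random variable falls within
    z* standard deviations of its mean with the given probability."""
    tail = (1 - confidence) / 2            # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)  # standard normal quantile

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

Running this prints 1.645, 1.960, and 2.576 for the 90%, 95%, and 99% levels, matching the values used in the formulas above.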

EXAMPLE:

Let’s go back to our first example, the IQ example:

The IQ level of students at a particular university has an unknown mean (μ, mu) and known standard deviation σ (sigma) = 15. A simple random sample of 100 students is found to have a sample mean IQ of 115 (x-bar). Estimate μ (mu) with a 90%, a 95%, and a 99% confidence interval.

A 90% confidence interval for μ (mu) is:

115 ± 1.645 × 15/√100 ≈ 115 ± 2.5 = (112.5, 117.5)

A 95% confidence interval for μ (mu) is:

115 ± 1.96 × 15/√100 ≈ 115 ± 2.9 = (112.1, 117.9)

A 99% confidence interval for μ (mu) is:

115 ± 2.576 × 15/√100 ≈ 115 ± 3.9 = (111.1, 118.9), or roughly (111, 119)
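The three intervals for the IQ example can also be reproduced with a few lines of code. This is a sketch only, assuming Python's standard library; the function name `z_confidence_interval` and its parameter names are hypothetical, chosen for this illustration. It applies the form x-bar ± z* × σ/√n with σ known.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """Z-based confidence interval for a population mean (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # multiplier z*
    margin = z * sigma / sqrt(n)                        # margin of error
    return x_bar - margin, x_bar + margin

# IQ example: x-bar = 115, sigma = 15, n = 100
for level in (0.90, 0.95, 0.99):
    low, high = z_confidence_interval(115, 15, 100, level)
    print(f"{level:.0%}: ({low:.1f}, {high:.1f})")
```

To one decimal place this gives (112.5, 117.5), (112.1, 117.9), and (111.1, 118.9), so the interval widens as the level of confidence increases.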


Note from the previous example that the more confidence I require, the wider the confidence interval for μ (mu). The 99% confidence interval is wider than the 95% confidence interval, which is wider than the 90% confidence interval.

A number line illustrating confidence intervals for μ. x-bar is marked at 115. The interval 112.5 and 117.5 is the 90% confidence interval. Enclosing this interval is the interval 112 and 118, which is the 95% confidence interval. Even larger is the 99% confidence interval, ranging from 111 to 119.

This is not very surprising, given that in the 99% interval we multiply the standard deviation of the statistic by 2.576, in the 95% interval by 1.96 (roughly 2), and in the 90% interval only by 1.645. Beyond this numerical explanation, there is a very clear intuitive explanation and an important implication of this result.

Let’s start with the intuitive explanation. The more certain I want to be that the interval contains the value of μ (mu), the more plausible values the interval needs to include in order to account for that extra certainty. I am 95% certain that the value of μ (mu) is one of the values in the interval (112.1, 117.9). In order to be 99% certain that one of the values in the interval is the value of μ (mu), I need to include more values, and thus provide a wider confidence interval.

In our example, the wider 99% confidence interval (111, 119) gives us a less precise estimation about the value of μ (mu) than the narrower 90% confidence interval (112.5, 117.5), because the smaller interval ‘narrows-in’ on the plausible values of μ (mu).

The important practical implication here is that researchers must decide whether they prefer to state their results with a higher level of confidence or produce a more precise interval. In other words,

There is a trade-off between the level of confidence and the precision with which the parameter is estimated.

The price we have to pay for a higher level of confidence is that the unknown population mean will be estimated with less precision (i.e., with a wider confidence interval). If we would like to estimate μ (mu) with more precision (i.e., a narrower confidence interval), we will need to accept an interval with a lower level of confidence.

So far we’ve developed the confidence interval for the population mean “from scratch” based on results from probability, and discussed the trade-off between the level of confidence and the precision of the interval. The price you pay for a higher level of confidence is a lower level of precision of the interval (i.e., a wider interval).

Is there a way to bypass this trade-off? In other words, is there a way to increase the precision of the interval (i.e., make it narrower) without compromising on the level of confidence? We will answer this question shortly, but first we’ll need to get a deeper understanding of the different components of the confidence interval and its structure.

Understanding the General Structure of Confidence Intervals

We explored the confidence interval for μ (mu) for different levels of confidence, and found that in general, it has the following form:

x-bar ± z* × σ/√n

where z* is a general notation for the multiplier that depends on the level of confidence. As we discussed before:

  • For a 90% level of confidence, z* = 1.645
  • For a 95% level of confidence, z* = 1.96
  • For a 99% level of confidence, z* = 2.576

To start our discussion about the structure of the confidence interval, let’s denote

m = z* × σ/√n

The confidence interval, then, has the form:

x-bar ± m

To summarize, we have

x-bar ± z* × σ/√n, where z* × σ/√n is m.

X-bar is the sample mean, the point estimator for the unknown population mean (μ, mu).

m is called the margin of error, since it represents the maximum estimation error for a given level of confidence.

For example, for a 95% confidence interval, we are 95% confident that our estimate will not depart from the true population mean by more than m, the margin of error. The margin of error m is itself the product of two components: the confidence multiplier z* and the standard deviation of the point estimator, σ/√n.

Here is a summary of the different components of the confidence interval and its structure:

x-bar is the point estimator. The margin of error (m) is added to it and subtracted from it. The margin of error is the product of the confidence multiplier, z-star, and the standard deviation of the point estimator, which is σ/√n.

This structure: estimate ± margin of error, where the margin of error is further composed of the product of a confidence multiplier and the standard deviation of the statistic (or, as we’ll see, the standard error) is the general structure of all confidence intervals that we will encounter in this course.

Obviously, even though each confidence interval has the same components, the formula for these components is different from confidence interval to confidence interval, depending on what unknown parameter the confidence interval aims to estimate.

Since the structure of the confidence interval is such that it has a margin of error on either side of the estimate, it is centered at the estimate (in our current case, x-bar), and its width (or length) is exactly twice the margin of error:

A number line, on which the estimate has been placed. To the left and to the right are two intervals of size m. So, the confidence interval, which comprises both margins of error (the left one and the right one), is of width 2m.

The margin of error, m, is therefore “in charge” of the width (or precision) of the confidence interval, and the estimate is in charge of its location (and has no effect on the width).
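The separation of roles described above (the estimate sets the location, the margin of error sets the width) can be illustrated with a short sketch. It assumes Python and uses a hypothetical helper, `interval_from_parts`, whose name and arguments are ours; the numbers reuse σ = 15 and n = 100 from the IQ example, and the estimate of 110 is made up purely for comparison.

```python
from math import sqrt

def interval_from_parts(estimate, multiplier, sd_of_estimator):
    """Build 'estimate ± margin of error' from its components."""
    margin = multiplier * sd_of_estimator   # m = z* × (sd of the statistic)
    return estimate - margin, estimate + margin, margin

sd_x_bar = 15 / sqrt(100)                   # sigma / sqrt(n) = 1.5

# Two different estimates with the same multiplier and the same sd:
for estimate in (110, 115):
    low, high, m = interval_from_parts(estimate, 1.96, sd_x_bar)
    print(f"estimate {estimate}: ({low:.2f}, {high:.2f}), "
          f"m = {m:.2f}, width = {high - low:.2f}")
```

Both intervals have the same width, 2m, and differ only in where they are centered.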


Let us now go back to the confidence interval for the mean, and more specifically, to the question that we posed earlier:

Is there a way to increase the precision of the confidence interval (i.e., make it narrower) without compromising on the level of confidence?

Since the width of the confidence interval is a function of its margin of error, let’s look closely at the margin of error of the confidence interval for the mean and see how it can be reduced:

m = z* × σ/√n

Since z* controls the level of confidence, we can rephrase our question above in the following way:

Is there a way to reduce this margin of error other than by reducing z*?

If you look closely at the margin of error, you’ll see that the answer is yes. We can do that by increasing the sample size n (since it appears in the denominator).

Many Students Wonder: Confidence Intervals (Population Mean)

Question: Isn’t it true that another way to reduce the margin of error (for a fixed z*) is to reduce σ (sigma)?

Answer: While it is true that strictly mathematically speaking the smaller the value of σ (sigma), the smaller the margin of error, practically speaking we have absolutely no control over the value of σ (sigma) (i.e., we cannot make it larger or smaller). σ (sigma) is the population standard deviation; it is a fixed value (which here we assume is known) that has an effect on the width of the confidence interval (since it appears in the margin of error), but is definitely not a value we can change.

Let’s look at an example first and then explain why increasing the sample size is a way to increase the precision of the confidence interval without compromising on the level of confidence.

EXAMPLE:

Recall the IQ example:

The IQ level of students at a particular university has an unknown mean (μ, mu) and a known standard deviation of σ (sigma) = 15. A simple random sample of 100 students is found to have a sample mean IQ of 115 (x-bar).

For simplicity, in this question, we will round z* = 1.96 to 2. You should use z* = 1.96 in all problems unless you are specifically instructed to do otherwise.

A 95% confidence interval for μ (mu) in this case is:

115 ± 2 × 15/√100 = 115 ± 3 = (112, 118)

Note that the margin of error is m = 3, and therefore the width of the confidence interval is 6.

Now, what if we change the problem slightly by increasing the sample size, and assume that it was 400 instead of 100?

A large circle represents the Population of all Students at SU. We are interested in the variable IQ, and the unknown parameter is μ, the population mean IQ level. In addition, we know that σ = 15. From this population we create a sample of size n=400, represented by a smaller circle. In this sample, we find that x bar = 115.

In this case, a 95% confidence interval for μ (mu) is:

115 ± 2 × 15/√400 = 115 ± 1.5 = (113.5, 116.5)

The margin of error here is only m = 1.5, and thus the width is only 3.

Note that for the same level of confidence (95%) we now have a narrower, and thus more precise, confidence interval.

Let’s try to understand why it is that a larger sample size will reduce the margin of error for a fixed level of confidence. There are three ways to explain this: mathematically, using probability theory, and intuitively.

We’ve already alluded to the mathematical explanation; the margin of error is

m = z* × σ/√n

and since n, the sample size, appears in the denominator, increasing n will reduce the margin of error.
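Because n enters the margin of error through √n, quadrupling the sample size halves the margin of error. The sketch below, assuming Python, tabulates this for the IQ example with σ = 15 and z* rounded to 2 as in the example above; the function name `margin_of_error` and the sample size of 1600 are our own additions for illustration.

```python
from math import sqrt

def margin_of_error(z_star, sigma, n):
    """m = z* × sigma / sqrt(n)"""
    return z_star * sigma / sqrt(n)

# z* rounded to 2, sigma = 15; each sample size is 4 times the previous one
for n in (100, 400, 1600):
    m = margin_of_error(2, 15, n)
    print(f"n = {n:4d}: m = {m:.2f}, width = {2 * m:.2f}")
```

The margins come out as 3, 1.5, and 0.75: each fourfold increase in n cuts the margin of error (and the width of the interval) in half, while the level of confidence stays the same.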

As we saw in our discussion about point estimates, probability theory tells us that:

Two sampling distribution curves for x-bar. One is squished down and wider, while the other is much taller and narrower. Both curves share the same μ. The tall, narrow distribution was based on a larger sample size, which has a smaller standard deviation, and so is less spread out. This means that values of x-bar are more likely to be closer to μ when the sample size is larger.

This explains why with a larger sample size the margin of error (which represents how far apart we believe x-bar might be from μ (mu) for a given level of confidence) is smaller.
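The probability argument can also be seen in a small simulation. This is a rough sketch under stated assumptions, not part of the course materials: it assumes Python's standard library, treats μ = 115 and σ = 15 as hypothetical "true" population values, assumes IQ is normally distributed, and draws many samples of each size to see how much x-bar varies from sample to sample.

```python
import random
from statistics import fmean, stdev

random.seed(0)          # fixed seed so the run is repeatable

MU, SIGMA = 115, 15     # hypothetical population mean and standard deviation

def simulated_sample_means(n, repetitions=1000):
    """Draw `repetitions` samples of size n and return their sample means."""
    return [fmean(random.gauss(MU, SIGMA) for _ in range(n))
            for _ in range(repetitions)]

for n in (100, 400):
    means = simulated_sample_means(n)
    print(f"n = {n}: sd of x-bar is about {stdev(means):.2f} "
          f"(theory: {SIGMA / n ** 0.5:.2f})")
```

The simulated standard deviation of x-bar should land close to σ/√n in each case (1.5 for n = 100 and 0.75 for n = 400), so the sample means cluster more tightly around μ when the sample is larger.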

On an intuitive level, if our estimate x-bar is based on a larger sample (i.e., a larger fraction of the population), we have more faith in it, or it is more reliable, and therefore we need to account for less error around it.

Comment:

  • While it is true that for a given level of confidence, increasing the sample size increases the precision of our interval estimation, in practice, increasing the sample size is not always possible.
    • Consider a study in which there is a non-negligible cost involved for collecting data from each participant (an expensive medical procedure, for example). If the study has some budgetary constraints, which is usually the case, increasing the sample size from 100 to 400 is just not possible in terms of cost-effectiveness.
    • Another instance in which increasing the sample size is impossible is when a larger sample is simply not available, even if we had the money to afford it. For example, consider a study on the effectiveness of a drug on curing a very rare disease among children. Since the disease is rare, there are a limited number of children who could be participants.
  • This is the reality of statistics. Sometimes theory collides with reality, and you simply do the best you can.