Wrap-Up (Case C→Q)

BE AWARE: These materials are not as complete as the rest of our materials. Do the best you can with the available content and let us know if you have questions about the processes involved here. The written content is thorough, and the supplemental learn-by-doing materials provided for Unit 4B are an excellent resource for additional examples.
As we mentioned at the end of the Introduction to Unit 4B, we will focus only on two-sided tests for the remainder of this course. One-sided tests are often possible but rarely used in clinical research.

We are now done with case C→Q.

  • We learned that this case is further classified into sub-cases, depending on the number of groups that we are comparing (i.e., the number of categories that the explanatory variable has), and the design of the study (independent vs. dependent samples).
  • For each of the three sub-cases that we covered, we learned the appropriate inferential method, and emphasized the idea behind the method, the conditions under which it can be safely used, how to carry it out using software, and the interpretation of the results.
  • We also learned which non-parametric tests are applicable and under what circumstances they might be used instead of the standard methods.

The following summary shows when each of the three standard tests covered in this module is used:

A Two-Sample t-test is used when:

  • The explanatory variable is categorical with two categories.
  • We are comparing two population means based on two independent samples.
  • Either the populations are normal or the sample sizes are large.

A Paired t-test (a special case of the one-sample t-test) is used when:

  • The explanatory variable is categorical with two categories.
  • We are comparing two population means and the samples are dependent on each other ("matched pairs").
  • The samples are dependent in the sense that every observation in one sample is linked to an observation in the other sample. Examples of dependent samples include the same subjects measured twice, and twins.

ANOVA is used when:

  • The explanatory variable is categorical with more than two categories.
  • We are comparing more than two population means based on independent samples.
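
The choice among the three tests can be written as a small decision rule. The sketch below is only an illustration: the function name `choose_cq_test` and its parameters are hypothetical and are not part of the course software.

```python
def choose_cq_test(n_groups: int, dependent: bool = False) -> str:
    """Return the standard C→Q test for a categorical explanatory variable.

    n_groups  -- number of categories of the explanatory variable
    dependent -- True when the samples are matched pairs
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare means")
    if n_groups == 2:
        return "paired t-test" if dependent else "two-sample t-test"
    if dependent:
        # More than two dependent samples is not one of the three sub-cases.
        raise ValueError("more than two dependent samples is not covered here")
    return "ANOVA"


print(choose_cq_test(2))                  # two independent samples
print(choose_cq_test(2, dependent=True))  # matched pairs
print(choose_cq_test(4))                  # more than two independent samples
```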


The following summary discusses each of the above named sub-cases of C→Q within the context of the hypothesis testing process.

Step 1: Stating the null and alternative hypotheses (H0 and Ha)

  • Although the one-sided alternatives are provided here where possible, remember that we will focus only on two-sided tests supplemented by confidence intervals for methods in Unit 4B.

For a Two-Sample t-test, the hypotheses are H_0: μ_1 - μ_2 = 0 (same as H_0: μ_1 = μ_2), and one of:

  • H_a: μ_1 - μ_2 < 0 (same as H_a: μ_1 < μ_2)
  • H_a: μ_1 - μ_2 > 0 (same as H_a: μ_1 > μ_2)
  • H_a: μ_1 - μ_2 ≠ 0 (same as H_a: μ_1 ≠ μ_2)

For a Paired t-test, the hypotheses are H_0: μ_d = 0, and one of:

  • H_a: μ_d < 0
  • H_a: μ_d > 0
  • H_a: μ_d ≠ 0

For ANOVA, the hypotheses are H_0: μ_1 = μ_2 = ... = μ_k and H_a: not all the μ's are equal.

Step 2: Check Conditions and Summarize the Data Using a Test Statistic

We need to check that the conditions under which the test can be reliably used are met.

For the Paired t-test (as a special case of a one-sample t-test), the conditions are:

  • The sample of differences is random (or at least can be considered so in context).
  • We are in one of the three situations marked OK below:
    • Variable varies normally, small sample size: OK (in this case, we should check normality visually using a histogram of the sample differences)
    • Variable varies normally, large sample size: OK
    • Variable doesn't vary normally, small sample size: NOT OK
    • Variable doesn't vary normally, large sample size: OK

For the Two-Sample t-test, the conditions are:

  • The two samples are independent and random.
  • One of the following two scenarios holds:
    • Both populations are normal.
    • The populations are not normal, but the sample sizes are large (>30).

For an ANOVA, the conditions are:

  • The samples drawn from each of the populations being compared are independent.
  • The response variable varies normally within each of the populations being compared. As is often the case, we do not have to worry about this assumption for large sample sizes.
  • The populations all have the same standard deviation.

Now we summarize the data using a test statistic. 

  • Although we will not be calculating these test statistics by hand, we will review the formulas for each test statistic here.

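For the Paired t-test the test statistic is:

t = (x̄_d - 0) / (s_d / √n)

where x̄_d is the mean of the sample differences, s_d is their standard deviation, and n is the number of pairs.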
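
As a minimal sketch of the paired statistic (the mean of the differences divided by its standard error), the following uses only the Python standard library. The function name and the small data set are made up for illustration; in practice the course software reports this value.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, df) for paired samples, using differences second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)   # sample mean of the differences
    sd_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 divisor)
    t = (mean_d - 0) / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements on the same five subjects, taken twice.
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 16]
t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f} on {df} degrees of freedom")
```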

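For the Two-Sample t-test assuming equal variances the test statistic is:

t = (x̄_1 - x̄_2 - 0) / (s_p √(1/n_1 + 1/n_2))

where s_p = √( ((n_1 - 1) s_1^2 + (n_2 - 1) s_2^2) / (n_1 + n_2 - 2) ) is the pooled standard deviation.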

For the Two-Sample t-test assuming unequal variances the test statistic is:

t = (x̄_1 - x̄_2 - 0) / √(s_1^2/n_1 + s_2^2/n_2)
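
Both versions of the two-sample statistic can be sketched with the standard library. The function names and data below are hypothetical. The degrees of freedom for the unequal-variances version use the Welch-Satterthwaite approximation, which is what statistical software typically reports but is not spelled out in this summary.

```python
import math
import statistics


def pooled_t_statistic(x, y):
    """Two-sample t statistic assuming equal population variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, n1 + n2 - 2


def unequal_var_t_statistic(x, y):
    """Two-sample t statistic without assuming equal variances."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    # Welch-Satterthwaite approximate degrees of freedom (usually not an integer).
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical independent samples.
group1 = [1, 2, 3, 4, 5]
group2 = [2, 4, 6, 8, 10]
print(pooled_t_statistic(group1, group2))
print(unequal_var_t_statistic(group1, group2))
```

With equal sample sizes the two statistics coincide and only the degrees of freedom differ, which is a useful sanity check on software output.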


For an ANOVA the test statistic is:

F = MSG / MSE = (variation among the sample means) / (variation within the groups)
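
The ANOVA F statistic compares the variation among the sample means with the variation within the groups. The sketch below computes it from raw data; the function name and the three small groups are illustrative assumptions, not course data.

```python
import statistics


def anova_f_statistic(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two groups")
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Sum of squares between groups: spread of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares within groups: spread of observations around their group mean.
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k


# Hypothetical data for three independent groups.
f_stat, df1, df2 = anova_f_statistic([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F = {f_stat:.2f} with ({df1}, {df2}) degrees of freedom")
```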


Step 3: Finding the p-value of the test

Use statistical software to determine the p-value.

  • The p-value is the probability of getting data like those observed (or even more extreme) assuming that the null hypothesis is true, and is calculated using the null distribution of the test statistic.
  • The p-value is a measure of the evidence against H0.
  • The smaller the p-value, the more evidence the data present against H0.
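
To make "calculated using the null distribution of the test statistic" concrete, here is a rough sketch of a two-sided p-value for a t statistic. It integrates the t density numerically with Simpson's rule so that it needs only the standard library. This is an illustration of the idea, not how statistical software computes the value, and the function names are hypothetical.

```python
import math


def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t, df, steps=2000):
    """Approximate P(|T| >= |t|) when T follows the t(df) null distribution."""
    t = abs(t)
    if t == 0:
        return 1.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_zero_to_t = total * h / 3
    # By symmetry, the area between -|t| and |t| is twice the area from 0 to |t|.
    return max(0.0, 1.0 - 2.0 * area_zero_to_t)


print(round(two_sided_p_value(2.262, 9), 3))
```

The value 2.262 is the tabled 97.5th percentile of the t distribution with 9 degrees of freedom, so the printed p-value should be close to 0.05.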

The p-values for all three C→Q tests are obtained from the software output.

Step 4: Making conclusions

Conclusions about the significance of the results:

  • If the p-value is small, the data present enough evidence to reject H0 (and accept Ha).
  • If the p-value is not small, the data do not provide enough evidence to reject H0.
  • To help guide our decision, we use the significance level as a cutoff for what is considered a small p-value. The significance cutoff is usually set at .05, but should not be considered inviolable.

Conclusions should always be stated in the context of the problem and can all be written in the basic form below:

  • There (IS or IS NOT) enough evidence that there is an association between (X) and (Y), where X and Y should be given in context.
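
The decision rule and the conclusion template can be combined in a few lines. This is only a sketch: the function name is hypothetical, the variable names passed in are placeholders for the context of a real problem, and treating "small" as strictly less than the significance level is one common convention.

```python
def state_conclusion(p_value, x, y, alpha=0.05):
    """Fill in the conclusion template using alpha as the cutoff for a small p-value."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must be between 0 and 1")
    verdict = "IS" if p_value < alpha else "IS NOT"
    return (f"There {verdict} enough evidence that there is an association "
            f"between {x} and {y}.")


# Placeholder variable names; replace with the variables of the actual study.
print(state_conclusion(0.003, "treatment group", "blood pressure"))
print(state_conclusion(0.240, "treatment group", "blood pressure"))
```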

Following the test…

  • For a paired t-test, a 95% confidence interval for μ_d can be very insightful after a test has rejected the null hypothesis, and can also be used for testing in the two-sided case.
  • For a two-sample t-test, a 95% confidence interval for μ_1 − μ_2 can be very insightful after a test has rejected the null hypothesis, and can also be used for testing in the two-sided case.
  • If the ANOVA F-test has rejected the null hypothesis, looking at the confidence intervals for the population means that are in the output can provide visual insight into why H0 was rejected (i.e., which of the means differ).
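
As an illustration of using a confidence interval for two-sided testing in the paired case, the sketch below builds the interval for the mean difference and checks whether it contains 0. The critical value t* is supplied by the user (here 2.776, the tabled value for 95% confidence with 4 degrees of freedom) because the standard library has no t quantile function. The function names and data are hypothetical.

```python
import math
import statistics


def mean_difference_ci(diffs, t_star):
    """Confidence interval for the mean difference: mean ± t* · s_d / √n."""
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    margin = t_star * statistics.stdev(diffs) / math.sqrt(n)
    return mean_d - margin, mean_d + margin


def excludes_zero(interval):
    """True when 0 is outside the interval, i.e. the two-sided test rejects H0."""
    low, high = interval
    return not (low <= 0 <= high)


# Hypothetical paired differences for five subjects (df = 4, so t* = 2.776).
diffs = [2, 1, 2, 1, 3]
ci = mean_difference_ci(diffs, t_star=2.776)
print(f"95% CI for the mean difference: ({ci[0]:.3f}, {ci[1]:.3f})")
print("Reject H0 at the 5% level:", excludes_zero(ci))
```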

Non-parametric Alternatives

  • For a Paired t-test we might investigate using the Wilcoxon Signed-Rank test or the Sign test.
  • For a Two-Sample t-test we might investigate using the Wilcoxon Rank-Sum test (Mann-Whitney U test).
  • For an ANOVA we might investigate using the Kruskal-Wallis test.
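
Of the non-parametric alternatives listed, the Sign test is simple enough to sketch by hand. Under the null hypothesis each nonzero difference is equally likely to be positive or negative, so the count of positive differences follows a Binomial(n, 0.5) distribution. The function name and data are illustrative assumptions, and dropping zero differences is the usual convention rather than something stated in this summary.

```python
from math import comb


def sign_test_p_value(diffs):
    """Exact two-sided Sign test p-value for paired differences."""
    positives = sum(d > 0 for d in diffs)
    negatives = sum(d < 0 for d in diffs)
    n = positives + negatives          # zero differences are dropped
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Hypothetical paired differences: all five are positive.
print(sign_test_p_value([2, 1, 2, 1, 3]))
```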