CO-4: Distinguish among different measurement scales, choose the appropriate descriptive and inferential statistical methods based on these distinctions, and interpret the results.
Recall the role-type classification table framing our discussion on inference about the relationship between two variables.
We start with case C→Q, where the explanatory variable is categorical and the response variable is quantitative.
Recall that in the Exploratory Data Analysis unit, examining the relationship between X and Y in this situation amounts, in practice, to:
- Comparing the distributions of the (quantitative) response Y for each value (category) of the explanatory X.
To do that, we used
- side-by-side boxplots (each representing the distribution of Y in one of the groups defined by X),
- and supplemented the display with the corresponding descriptive statistics.
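As a reminder of what "the corresponding descriptive statistics" look like in practice, here is a minimal sketch that summarizes a quantitative response within each category of the explanatory variable. The GPA values and the helper name `describe_groups` are made up for illustration only; they are not data from this course.

```python
from statistics import mean, median, stdev

# Hypothetical (made-up) GPA values, grouped by year in college.
gpa_by_year = {
    "freshman": [2.9, 3.1, 3.4, 2.7, 3.0],
    "sophomore": [3.0, 3.3, 2.8, 3.5, 3.2],
    "junior": [3.1, 3.6, 3.2, 2.9, 3.4],
    "senior": [3.3, 3.5, 3.0, 3.7, 3.4],
}


def describe_groups(groups):
    """Return n, mean, median, and sample standard deviation for each group."""
    summary = {}
    for label, values in groups.items():
        summary[label] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "sd": stdev(values),
        }
    return summary


summary = describe_groups(gpa_by_year)
for label, stats in summary.items():
    print(f"{label:>9}: n={stats['n']}, mean={stats['mean']:.2f}, "
          f"median={stats['median']:.2f}, sd={stats['sd']:.2f}")
```

Each row of the printout describes the distribution of Y within one category of X, which is the numerical counterpart of one box in a side-by-side boxplot.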
We now need to add one layer of complexity: we may have paired (matched) samples rather than independent samples or groups. Note that all of the examples we discussed in Case C→Q in Unit 1 involved independent samples.
First we will review the general scenario.
Comparing Means between Groups
To understand the logic, we’ll start with an example and then generalize.
EXAMPLE: GPA and Year in College
Suppose that our variable of interest is the GPA of college students in the United States. From Unit 4A, we know that since GPA is quantitative, we will conduct inference on μ, the (population) mean GPA among all U.S. college students.
Since this section is about relationships, let’s assume that what we are really interested in is not simply GPA, but the relationship between:
- X : year in college (1 = freshman, 2 = sophomore, 3 = junior, 4 = senior) and
- Y : GPA
In other words, we want to explore whether GPA is related to year in college.
The way to think about this is that the population of U.S. college students is now broken into 4 sub-populations: freshmen, sophomores, juniors and seniors. Within each of these four groups, we are interested in the GPA.
The inference must therefore involve the 4 sub-population means:
- μ1 : mean GPA among freshmen in the United States
- μ2 : mean GPA among sophomores in the United States
- μ3 : mean GPA among juniors in the United States
- μ4 : mean GPA among seniors in the United States
It makes sense that the inference about the relationship between year and GPA has to be based on some kind of comparison of these four means.
If we infer that these four means are not all equal (i.e., that there are some differences in GPA across years in college) then that’s equivalent to saying GPA is related to year in college. Let’s summarize this example with a figure:
In general, making inferences about the relationship between X and Y in Case C→Q boils down to comparing the means of Y in the sub-populations, which are created by the categories defined by X (say k categories). The following figure summarizes this:
We will split this into two different scenarios (k = 2 and k > 2), where k is the number of categories defined by X.
- If we are interested in whether GPA (Y) is related to gender (X), this is a scenario where k = 2 (since gender has only two categories: M, F), and the inference will boil down to comparing the mean GPA in the sub-population of males to that in the sub-population of females.
- On the other hand, in the example we looked at earlier, the relationship between GPA (Y) and year in college (X) is a scenario where k > 2 or more specifically, k = 4 (since year has four categories).
In terms of inference, these two situations (k = 2 and k > 2) will be treated differently!
Scenario with k = 2
Scenario with k > 2
Dependent vs. Independent Samples (k = 2)
LO 4.37: Identify and distinguish between independent and dependent samples.
Furthermore, within the scenario of comparing two means (i.e., examining the relationship between X and Y, when X has only two categories, k = 2) we will distinguish between two scenarios.
Here, the distinction is somewhat subtle, and has to do with how the samples from each of the two sub-populations we’re comparing are chosen. In other words, it depends upon what type of study design will be implemented.
We have learned that many experiments, as well as observational studies, make a comparison between two groups (sub-populations) defined by the categories of the explanatory variable (X), in order to see if the response (Y) differs.
In some situations, one group (sub-population 1) is defined by one category of X, and another independent group (sub-population 2) is defined by the other category of X. Independent samples are then taken from each group for comparison.
Suppose we are conducting a clinical trial. Participants are randomized into two independent subpopulations:
- those who are given a drug and
- those who are given a placebo.
Each individual appears in only one of these two groups and individuals are not matched or paired in any way. Thus the two samples or groups are independent. We can say those given the drug are independent from those given the placebo.
Recall: By randomly assigning individuals to the treatment we control for both known and unknown lurking variables.
Suppose the Highway Patrol wants to study the reaction times of drivers with a blood alcohol content of half the legal limit in their state.
An observational study was designed which would also serve as publicity on the topic of drinking and driving. At a large event where enough alcohol would be consumed to obtain plenty of potential study participants, officers set up an obstacle course and provided the vehicles. (Other considerations were also implemented to keep the car and track conditions consistent for each participant.)
Volunteers were recruited from those in attendance and given a breathalyzer test to determine their blood alcohol content. Two types of volunteers were chosen to participate:
- Those with a blood alcohol content of zero – as measured by the breathalyzer – of which 10 were chosen to drive the course.
- Those with a blood alcohol content within a small range of half the legal limit (in Florida this would be around 0.04%) – of which 9 were chosen.
Here also, we have two independent groups. Even though both groups were originally taken from the same sample of volunteers, each individual appears in only one of the two groups, so the comparison of the reaction times is a comparison between two independent groups.
However, in this study, there was NO random assignment to the treatment and so we would need to be much more concerned about the possibility of lurking variables in this study compared to one in which individuals were randomized into one of these two groups.
We will see it may be more appropriate in some studies to use the same individual as a subject in BOTH treatments – this will result in dependent samples.
When a matched pairs sample design is used, each observation in one sample is matched/paired/linked with an observation in the other sample. These are sometimes called “dependent samples.”
Matching could be by person (if the same person is measured twice), or could actually be a pair of individuals who belong together in a relevant way (husband and wife, siblings).
In this design, then, the same individual or a matched pair of individuals is used to make two measurements of the response – one for each of the two levels of the categorical explanatory variable.
Advantages of a paired sample approach include:
- Reduced variability in the comparison, since the variance within subjects is typically smaller than the variance between subjects.
- Fewer subjects are required to achieve the same power as with independent-sample methods.
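To see why pairing can help, here is a small simulation sketch. Everything in it is an assumption chosen for illustration: the treatment effect, the spread between subjects, the noise within a subject, and the use of normal distributions. It compares the standard error of the estimated mean difference under a paired design with that under an independent-samples design of the same size.

```python
import random
from statistics import stdev

random.seed(0)  # fixed seed so the sketch is reproducible

N_SUBJECTS = 200
TRUE_EFFECT = 0.5   # assumed treatment effect
BETWEEN_SD = 10.0   # assumed spread of baseline levels across subjects
WITHIN_SD = 1.0     # assumed noise between two measurements on one subject

# Paired design: each subject is measured under both conditions,
# so the subject's baseline cancels out of the difference.
baselines = [random.gauss(50, BETWEEN_SD) for _ in range(N_SUBJECTS)]
paired_a = [b + random.gauss(0, WITHIN_SD) for b in baselines]
paired_b = [b + TRUE_EFFECT + random.gauss(0, WITHIN_SD) for b in baselines]
paired_diffs = [y - x for x, y in zip(paired_a, paired_b)]

# Independent design: different subjects in each group,
# so between-subject variability stays in the comparison.
indep_a = [random.gauss(50, BETWEEN_SD) + random.gauss(0, WITHIN_SD)
           for _ in range(N_SUBJECTS)]
indep_b = [random.gauss(50, BETWEEN_SD) + TRUE_EFFECT + random.gauss(0, WITHIN_SD)
           for _ in range(N_SUBJECTS)]

# Standard error of the estimated mean difference under each design.
se_paired = stdev(paired_diffs) / N_SUBJECTS ** 0.5
se_indep = (stdev(indep_a) ** 2 / N_SUBJECTS
            + stdev(indep_b) ** 2 / N_SUBJECTS) ** 0.5

print(f"SE of mean difference, paired design:      {se_paired:.3f}")
print(f"SE of mean difference, independent design: {se_indep:.3f}")
```

Under these assumptions the paired standard error is much smaller, because the large subject-to-subject variation is removed when each subject serves as their own comparison. When subjects differ little from one another, the gain from pairing shrinks.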
Disadvantages of a paired sample approach include:
- An order effect based upon which treatment individuals received first.
- A carryover effect such as a drug remaining in the system.
- A testing effect, such as participants learning the obstacle course in the first run and improving their performance in the second.
Suppose we are conducting a study on a pain blocker which can be applied to the skin and are comparing two different dosage levels of the solution which in this study will be applied to the forearm.
For each participant both solutions are applied with the following protocol:
- Which dosage is applied to which arm is random.
- Patients and clinical staff are blind to the two treatment applications.
- Pain tolerance is measured on both arms using the same standard test with the order of testing randomized.
Here we have dependent samples since the same patient appears in both dosage groups.
Again, randomization is employed to help minimize other issues related to study design such as an order or testing effect.
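The randomization steps in the protocol above can be sketched as follows. This is only an illustration: the dosage labels, the number of participants, and the helper name `randomize_participant` are hypothetical, and blinding is a matter of study procedure that the code does not model.

```python
import random


def randomize_participant(rng):
    """Randomly assign the two dosages to arms and pick the testing order."""
    dosages = ["low dose", "high dose"]  # hypothetical labels
    rng.shuffle(dosages)
    arms = ["left", "right"]
    assignment = dict(zip(arms, dosages))  # which dosage goes on which arm

    test_order = arms[:]  # copy so the arm list itself is not reordered
    rng.shuffle(test_order)  # which arm is tested first
    return {"assignment": assignment, "test_order": test_order}


rng = random.Random(42)  # seeded only so the sketch is reproducible
plan = [randomize_participant(rng) for _ in range(6)]

for number, entry in enumerate(plan, start=1):
    print(number, entry["assignment"], "test order:", entry["test_order"])
```

Every participant still receives both dosages, so the samples remain dependent; only the arm and the order of testing vary at random.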
Suppose the department of motor vehicles wants to check whether drivers are impaired after drinking two beers.
The reaction times (measured in seconds) in an obstacle course are measured for 8 randomly selected drivers before and then after the consumption of two beers.
We have a matched-pairs design, since each individual was measured twice, once before and once after.
In matched pairs, the comparison between the reaction times is done for each individual.
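To make "done for each individual" concrete, here is a minimal sketch that reduces two paired measurements to one difference per driver. The reaction times are made-up values for 8 hypothetical drivers, not data from the study described above.

```python
from statistics import mean, stdev

# Hypothetical (made-up) reaction times in seconds; position i in both
# lists refers to the same driver.
before = [5.0, 4.2, 6.1, 3.8, 5.5, 4.9, 6.3, 4.4]
after = [5.6, 4.9, 6.0, 4.5, 6.2, 5.3, 6.9, 4.8]

# One difference per driver: after minus before.
diffs = [a - b for b, a in zip(before, after)]

n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)
n_slower = sum(d > 0 for d in diffs)

print(f"n = {n}, mean difference = {mean_diff:.2f} s, sd = {sd_diff:.2f} s")
print(f"{n_slower} of {n} drivers were slower after")
```

The two samples collapse into a single sample of n differences, which is why the sample sizes must match in a matched-pairs design.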
- Note that in the first figure, where the samples are independent, the sample sizes of the two independent samples need not be the same.
- On the other hand, it is obvious from the design that in the matched pairs the sample sizes of the two samples must be the same (and thus we used n for both).
- Dependent samples can occur in many other settings but for now we focus on the case of investigating the relationship between a two-level categorical explanatory variable and a quantitative response variable.
We will begin our discussion of Inference for Relationships with Case C-Q, where the explanatory variable (X) is categorical and the response variable (Y) is quantitative. We discussed that inference in this case amounts to comparing population means.
- We distinguish between scenarios where the explanatory variable (X) has only two categories and scenarios where the explanatory variable (X) has MORE than two categories.
- When comparing two means, we make the further distinction between situations where we have independent samples and those where we have matched pairs.
- For comparing more than two means in this course, we will focus only on the situation where we have independent samples. For studies with more than two groups of dependent samples, a common method is repeated measures, but we will not cover it here.
- We will first discuss comparing two population means starting with matched pairs (dependent samples) then independent samples and conclude with comparing more than two population means in the case of independent samples.
Now test your skills at identifying the three scenarios in Case C-Q.
Looking Ahead – Methods in Case C-Q
- Methods in BOLD will be our main focus in this unit.
Here is a summary of the tests we will learn for the scenario where k = 2.
Independent Samples (More Emphasis)
- Two Sample T-Test Assuming Equal Variances
- Two Sample T-Test Assuming Unequal Variances
- Mann-Whitney U (or Wilcoxon Rank-Sum) Test
Dependent Samples (Less Emphasis)
- Sign Test
- Wilcoxon Signed-Rank Test
Here is a summary of the tests we will learn for the scenario where k > 2.
Independent Samples (Only Emphasis)
- One-way ANOVA (Analysis of Variance)
- Kruskal–Wallis One-way ANOVA
Dependent Samples (Not Discussed)
- Repeated Measures ANOVA (or similar)
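The two questions that organize this summary (how many categories does X have, and are the samples independent or dependent?) can be sketched as a simple lookup. The function `candidate_methods` is a hypothetical helper, not part of any statistics library, and it only mirrors the method names listed in this summary.

```python
# Method names copied from the summary above; the lookup itself is a
# hypothetical illustration.
METHODS = {
    ("two groups", "independent"): [
        "Two Sample T-Test Assuming Equal Variances",
        "Two Sample T-Test Assuming Unequal Variances",
        "Mann-Whitney U (or Wilcoxon Rank-Sum) Test",
    ],
    ("two groups", "dependent"): [
        "Sign Test",
        "Wilcoxon Signed-Rank Test",
    ],
    ("more than two groups", "independent"): [
        "One-way ANOVA (Analysis of Variance)",
        "Kruskal-Wallis One-way ANOVA",
    ],
    ("more than two groups", "dependent"): [
        "Repeated Measures ANOVA (or similar)",
    ],
}


def candidate_methods(k, paired):
    """Return the listed methods for k categories of X and the sample design."""
    if k < 2:
        raise ValueError("X must have at least two categories to compare groups")
    size = "two groups" if k == 2 else "more than two groups"
    design = "dependent" if paired else "independent"
    return METHODS[(size, design)]


# GPA by year in college: k = 4 categories, independent samples.
print(candidate_methods(k=4, paired=False))
# Reaction time before and after two beers: k = 2, matched pairs.
print(candidate_methods(k=2, paired=True))
```

Which method within a scenario is appropriate depends on conditions discussed later in this unit; the lookup only narrows the choice to the right scenario.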