Types of Variables
Classifying Types of Variables
Variables can be broadly classified into one of two types: categorical or quantitative.
Below we define these two main types of variables and provide further sub-classifications for each type.
Categorical variables take category or label values, and place an individual into one of several groups.
Categorical variables are often further classified as either:
- Nominal, when there is no natural ordering among the categories.
Common examples would be gender, eye color, or ethnicity.
- Ordinal, when there is a natural order among the categories, such as ranking scales or letter grades.
However, ordinal variables are still categorical and do not provide precise measurements.
Differences between categories are not precisely meaningful. For example, if one student earns an A and another a B on an assignment, we cannot say precisely how far apart their scores are, only that an A is higher than a B.
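The difference between ordering and measuring can be sketched in a few lines of Python. This is only an illustration: it assumes a five-letter grade scale, and the names `GRADE_ORDER`, `grade_rank`, and `is_higher` are made up for this example.

```python
# Letter grades listed from lowest to highest (an assumed scale for illustration).
GRADE_ORDER = ["F", "D", "C", "B", "A"]


def grade_rank(grade):
    """Return the position of a grade in the ordering (0 = lowest)."""
    return GRADE_ORDER.index(grade)


def is_higher(first, second):
    """Return True if `first` is a higher grade than `second`."""
    return grade_rank(first) > grade_rank(second)


# Ordering is meaningful for an ordinal variable, so we can compare and sort.
print(is_higher("A", "B"))                      # True
print(sorted(["B", "A", "C"], key=grade_rank))  # ['C', 'B', 'A']

# The ranks are only positions in the ordering. A rank gap of 1 between
# A and B says nothing about how many points separate the two scores.
```

Comparing and sorting use only the order of the categories, which is all an ordinal variable provides.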
Quantitative variables take numerical values, and represent some kind of measurement.
Quantitative variables are often further classified as either:
- Discrete, when the variable takes on a countable number of values.
Most often these variables represent some kind of count, such as the number of prescriptions an individual takes daily.
- Continuous, when the variable can take on any value in some range of values.
Our precision in measuring these variables is often limited by our instruments.
Units should be provided.
Common examples would be height (inches), weight (pounds), or time to recovery (days).
One special variable type occurs when a variable has only two possible values.
A variable is said to be Binary or Dichotomous when there are only two possible levels.
These variables can usually be phrased in a “yes/no” question. Whether or not someone is a smoker is an example of a binary variable.
Currently we are primarily concerned with classifying variables as either categorical or quantitative.
Sometimes, however, we will need to consider further and sub-classify these variables as defined above.
These concepts will be discussed and reviewed as needed but here is a quick practice on sub-classifying categorical and quantitative variables.
EXAMPLE: Medical Records
Let’s revisit the dataset showing medical records for a sample of patients.
In our example of medical records, there are several variables of each type:
- Age, Weight, and Height are quantitative variables.
- Race, Gender, and Smoking are categorical variables.
- Notice that the values of the categorical variable Smoking have been coded as the numbers 0 or 1.
It is quite common to code the values of a categorical variable as numbers, but you should remember that these are just codes.
They have no arithmetic meaning (i.e., it does not make sense to add, subtract, multiply, divide, or compare the magnitude of such values).
Usually, if such a coding is used, all categorical variables will be coded, and we will tend to use this type of coding for datasets in this course.
- Sometimes, quantitative variables are divided into groups for analysis. In such a situation, although the original variable was quantitative, the variable analyzed is categorical.
A common example is to provide information about an individual’s Body Mass Index by stating whether the individual is underweight, normal, overweight, or obese.
This categorized BMI is an example of an ordinal categorical variable.
- Categorical variables are sometimes called qualitative variables, but in this course we’ll use the term “categorical.”
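Two of the points above, numeric codes for categories and grouping a quantitative variable, can be sketched in Python. The following is a minimal illustration under stated assumptions: the mapping of 0 and 1 to smoking status is hypothetical (the dataset's codebook would give the real meaning), the BMI cut points of 18.5, 25, and 30 are the commonly used adult thresholds and are not taken from this course, and all data values are made up.

```python
from collections import Counter

# Hypothetical meaning of the codes; the dataset only tells us the values are 0 or 1.
SMOKING_LABELS = {0: "non-smoker", 1: "smoker"}


def bmi_category(bmi):
    """Turn a quantitative BMI value into an ordinal category.

    Cut points (18.5, 25, 30) are commonly used adult thresholds,
    assumed here for illustration.
    """
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


# Made-up values for five patients.
smoking_codes = [0, 1, 0, 0, 1]
bmi_values = [17.9, 22.4, 27.0, 31.5, 24.9]

# Appropriate summary for a coded categorical variable: count each category.
smoking_counts = Counter(SMOKING_LABELS[code] for code in smoking_codes)
print(dict(smoking_counts))

# Averaging the codes would treat labels as measurements, so we do not do it here.

# The grouped BMI variable is categorical (ordinal), so we count it as well.
bmi_counts = Counter(bmi_category(value) for value in bmi_values)
print(dict(bmi_counts))
```

In both cases the summary is a table of counts, which is the natural description for a categorical variable regardless of how its values are stored.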
Why Does the Type of Variable Matter?
The types of variables you are analyzing directly relate to the available descriptive and inferential statistical methods.
It is important to:
- assess how you will measure the effect of interest and
- know how this determines the statistical methods you can use.
As we proceed in this course, we will continually emphasize the types of variables that are appropriate for each method we discuss.
To compare the number of polio cases in the two treatment arms of the Salk Polio vaccine trial, you could use
- Fisher’s Exact Test
- Chi-Square Test
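As a rough sketch of what a method for two categorical variables looks like, the Python below computes the Pearson chi-square statistic for a 2x2 table of counts, without a continuity correction. The counts are made up for illustration and are not the Salk trial results. The function names are hypothetical, and the p-value helper relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if group and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def chi_square_p_value_1df(stat):
    """Upper-tail probability for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Made-up counts: rows are treatment arms, columns are (cases, non-cases).
table = [[10, 90], [30, 70]]
stat = chi_square_2x2(table)
print(stat)  # 12.5
print(chi_square_p_value_1df(stat))
```

Both variables here (treatment arm and case status) are categorical, so the data reduce to a table of counts, which is the input this kind of test expects. With small expected counts, Fisher's Exact Test is often preferred over the chi-square approximation.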
To compare blood pressures in a clinical trial evaluating two blood pressure-lowering medications, you could use
- Two-sample t-Test
- Wilcoxon Rank-Sum Test
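For a quantitative response compared between two groups, the calculation works with means and variances rather than counts. The sketch below computes a two-sample t statistic in its unequal-variance (Welch) form, which is one common version of the two-sample t-test and an assumption on our part, since the text does not say which version is meant. The blood pressure values are made up, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t_statistic(group_a, group_b):
    """Two-sample t statistic using the unequal-variance (Welch) standard error."""
    # statistics.variance is the sample variance (divides by n - 1).
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Made-up systolic blood pressures (mmHg) for two medications.
medication_a = [120, 130, 140]
medication_b = [110, 120, 130]

t_stat = welch_t_statistic(medication_a, medication_b)
print(round(t_stat, 4))  # 1.2247
```

The Wilcoxon Rank-Sum Test, not shown here, replaces the measurements with their ranks, which makes it less sensitive to skewed data and outliers.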