In slide 7, there is an extra “the” in the third bullet: “If we standardize an entire **the** variable, the new variable will…”

This extremely short video contains an overview of the five-number summary.

The original slides are not available.

Transcript – Live Five-Number Summary

This document is linked from Measures of Position.

This document is linked from One Quantitative Variable.

**Related SAS Tutorials**

- 5A – (3:01) Numeric Measures using PROC MEANS

**Related SPSS Tutorials**

- 5A – (8:00) Numeric Measures using EXPLORE

Although not a required aspect of describing distributions of one quantitative variable, we are often interested in where a particular value falls in the distribution. Is the value unusually low or high or about what we would expect?

Answers to these questions rely on measures of position (or location). These measures give information about the distribution but also give information about how individual values relate to the overall distribution.

A common measure of position is the percentile. Although there are some mathematical considerations involved with calculating percentiles which we will not discuss, you should have a basic understanding of their interpretation.

In general, the *P*-th percentile can be interpreted as a location in the data such that approximately *P*% of the values in the distribution fall below it and approximately (100 – *P*)% fall above it.
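To make this interpretation concrete, here is a minimal Python sketch that goes in the reverse direction: given a value, it reports the percentage of observations falling below it. The function name `percent_below` and the exam scores are hypothetical, invented only for illustration. This is not the exact percentile calculation used by SAS or SPSS, which involves the mathematical considerations not discussed here.

```python
def percent_below(values, x):
    """Return the percentage of observations strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


# Hypothetical exam scores, invented for illustration only.
scores = [52, 58, 61, 65, 68, 70, 73, 77, 81, 94]

pct = percent_below(scores, 77)
print(f"{pct:.0f}% of the scores fall below 77")
```

In this made-up dataset, 7 of the 10 scores are below 77, so a score of 77 sits at roughly the 70th percentile.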

The quartiles Q1 and Q3 are special cases of percentiles and thus are measures of position.

The combination of the five numbers (min, Q1, M, Q3, max) is called the **five-number summary**, and provides a quick numerical description of both the center and spread of a distribution.

Each of the values represents a measure of position in the dataset.

The min and max provide the boundaries, while Q1, the median, and Q3 correspond to the 25th, 50th, and 75th percentiles, respectively.
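As a rough sketch of how the five-number summary might be computed, the Python example below uses the standard library's `statistics.quantiles` (available in Python 3.8 and later). The helper name `five_number_summary` and the data are hypothetical. The `method="inclusive"` setting is just one of several quartile conventions, so other software may report slightly different values for Q1 and Q3 on the same data.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # "inclusive" is one of several quartile conventions; other tools
    # may give slightly different Q1 and Q3 for the same data.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical measurements, invented for illustration only.
data = [2, 4, 5, 7, 8, 10, 12, 15, 21]

print(five_number_summary(data))
```

For this made-up dataset the summary is min = 2, Q1 = 5, M = 8, Q3 = 12, and max = 21.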

Standardized scores, also called z-scores, use the mean and standard deviation as the primary measures of center and spread. They are therefore most useful when the mean and standard deviation are appropriate, i.e., when the distribution is reasonably symmetric with no extreme outliers.

For any individual, the **z-score** tells us how many standard deviations the raw score for that individual deviates from the mean and in what direction. A positive z-score indicates the individual is above average and a negative z-score indicates the individual is below average.

To calculate a z-score, we take the individual value and subtract the mean and then divide this difference by the standard deviation.
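Written as a formula, this calculation is shown below. The symbols are a notational assumption made here for illustration: $x$ is the individual value, $\bar{x}$ is the mean, and $s$ is the standard deviation.

$$
z = \frac{x - \bar{x}}{s}
$$

For instance, with hypothetical values $x = 85$, $\bar{x} = 70$, and $s = 10$:

$$
z = \frac{85 - 70}{10} = 1.5
$$

That is, this value lies 1.5 standard deviations above the mean.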

Measures of position also allow us to compare values from different distributions. For example, we can present the percentiles or z-scores of an individual’s height and weight. These two measures together would provide a better picture of how the individual fits in the overall population than either would alone.
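The comparison described above can be sketched in Python as follows. The function name `z_score` and all of the population means, standard deviations, and individual measurements are hypothetical values chosen for illustration, not real reference data.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


# Hypothetical population summaries and one individual's measurements.
height_z = z_score(180, mean=170, sd=8)   # height in cm
weight_z = z_score(65, mean=72, sd=14)    # weight in kg

print(f"height z-score: {height_z:+.2f}")
print(f"weight z-score: {weight_z:+.2f}")
```

Under these assumed numbers, the individual is 1.25 standard deviations above the mean height but 0.5 standard deviations below the mean weight. Because both measurements are on the same standardized scale, they can be compared directly even though the original units differ.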

Although measures of position are not stressed in this course as much as measures of center and spread, we have seen and will see many of them used when examining the distribution of one variable. It is good to recognize them as measures of position when they appear.

**Related SAS Tutorials**

- 5A – (3:01) Numeric Measures using PROC MEANS
- 5B – (4:05) Creating Histograms and Boxplots using SGPLOT
- 5C – (5:41) Creating QQ-Plots and other plots using UNIVARIATE

**Related SPSS Tutorials**

- 5A – (8:00) Numeric Measures using EXPLORE
- 5B – (2:29) Creating Histograms and Boxplots
- 5C – (2:31) Creating QQ-Plots and PP-Plots

In the previous section, we explored the distribution of a categorical variable using graphs (pie chart, bar chart) supplemented by numerical measures (percent of observations in each category).

In this section, we will explore the data collected from a **quantitative** variable, and learn how to describe and summarize the important features of its distribution.

We will learn how to display the **distribution** using **graphs** and discuss a variety of **numerical measures**.

An introduction to each of these topics follows.

To display data from one quantitative variable graphically, we can use either a **histogram** or **boxplot**.

We will also present several “by-hand” displays such as the **stemplot** and **dotplot** (although we will not rely on these in this course).

The overall pattern of the **distribution** of a quantitative variable is described by its **shape**, **center**, and **spread**.

By inspecting the histogram or boxplot, we can describe the shape of the distribution, but we can only get a rough estimate for the center and spread.

A description of the distribution of a quantitative variable must include, in addition to the **graphical display**, a more precise **numerical description** of the center and spread of the distribution.

In this section we will learn:

- how to display the **distribution of one quantitative variable** using various graphs;
- how to quantify the **center** and **spread** of the **distribution of one quantitative variable** with various numerical measures;
- some of the **properties** of those **numerical measures**;
- how to choose the **appropriate numerical measures** of **center** and **spread** to supplement the graph(s); and
- how to identify potential outliers in the **distribution of one quantitative variable**

- We will also discuss a few **measures of position** (also called **measures of location**). These measures
  - allow us to quantify where a particular value is relative to the **distribution** of all values
  - do provide information about the distribution itself
  - also use the information **about the distribution** to learn more about an **INDIVIDUAL**

Before reading further, try this interactive applet, which gives a preview of some of the topics covered in this section on exploratory data analysis for one quantitative variable.