This document is linked from Linear Relationships – Correlation.

This document is linked from Scatterplots.

This document is linked from Case Q-Q.

From the online version of the Little Handbook of Statistical Practice, this reading contains a detailed discussion of correlation.

Optional: Create your own solutions using your software for extra practice.

- Observe how an outlier can affect the correlation coefficient by comparing its value for the data with and without the outlier.

Use the following output to answer the questions that follow.

The average gestation period, or time of pregnancy, of an animal is closely related to its longevity — the length of its lifespan. Data on the average gestation period and longevity (in captivity) of 40 different species of animals have been recorded.

Here is a summary of the variables in our dataset:

- **animal:** the name of the animal species.
- **gestation:** the average gestation period of the species, in days.
- **longevity:** the average longevity of the species, in years.

Remember that the correlation is only an appropriate measure of the **linear** relationship between two quantitative variables. First produce a scatterplot to verify that the relationship between gestation and longevity is approximately linear.
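To see why the scatterplot check matters, here is a minimal sketch in Python. The `pearson_r` helper and the small made-up data are illustrative assumptions, not part of the course materials or the animals dataset. It compares the correlation for an exactly linear pattern with the correlation for an exactly curved (quadratic) pattern.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical x values, symmetric around zero (not from the animals dataset).
xs = [-3, -2, -1, 0, 1, 2, 3]

linear = [2 * x + 1 for x in xs]  # points on an exact straight line
curved = [x ** 2 for x in xs]     # points on an exact parabola

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)

print(f"linear pattern: r = {r_linear:.3f}")
print(f"curved pattern: r = {r_curved:.3f}")
```

In this sketch the straight-line data give a correlation of 1, while the parabola gives a correlation of 0 even though y is completely determined by x. A correlation near zero therefore does not rule out a strong non-linear relationship, which is why the scatterplot comes first.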

Answer the following questions using the output obtained. In this exercise we will:

- use the scatterplot to examine the relationship between two quantitative variables.
- use the labeled scatterplot to better understand the form of a relationship.

(Optional) SPSS Steps:

- **Label Variables and Define Variable Properties**
- **Create Scatterplot:** GRAPHS > CHART BUILDER; create a simple scatterplot relating X = longevity to Y = gestation.
- **Calculate Correlation:** ANALYZE > CORRELATE > BIVARIATE; calculate the correlation between longevity and gestation.
- **Remove Outlier and Save New Data:** Select the row containing the outlier, right-click on the row number, and choose CUT.
- **Re-create Scatterplot:** GRAPHS > CHART BUILDER; create a simple scatterplot relating X = longevity to Y = gestation using the new dataset.
- **Re-calculate Correlation:** ANALYZE > CORRELATE > BIVARIATE; calculate the correlation on the new dataset.

(Optional) SAS Steps:

- **Label Variables:** Using a DATA step, create a new dataset (animals2) in which you label the variables longevity and gestation as Longevity (years) and Gestation (days) using a LABEL statement.
- **View Dataset Information in SAS:** Use PROC CONTENTS to view the information about the new dataset.
- **Create Basic Scatterplot:** Use PROC SGPLOT and the SCATTER statement to create a scatterplot of X = longevity by Y = gestation.
- **Calculate Correlation Coefficient:** Use PROC CORR to calculate the correlation coefficient between X = longevity and Y = gestation. In SAS 9.3 you will likely get the scatterplot matrix automatically; in SAS 9.2 you must request it by submitting ODS GRAPHICS ON before the procedure and ODS GRAPHICS OFF after the procedure (or whenever you wish to stop producing ODS graphics output).
- **Delete Outlier:** Using a DATA step, create a new dataset (animals3) and use an IF-THEN statement to delete the observation corresponding to the outlier. This outlier is an elephant with average longevity of 40 years and average gestation of 645 days.
- **View Dataset Information in SAS:** Use PROC CONTENTS to view the information about the new dataset from which you have removed the outlier.
- **Create Basic Scatterplot:** Use PROC SGPLOT and the SCATTER statement to create a scatterplot of X = longevity by Y = gestation on the dataset with the outlier removed.
- **Calculate Correlation Coefficient:** Use PROC CORR to calculate the correlation coefficient between X = longevity and Y = gestation on the dataset with the outlier removed.
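If you work in a general-purpose language rather than SPSS or SAS, the same workflow (compute the correlation, drop the outlier, compute it again) might look like the Python sketch below. Only the elephant row (longevity 40 years, gestation 645 days) comes from the exercise. Every other row is an invented placeholder, and names such as `species_a`, `animals`, and `animals_no_outlier` are hypothetical, so the printed values will not match the output for the real 40-species dataset.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# (animal, gestation in days, longevity in years)
# Placeholder rows are made up for illustration; only the elephant row
# is taken from the exercise text.
animals = [
    ("species_a", 30, 3),
    ("species_b", 60, 10),
    ("species_c", 110, 12),
    ("species_d", 150, 8),
    ("species_e", 240, 20),
    ("species_f", 280, 15),
    ("species_g", 340, 25),
    ("elephant", 645, 40),
]

def correlation(rows):
    """Correlation between X = longevity and Y = gestation."""
    longevity = [row[2] for row in rows]
    gestation = [row[1] for row in rows]
    return pearson_r(longevity, gestation)

# Equivalent of the "delete outlier" step: keep every row except the elephant.
animals_no_outlier = [row for row in animals if row[0] != "elephant"]

r_with = correlation(animals)
r_without = correlation(animals_no_outlier)

print(f"with outlier:    r = {r_with:.3f}")
print(f"without outlier: r = {r_without:.3f}")
```

In this made-up data the outlier lies roughly along the overall trend, so including it gives a larger correlation than excluding it. An outlier that falls away from the trend would pull the correlation in the other direction.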

- Fill the scatterplot with a hypothetical positive linear relationship between X and Y (by clicking on the graph about a dozen times starting at lower left and going up diagonally to the top right). Pay attention to the correlation coefficient calculated at the top right of the applet. (Clicking on the garbage can will let you start over.)

- Once you are satisfied with your hypothetical data, create an outlier by clicking on one of the data points in the upper right of the graph and dragging it down along the right side of the graph. Again, pay attention to what happens to the value of the correlation coefficient.
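If the applet is unavailable, the same experiment can be approximated in code. The Python sketch below is a rough stand-in, not the applet's actual implementation: it places a dozen hypothetical points along a rising diagonal, then moves the upper-right point down in steps and recomputes the correlation each time. The coordinates and the `pearson_r` helper are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# A dozen hypothetical "clicks" running from lower left to upper right.
xs = list(range(1, 13))
ys = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5, 9.5, 9.5, 11.5, 12.0]

# Drag the upper-right point (x = 12) straight down the right side.
r_values = []
for dragged_y in [12.0, 9.0, 6.0, 3.0, 0.0]:
    moved = ys[:-1] + [dragged_y]  # only the last point changes
    r = pearson_r(xs, moved)
    r_values.append(r)
    print(f"upper-right point at y = {dragged_y:4.1f}: r = {r:.3f}")
```

With these particular points the correlation starts close to 1 and falls steadily as the single point moves further from the line, which illustrates how sensitive the correlation coefficient is to one outlying observation.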

Here is another interactive demonstration from the Rossman/Chance collection, which has extensive options and illustrates many ideas about linear regression and correlation.

Also, remember the two-variable calculator we introduced earlier.

Choose one of the datasets in the list and click through the tabs at the top to see the data and results!
