This document is linked from Unit 4B: Inference for Relationships.

This document is linked from Linear Relationships – Correlation.

This document is linked from Linear Relationships – Regression.

This document is linked from Scatterplots.

This document is linked from Case Q-Q.

This document is linked from Role-Type Classification.

From the online version of the Little Handbook of Statistical Practice, this reading contains a detailed discussion of correlation.



This document is linked from Linear Relationships – Linear Regression.

Optional: Create your own solutions using your software for extra practice.

- Find a regression line and plot it on the scatterplot
- Examine the effect of outliers on the regression line
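For readers whose software is Python, finding a regression line might look like the minimal sketch below. The data values are hypothetical and invented purely for illustration, and the variable names are assumptions rather than anything from the course files. The sketch computes the least-squares slope and intercept from the correlation and standard deviations, then cross-checks the result against NumPy's `polyfit`.

```python
import numpy as np

# Hypothetical (x, y) pairs, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares line from summary statistics:
#   slope = r * (s_y / s_x),  intercept = y_bar - slope * x_bar
r = np.corrcoef(x, y)[0, 1]
slope = r * y.std(ddof=1) / x.std(ddof=1)
intercept = y.mean() - slope * x.mean()

# Cross-check with NumPy's built-in degree-1 polynomial fit.
slope_np, intercept_np = np.polyfit(x, y, deg=1)

# Fitted values along the line; these can be drawn over a scatterplot.
fitted = intercept + slope * x

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

The `fitted` array holds the points on the line at each observed x, so it can be passed to any plotting tool to overlay the line on the scatterplot.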

Use the following output to answer the questions that follow.

The modern Olympic Games have changed dramatically since their inception in 1896. For example, many commentators have remarked on the change in the quality of athletic performances from year to year. Regression will allow us to investigate the change in winning times for one event — the 1,500 meter race.

Here is a summary of the variables in our dataset:

- **Year:** the year of the Olympic Games, from 1896 to 2000.
- **Time:** the winning time for the 1,500 meter race, in seconds.

Answer the following questions using the output. In this exercise you will:

- use the regression line to make predictions
- evaluate how reliable these predictions are
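The two tasks above can be sketched in Python under some loud assumptions: the years and times below are made-up placeholders, not the actual Olympic results, and the helper names `predict` and `is_extrapolation` are hypothetical. The sketch predicts from a fitted line and flags any prediction that falls outside the range of observed years, where the linear pattern may no longer hold.

```python
import numpy as np

# Placeholder data, invented for illustration; NOT real winning times.
year = np.array([1960.0, 1964.0, 1968.0, 1972.0, 1976.0, 1980.0])
time = np.array([216.0, 218.5, 215.0, 216.5, 219.0, 218.0])

slope, intercept = np.polyfit(year, time, deg=1)

def predict(x):
    """Predicted time (seconds) from the fitted regression line."""
    return intercept + slope * x

def is_extrapolation(x):
    """True when x lies outside the observed range of years."""
    return bool(x < year.min() or x > year.max())

# r squared: the share of variation in time accounted for by the line.
r_squared = np.corrcoef(year, time)[0, 1] ** 2

for x in (1970, 2100):
    note = "extrapolation, treat with caution" if is_extrapolation(x) else "within data range"
    print(f"{x}: predicted {predict(x):.1f} s ({note})")
```

A prediction for a year inside the observed range is generally more trustworthy than one far beyond it, and a low `r_squared` is a further sign that even in-range predictions are loose.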

Use the linear regression on the full data to answer the following question.

Use the linear regression after removing the outlier to answer the next two questions.

- **Import Data:** FILE > OPEN > DATA, choose Excel file from the pull-down, find the file, continue
- **Edit Data:** DATA > DEFINE VARIABLE PROPERTIES
- **Scatterplot:** GRAPHS > CHART BUILDER, create a simple scatterplot relating X = Year to Y = Time, double-click on the created scatterplot to add a trend line
- **Regression Equation:** ANALYZE > REGRESSION > LINEAR
- **Remove Outlier and Save New Data:** select the row containing the outlier, right-click on the row number and choose CUT
- **Scatterplot:** GRAPHS > CHART BUILDER, create a simple scatterplot relating X = Year to Y = Time using the new dataset, double-click on the created scatterplot to add a trend line
- **Regression Equation:** ANALYZE > REGRESSION > LINEAR

- **View Dataset Information in SAS:** Use PROC CONTENTS to view the information about the dataset.
- **Create Regression Analysis with Fit Plot:** Use PROC REG to obtain the simple linear regression analysis for Y = time using X = year as the predictor. In SAS 9.3 (if you have ODS GRAPHICS enabled), you should obtain the fit plot by default in your HTML output. In SAS 9.2, you must use ODS GRAPHICS ON to obtain these results. Note: In SAS 9.2, I tend to use ODS GRAPHICS OFF immediately following the procedure. This is not necessary; however, you will continue to receive ODS GRAPHICS output until you turn it off with this command or exit SAS 9.2. In SAS 9.3, ODS GRAPHICS is enabled by default but can be enabled or disabled under TOOLS > OPTIONS > PREFERENCES in the RESULTS tab.
- **Delete Outlier:** Using a DATA step, create a new dataset (olympics2) and use an IF-THEN statement to delete the observation corresponding to the outlier. The outlier is the first observation, for year=1896.
- **Create Regression Analysis with Fit Plot:** Use PROC REG to obtain the simple linear regression analysis for Y = time using X = year as the predictor, using your dataset with the outlier removed. In SAS 9.3 (if you have ODS GRAPHICS enabled), you should obtain the fit plot by default in your HTML output. In SAS 9.2, you must use ODS GRAPHICS ON to obtain these results.
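For readers working outside these packages, the delete-the-outlier-and-refit step can be sketched in Python. This is only an illustration: the times below are placeholder values and not the real winning times, and the names `year2` and `time2` are assumptions chosen to echo the olympics2 dataset. The only detail taken from the exercise is that the outlier is the 1896 observation.

```python
import numpy as np

# Placeholder values, invented for illustration; NOT the real results.
# The first observation (1896) is deliberately far from the others.
year = np.array([1896.0, 1900.0, 1904.0, 1908.0, 1912.0, 1920.0])
time = np.array([280.0, 250.0, 249.0, 247.0, 246.0, 244.0])

# Regression on the full data.
slope_full, intercept_full = np.polyfit(year, time, deg=1)

# Drop the outlier, mirroring "if year = 1896 then delete".
keep = year != 1896
year2, time2 = year[keep], time[keep]

# Regression with the outlier removed.
slope_trim, intercept_trim = np.polyfit(year2, time2, deg=1)

print(f"full data:       slope = {slope_full:.3f} s/year")
print(f"outlier removed: slope = {slope_trim:.3f} s/year")
```

Comparing the two slopes shows how much a single unusual observation at the edge of the data can pull the regression line.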


To see the effect of outliers on a regression equation, use the applet introduced earlier. Draw points on the graph, add the regression line, and then add an outlier or move an observation to see how the regression line changes.

Here is another similar applet that can be used to illustrate outliers and guessing lines of best fit.

Here is an interactive demonstration from the Rossman/Chance collection, which has extensive options and illustrates many ideas about linear regression and correlation.

And, remember the two-variable calculator we introduced earlier.

