# PHC 6053

## Prerequisites

This course requires

• PHC 6052 as a pre-requisite
• SAS installed and working at the start of the semester

For students who have not taken PHC 6052, in order to enroll you will need to

• have taken a graduate-level course in basic statistics covering one- and two-variable methods
• demonstrate SAS competency at the PHC 6052 level by completing an assigned analysis in SAS prior to the start of the semester
• contact Dr. Amy Cantrell directly, preferably at least 2 months prior to the start of classes, to discuss the possibility of enrolling without the pre-requisite course.

Although most statistical analyses will be conducted using software, students should be comfortable

• working with equations
• performing basic mathematical calculations, including
  • order of operations
  • fractions
  • square roots
  • logarithms (base e)
  • exponentials (e^x).

Note: In statistics, the notation log refers to the natural logarithm, ln. This can be confusing because in algebra a log written with no base is usually assumed to be base 10.

• In this course, a log with no base is assumed to be the natural logarithm: log = ln = log_e.
• We may still occasionally use ln notation for the natural logarithm as well.
• We will never use any base other than e regardless of the notation used.
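
As a quick check of this convention, the short sketch below compares the natural and base-10 logarithms of the same number. It is written in Python only for illustration (the course itself uses SAS, where the `LOG` function likewise returns the natural logarithm and `LOG10` the base-10 logarithm); the value 20 is an arbitrary example, not a number from the course.

```python
import math

x = 20.0  # arbitrary illustrative value

natural_log = math.log(x)    # base e: this is what "log" means in this course
common_log = math.log10(x)   # base 10: the usual algebra convention, not used here

# The exponential function e^x undoes the natural logarithm.
round_trip = math.exp(natural_log)

print(f"ln({x})    = {natural_log:.4f}")   # 2.9957
print(f"log10({x}) = {common_log:.4f}")    # 1.3010
print(f"exp(ln({x})) = {round_trip:.4f}")  # 20.0000
```

The two logarithms differ by a constant factor, so mixing up the base changes every number that is later back-transformed with e^x.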

## Main Course Goal

This course introduces graduate students in fields other than statistics to a wide range of modern regression methods. Emphasis is on modeling driven by actual data from studies in a variety of areas, primarily from health, biology, and ecology.

The primary topics are multiple linear regression, logistic regression, and Poisson regression. A main goal is to learn which approach to choose among linear and nonlinear models and how to determine whether the fit is adequate.

By the end of the course, students will achieve competency in carrying out the analyses in SAS.
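
For orientation, the three primary model types named above are commonly written as follows. This is a generic sketch in standard textbook notation, not taken from the course materials: the symbols Y (outcome), x_1, ..., x_k (predictors), and β_0, ..., β_k (coefficients) are assumptions introduced here, and log denotes the natural logarithm, as it does throughout this course.

$$
\begin{aligned}
\text{Multiple linear regression:} \quad & E[Y] = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k \\
\text{Logistic regression:} \quad & \log\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k, \quad p = P(Y = 1) \\
\text{Poisson regression:} \quad & \log(\mu) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k, \quad \mu = E[Y]
\end{aligned}
$$

In all three, the right-hand side is the same linear combination of predictors; what changes is the type of outcome and the transformation applied to its mean.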

## Course Objectives

The following objectives will be addressed.

Study Tip: During the course, contemplate the course objectives and consider what you have learned that applies to each.

• CO-1: Select appropriate methods for a scenario; determine if a linear or a nonlinear approach is appropriate
• CO-2: Use statistical software (the SAS language) to perform regression analysis
• CO-3: Test and interpret linear models for continuous outcome data (normal linear model)
• CO-4: Test and interpret models for categorical outcome data (logistic and Poisson regression)
• CO-5: Draw appropriate conclusions for both randomized designed experiments and observational studies
• CO-6: Communicate clearly to subject matter experts the purposes and results of complex statistical analysis, both orally and in writing.

## Topic List

The following broad topics will be covered; those given in bold will be our primary focus for most of the semester.

• Unit 1: Exploratory Methods and Inference in Case CQ
• Unit 2: Inference in Case QQ – Simple Linear Regression
• Unit 3: Multiple Linear Regression
• Unit 4: Inference in Case CC and QC – Contingency Tables and Simple Logistic Regression
• Unit 5: Multiple Logistic Regression
• Unit 6: Model Selection
• Unit 7: GLM and Poisson Regression

## References and Suggested Textbooks

Penn State has two courses with excellent sets of online materials:

The course materials were originally developed using the following book, which is freely available through the UF library. The textbook is not required reading for this course, but it is a good reference for regression methods at the applied level, and its mathematical detail is kept to a minimum. The links may work properly only when you are connected to the UF network, either directly or via VPN.

• Regression Methods in Biostatistics – Linear, Logistic, Survival, and Repeated Measures Models.
Authors: Eric Vittinghoff, David V. Glidden, Stephen C. Shiboski, Charles E. McCulloch
ISBN: 978-1-4614-1352-3 (Print) 978-1-4614-1353-0 (Online)