Syllabus of Population Health 800

Quantitative Methods in Population Health I

Spring 2002

 

Instructor:

Guan-Hua Huang, Ph.D.

 

Office: 703 WARF

 

Phone: 608-265-6176

 

Email: guanhuahuang@facstaff.wisc.edu

Teaching assistant:

Rosanne Scholl

 

Office:

 

Phone: 608-286-1586

 

Email: rmscholl@students.wisc.edu

Class meetings:

Lecture: Tuesday and Thursday 10:00-10:50 am at 758 WARF

 

Lab: Thursday 12:30-2:00 pm at 758 WARF

Office hours:

Instructor: Tuesday 4:00-5:00 pm

 

TA: Wednesday 3:00-4:00 pm

Course website:

http://webct.wisc.edu/

 

COURSE SUMMARY

 

The goals of this course are to introduce regression analysis for continuous and discrete data and to carry out data analyses that integrate the methods learned in Stat 541 and PM 650 sec. 2. Topics include measures of association, simple and multiple linear regression, inference for regression coefficients, confounding and interaction, regression diagnostics, logistic regression, and conditional logistic regression.

 

The course consists of lectures and laboratory sessions. Lectures are given on Tuesday and Thursday mornings and will primarily review and reinforce major issues. A laboratory session is held on Thursday afternoon. Each laboratory exercise will be distributed before class, and students are expected to read it at home. Each student will be assigned to a lab group and will discuss the exercise with group members during the lab. Each lab ends with a seminar-type discussion. Each group is required to hand in a write-up of the laboratory problems.

 

The course uses SAS for statistical computing. Students are expected to be familiar with the software.

 

HANDOUTS AND TEXTBOOKS

 

Handouts corresponding to each lecture will be available on the course website before each class. The required textbook for this course is

 

Kleinbaum DG, Kupper LL, Muller KE and Nizam A: "Applied Regression Analysis and Other Multivariable Methods" 3rd Edition, Duxbury Press, 1998.

 

METHOD OF STUDENT EVALUATION

 

The course grade will be based on homework (25%), write-ups of lab problems (20%), one midterm exam (25%), and one final exam (30%). The midterm exam will be held on March 14 (12:30-2:00 pm), and the final exam will be held during finals week. Both exams are open book.


OUTLINE OF MODULES

 

Readings refer to Kleinbaum, Kupper, Muller and Nizam: “Applied Regression Analysis and Other Multivariable Methods” 3rd Edition, Duxbury Press, 1998 (KKMN).

 

Module 0

Revisiting means and review of fundamentals (KKMN Chapter 3)

 

Parameter versus sample estimator

Sampling distributions and their importance

Confidence interval and p-value

Some sampling distributions and their relationships

 

 

Module 1

Measures of association with emphasis on the difference of means

 

Typical procedure for analyzing measures of association

 

 

Module 2

Basics of linear regression analysis (KKMN Chapter 5 except 5-10)

 

The slope as a measure of difference in means

Interpretation of straight line

Least squares principle

Variance, inference on β1

Assumptions, model checking

How assumptions influence estimators

 

 

Module 3

Correlation (KKMN Chapter 6 except 6-3 and 6-7)

 

Pearson correlation

Interpretation of correlation squared

Significance tests and confidence intervals (Fisher's z-transformation)

Different names for correlation coefficients depending on data type

Spearman correlation

Equivalence of test for 0 correlation and t-test for 2 means (χ² for 2x2 table)

 

 

Module 4

The Analysis of Variance (ANOVA) Table (KKMN Chapter 7)

 

Definition of model and error sums of squares and mean squares

F-test and its relationship to t-test

 

 

Module 5

Multiple regression (KKMN Chapter 8 and 10-3)

 

Interpretation of multiple regression coefficients as "adjusted"

Interpretation of multiple regression coefficients as averages of simple regression coefficients

Concept of statistical efficiency in estimation of multiple regression coefficients

R²

 

 

Supplement

Direct standardization

 

Connection to adjustment for confounding

Comparison with multiple regression

Seeing multiple regression as the statistically most efficient form of direct standardization

 

 

Module 6

Testing hypotheses in multiple regression (KKMN Chapter 9)

 

t-tests and F-tests including partial F-test

 

 

Module 7

Polynomial regression (KKMN 13-1 through 13-6)

 

Adding quadratic terms to model

Interpretation

Hierarchical principle

Significance tests

 

 

Module 8

Dummy variables (KKMN 14-1 through 14-3)

 

Coding k categories with k-1 variables

Interpretation of coefficients as differences of means

Testing singly and jointly
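
As an illustration of the first two topics in this module, here is a minimal Python sketch (the course itself uses SAS). The group labels and values are made up, and `dummy_code` and `least_squares` are hypothetical helper names, not functions from the course materials. With group A as the reference, the fitted intercept should equal the mean of A, and each indicator coefficient should equal that group's mean minus the mean of A.

```python
from fractions import Fraction

def dummy_code(levels, reference):
    """Code k categories with k-1 indicator variables (reference level omitted)."""
    others = [lv for lv in levels if lv != reference]
    def encode(value):
        return [1 if value == lv else 0 for lv in others]
    return others, encode

def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y], kept exact with Fractions.
    A = [[sum(Fraction(r[i]) * r[j] for r in X) for j in range(p)]
         + [sum(Fraction(r[i]) * yi for r, yi in zip(X, y))]
         for i in range(p)]
    for c in range(p):
        pivot = next(r for r in range(c, p) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(p):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][p] for i in range(p)]

# Hypothetical data: three groups (k = 3), so two indicator variables.
data = {"A": [4, 6], "B": [7, 9, 11], "C": [1, 3]}
others, encode = dummy_code(list(data), reference="A")

X, y = [], []
for group, values in data.items():
    for v in values:
        X.append([1] + encode(group))  # intercept column plus k-1 indicators
        y.append(v)

coefs = least_squares(X, y)
means = {g: Fraction(sum(v), len(v)) for g, v in data.items()}
print("intercept:", coefs[0], "mean of A:", means["A"])
for name, b in zip(others, coefs[1:]):
    print(f"coefficient for {name}:", b, "mean difference:", means[name] - means["A"])
```

Changing the reference level changes the individual coefficients but not the fitted group means.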

 

 

Module 9

Confounding and Interaction (KKMN Chapter 11, 14-4 to 14-9)

 

Confounding, definition, detection

Understanding how confounding arises

Interaction, definition, detection

Interpreting interaction as a regression of the slope

Extracting regression coefficients for specific levels and groups

Difference between confounding and interaction

 

 

Module 10

Regression diagnostics (KKMN Chapter 12)

 

Leverage, influence and outliers

Residuals, studentization

Residual plots

Remedies for assumption violations (transformation)

 

 

Module 11

Model selection for investigation of associations (KKMN 16-5-3 to 16-5-4)

 

Risk factor analysis versus prediction

Structuring the variable selection

 

 

Module 12

Rates and risks

 

Differences between risk and rate

Confidence intervals for risk and rate

Instantaneous rate, cumulative rate

 

 

Module 13

Some properties of the odds ratio and the relative risk

 

Difference versus ratio as a measure of association

Some special properties of the odds ratio

Cornfield’s properties of the relative risk

 

 

Module 14

Significance testing in 2x2 tables

 

χ² tests (Pearson, Mantel-Haenszel, continuity correction)

(Review connection to correlation)

Fisher's exact test

Review concepts of confounding and interaction

Breslow-Day test of interaction

Mantel-Haenszel and logit odds ratios for combined relative risk

Foundation of logit estimator as approximately efficiently weighted Mantel-Haenszel stratified χ²

 

 

Module 15

Confidence intervals for the odds ratio and the relative risk

 

Logit (=Woolf)

Test based

Cornfield

Exact
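
As an illustration of the logit (Woolf) interval listed above, here is a minimal Python sketch (the course itself uses SAS). The cell counts are made up, and `woolf_ci` is a hypothetical helper name. For a 2x2 table with cells a, b (exposed cases, exposed non-cases) and c, d (unexposed cases, unexposed non-cases), the interval is exp(log OR ± z * SE), with SE = sqrt(1/a + 1/b + 1/c + 1/d). It assumes all four cells are nonzero.

```python
import math
from statistics import NormalDist

def woolf_ci(a, b, c, d, level=0.95):
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided normal critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed are cases.
odds_ratio, lower, upper = woolf_ci(20, 80, 10, 90)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval is symmetric around the odds ratio on the log scale, not on the original scale.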

 

 

Module 16

Introduction to logistic regression (KKMN pages 656-660)

 

Transformation to expand range of p

Interpretation of coefficients as odds ratios

Solving for the risk

Assumptions

 

 

Module 17

Maximum likelihood estimation (KKMN Chapter 22)

 

Principle of maximum likelihood

Some common estimators seen as maximum likelihood based

Likelihood ratio, score and Wald χ²

Standard errors, confidence intervals

Correspondence of tests to ordinary regression

 

 

Module 18

Control of confounding with logistic regression analysis (KKMN page 660)

 

Comparison of regression coefficients with and without controlling for potential confounders

Choice of continuous confounder or indicator variables

 

 

Module 19

Modeling interaction effects with logistic regression (KKMN pages 661-671)

 

Constructing interactions

Testing for interactions

Obtaining odds ratios for subgroups

 

 

Module 20

Logistic regression for contingency tables

 

Model for 2x2 table

Constructing models with and without interaction

Saturated models and restrictions

Options and interpretation for dummy variables, continuous and ordinal

Comparison with contingency table based approach

 

 

Module 21

Goodness-of-fit for logistic regression

 

Comparing observed and expected numbers

Likelihood ratio and Pearson χ²

Hosmer and Lemeshow test

Residual plots

 

 

Module 22

Logistic regression of case-control data

 

Setup and interpretation

Change in intercept based on sampling fractions of cases and controls

 

 

Module 23

Conditional logistic regression

 

Conditional likelihood

Interpretations

Advantages and disadvantages

 

 

Final Review

Overview of data analysis

 

 

 


WEEK-BY-WEEK OUTLINE

 

Homework data sets are handed out on Thursdays. Guidelines for assignment preparation should be followed.

 

Week 1 (Jan 22, 24)

Lecture: Review (Modules 0 and 1)

Lab: Lab group assignment

Reading: KKMN Chapter 3

Assignment: NA

 

 

 

Week 2 (Jan 29, 31)

Lecture: Basics of linear regression (Module 2)

Lab: Basic statistics

Reading: KKMN Chapter 5 except 5-10

Assignment: Data set to analyze with t-test and regression (due in 1 week)

 

 

 

Week 3 (Feb 5, 7)

Lecture: Correlation (Module 3)

Lab: Simple linear regression

Reading: KKMN Chapter 6 except 6-3 and 6-7

Assignment: Data set to analyze with correlation analysis (due in 1 week)

 

 

 

Week 4 (Feb 12, 14)

Lecture: The ANOVA table, multiple regression (Modules 4 and 5)

Lab: Correlation and linear regression

Reading: KKMN Chapter 7, KKMN Chapter 8

Assignment: Data set to analyze by multiple regression (due in 2 weeks)

 

 

 

Week 5 (Feb 19, 21)

Lecture: Partial F-test (Module 6), polynomial regression (Module 7), indicator variables (Module 8)

Lab: Multiple linear regression and direct standardization

Reading: KKMN Chapter 9, 13-1 through 13-6 and 14-1 through 14-3

Assignment: NA

 

 

 

Week 6 (Feb 26, 28)

Lecture: Confounding and interaction (Module 9)

Lab: Partial F-test, polynomial regression and indicator variables

Reading: KKMN Chapter 11, 14-4 through 14-9

Assignment: Model building (due in 2 weeks)

 

 

 

Week 7 (Mar 5, 7)

Lecture: Regression diagnostics (Module 10)

Lab: Interaction and confounding

Reading: KKMN Chapter 12

Assignment: NA

 

 

 

Week 8 (Mar 12, 14)

Lecture: Review

Lab: Midterm exam

Reading: NA

Assignment: NA

 

 

 

Week 9 (Mar 19, 21)

Lecture: Review of exam, properties of relative risk and odds ratio (Module 13)

Lab: Model selection

Reading: Supplemental text

Assignment: NA

 

 

 

Week 10 (Mar 26, 28)

Spring break

 

 

 

 

 

Week 11 (Apr 2, 4)

Lecture: Significance testing in 2x2 and 2xk tables (Modules 12 and 14), confidence intervals for odds ratio (Module 15)

Lab: In-class exercise on contingency tables

Reading: Supplemental text

Assignment: Analysis of contingency tables with SAS (due in 2 weeks)

 

 

 

Week 12 (Apr 9, 11)

Lecture: Introduction to logistic regression (Module 16), maximum likelihood estimation (Module 17)

Lab: In-class exercise on contingency tables and logistic regression

Reading: KKMN pages 656-660, KKMN Chapter 22

Assignment: NA

 

 

 

Week 13 (Apr 16, 18)

Lecture: Control of confounding with logistic regression (Module 18), interaction effects in logistic regression (Module 19)

Lab: In-class exercise on logistic regression

Reading: KKMN pages 660-671, supplemental text

Assignment: Analysis of data set by logistic regression (due in 2 weeks)

 

 

 

Week 14 (Apr 23, 25)

Lecture: Logistic regression for contingency tables (Module 20)

Lab: In-class exercise

Reading: Supplemental text

Assignment: NA

 

 

 

Week 15 (Apr 30, May 2)

Lecture: Goodness of fit of logistic regression (Module 21)

Lab: Model building

Reading: Supplemental text

Assignment: Analysis of data set by logistic regression (due in 2 weeks)

 

 

 

Week 16 (May 7, 9)

Lecture: Logistic regression for case-control studies (Modules 22 and 23)

Lab: Review

Reading: KKMN 23-5-2

Assignment: NA

 

 

 

Week 17

In-class final exam during finals week