Syllabus of Population Health 800
Quantitative Methods in Population Health I
Spring 2002

Instructor: Guan-Hua Huang, Ph.D.
    Office: 703 WARF
    Phone: 608-265-6176
Teaching assistant: Rosanne Scholl
    Office:
    Phone: 608-286-1586
    Email: rmscholl@students.wisc.edu
Class meetings:
    Lecture: Tuesday and Thursday 10:00-10:50 am at 758 WARF
    Lab: Thursday 12:30-2:00 pm at 758 WARF
Office hours:
    Instructor: Tuesday 4:00-5:00 pm
    TA: Wednesday 3:00-4:00 pm
Course website:

The goals of this course are to introduce regression analysis for continuous and discrete data, and to practice data analyses that integrate the methods learned in Stat 541 and PM 650 sec. 2. Topics include measures of association, simple and multiple linear regression, inference for regression coefficients, confounding and interaction, regression diagnostics, logistic regression, and conditional logistic regression.

The course consists of lectures and laboratory sessions. Lectures are given on Tuesday and Thursday mornings and will primarily review and reinforce major issues. There is a laboratory session on Thursday afternoon. Each laboratory exercise will be distributed before the session, and students are expected to read it at home. Each student will be assigned to a lab group and will discuss the exercise with group members in the lab. Each lab ends with a seminar-type discussion. Each group is required to hand in a write-up of the laboratory problems.

The course uses SAS for statistical computing. Students are expected to be familiar with the software.

Handouts corresponding to each lecture will be available on the course website before each class. The required textbook for this course is
Kleinbaum DG, Kupper LL, Muller KE and Nizam A: "Applied Regression Analysis and Other Multivariable Methods" 3rd Edition, Duxbury Press, 1998.

The course grade will be based on homework (25%), write-ups of lab problems (20%), one midterm exam (25%), and one final exam (30%). The midterm exam will be held on March 14 (12:30-2:00 pm), and the final exam will be held during finals week. Both exams are open book.

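As a rough illustration of how the stated weights combine, the sketch below computes a weighted course score. The function name `course_grade`, the 0-100 scale for each component, and the sample scores are assumptions made for illustration; the syllabus specifies only the weights.

```python
# Weights as stated in the syllabus. The 0-100 scale for each
# component is an assumption, not something the syllabus specifies.
WEIGHTS = {
    "homework": 0.25,
    "lab_writeups": 0.20,
    "midterm": 0.25,
    "final": 0.30,
}


def course_grade(homework, lab_writeups, midterm, final):
    """Return the weighted course score from the four component scores."""
    scores = {
        "homework": homework,
        "lab_writeups": lab_writeups,
        "midterm": midterm,
        "final": final,
    }
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example = course_grade(homework=80, lab_writeups=90, midterm=70, final=85)
print(round(example, 2))
```

Because the final exam carries the largest weight, a given change in the final exam score moves the course score more than the same change in any other component.
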
OUTLINE OF MODULES
Readings refer to Kleinbaum, Kupper, Muller and Nizam: “Applied Regression Analysis and Other Multivariable Methods” 3rd Edition, Duxbury Press, 1998 (KKMN).

Module 0 | Revisiting means and review of fundamentals (KKMN Chapter 3)
    Parameter versus sample estimator
    Sampling distributions and their importance
    Confidence interval and p-value
    Some sampling distributions and their relationships

Module 1 | Measures of association with emphasis on the difference of means
    Typical procedure for analyzing measures of association

Module 2 | Basics of linear regression analysis (KKMN Chapter 5 except 5-10)
    The slope as a measure of difference in means
    Interpretation of straight line
    Least squares principle
    Variance, inference on β1
    Assumptions, model checking
    How assumptions influence estimators

Module 3 | Correlation (KKMN Chapter 6 except 6-3 and 6-7)
    Pearson correlation
    Interpretation of correlation squared
    Significance tests and confidence intervals (Fisher's z-transformation)
    Different names for correlation coefficients depending on data type
    Spearman correlation
    Equivalence of test for 0 correlation and t-test for 2 means (χ² for 2 x 2 table)

Module 4 | The Analysis of Variance (ANOVA) Table (KKMN Chapter 7)
    Definition of model and error sums of squares and mean squares
    F-test and its relationship to t-test

Module 5 | Multiple regression (KKMN Chapter 8 and 10-3)
    Interpretation of multiple regression coefficients as "adjusted"
    Interpretation of multiple regression coefficients as averages of simple regression coefficients
    Concept of statistical efficiency in estimation of multiple regression coefficients
    R²

Supplement | Direct standardization
    Connection to adjustment for confounding
    Comparison with multiple regression
    Seeing multiple regression as the statistically most efficient form of direct standardization

Module 6 | Testing hypotheses in multiple regression (KKMN Chapter 9)
    t-tests and F-tests including partial F-test

Module 7 | Polynomial regression (KKMN 13-1 through 13-6)
    Adding quadratic terms to model
    Interpretation
    Hierarchical principle
    Significance tests

Module 8 | Dummy variables (KKMN 14-1 through 14-3)
    Coding k categories with k-1 variables
    Interpretation of coefficients as differences of means
    Testing singly and jointly

Module 9 | Confounding and Interaction (KKMN Chapter 11, 14-4 to 14-9)
    Confounding, definition, detection
    Understanding how confounding arises
    Interaction, definition, detection
    Interpreting interaction as a regression of the slope
    Extracting regression coefficients for specific levels and groups
    Difference between confounding and interaction

Module 10 | Regression diagnosis (KKMN Chapter 12)
    Leverage, influence and outliers
    Residuals, studentization
    Residual plots
    Remedies for assumption violations (transformation)

Module 11 | Model selection for investigation of associations (KKMN 16-5-3 to 16-5-4)
    Risk factor analysis versus prediction
    Structuring the variable selection

Module 12 | Rates and risks
    Differences between risk and rate
    Confidence intervals for risk and rate
    Instantaneous rate, cumulative rate

Module 13 | Some properties of the odds ratio and the relative risk
    Difference versus ratio as a measure of association
    Some special properties of the odds ratio
    Cornfield’s properties of the relative risk

Module 14 | Significance testing in 2x2 tables
    χ² tests (Pearson, Mantel-Haenszel, continuity correction)
    (Review connection to correlation)
    Fisher's exact test
    Review concepts of confounding and interaction
    Breslow-Day test of interaction
    Mantel-Haenszel and logit odds ratios for combined relative risk
    Foundation of logit estimator as approximately efficiently weighted
    Mantel-Haenszel stratified χ²

Module 15 | Confidence intervals for the odds ratio and the relative risk
    Logit (=Woolf)
    Test based
    Cornfield
    Exact

Module 16 | Introduction to logistic regression (KKMN pages 656-660)
    Transformation to expand range of p
    Interpretation of coefficients as odds ratios
    Solving for the risk
    Assumptions

Module 17 | Maximum likelihood estimation (KKMN Chapter 22)
    Principle of maximum likelihood
    Some common estimators seen as maximum likelihood based
    Likelihood ratio, score and Wald χ²
    Standard errors, confidence intervals
    Correspondence of tests to ordinary regression

Module 18 | Control of confounding with logistic regression analysis (KKMN page 660)
    Comparison of regression coefficients with and without controlling for potential confounders
    Choice of continuous confounder or indicator variables

Module 19 | Modeling interaction effects with logistic regression (KKMN pages 661-671)
    Constructing interactions
    Testing for interactions
    Obtaining odds ratios for subgroups

Module 20 | Logistic regression for contingency tables
    Model for 2x2 table
    Constructing models with and without interaction
    Saturated models and restrictions
    Options and interpretation for dummy variables, continuous and ordinal
    Comparison with contingency table based approach

Module 21 | Goodness-of-fit for logistic regression
    Comparing observed and expected numbers
    Likelihood ratio and Pearson χ²
    Hosmer and Lemeshow test
    Residual plots

Module 22 | Logistic regression of case-control data
    Setup and interpretation
    Change in intercept based on sampling fractions of cases and controls

Module 23 | Conditional logistic regression
    Conditional likelihood
    Interpretations
    Advantages and disadvantages

Final Review | Overview of data analysis

WEEK-BY-WEEK OUTLINE
Homework data sets are handed out on Thursdays. Guidelines for assignment preparation should be followed.

Week 1 | Jan 22, 24
    Lecture: Review (Modules 0 and 1)
    Lab: Lab group assignment
    Reading: KKMN Chapter 3
    Assignment: NA

Week 2 | Jan 29, 31
    Lecture: Basics of linear regression (Module 2)
    Lab: Basic statistics
    Reading: KKMN Chapter 5 except 5-10
    Assignment: Data set to analyze with t-test and regression (due in 1 week)

Week 3 | Feb 5, 7
    Lecture: Correlation (Module 3)
    Lab: Simple linear regression
    Reading: KKMN Chapter 6 except 6-3 and 6-7
    Assignment: Data set to analyze with correlation analysis (due in 1 week)

Week 4 | Feb 12, 14
    Lecture: The ANOVA table, multiple regression (Modules 4 and 5)
    Lab: Correlation and linear regression
    Reading: KKMN Chapter 7, KKMN Chapter 8
    Assignment: Data set to analyze by multiple regression (due in 2 weeks)

Week 5 | Feb 19, 21
    Lecture: Partial F-test (Module 6), polynomial regression (Module 7), indicator variables (Module 8)
    Lab: Multiple linear regression and direct standardization
    Reading: KKMN Chapter 9, 13-1 through 13-6 and 14-1 through 14-3
    Assignment: NA

Week 6 | Feb 26, 28
    Lecture: Confounding and interaction (Module 9)
    Lab: Partial F-test, polynomial regression and indicator variables
    Reading: KKMN Chapter 11, 14-4 through 14-9
    Assignment: Model building (due in 2 weeks)

Week 7 | Mar 5, 7
    Lecture: Regression diagnosis (Module 10)
    Lab: Interaction and confounding
    Reading: KKMN Chapter 12
    Assignment: NA

Week 8 | Mar 12, 14
    Lecture: Review
    Lab: Midterm exam
    Reading: NA
    Assignment: NA

Week 9 | Mar 19, 21
    Lecture: Review of exam, properties of relative risk and odds ratio (Module 13)
    Lab: Model selection
    Reading: Supplemental text
    Assignment: NA

Week 10 | Mar 26, 28
    Spring break

Week 11 | Apr 2, 4
    Lecture: Significance testing in 2x2 and 2xk tables (Modules 12 and 14), confidence intervals for odds ratio (Module 15)
    Lab: In-class exercise on contingency tables
    Reading: Supplemental text
    Assignment: Analysis of contingency tables with SAS (due in 2 weeks)

Week 12 | Apr 9, 11
    Lecture: Introduction to logistic regression (Module 16), maximum likelihood estimation (Module 17)
    Lab: In-class exercise on contingency tables and logistic regression
    Reading: KKMN pages 656-660, KKMN Chapter 22
    Assignment: NA

Week 13 | Apr 16, 18
    Lecture: Control of confounding with logistic regression (Module 18), interaction effects in logistic regression (Module 19)
    Lab: In-class exercise on logistic regression
    Reading: KKMN pages 660-671, supplemental text
    Assignment: Analysis of data set by logistic regression (due in 2 weeks)

Week 14 | Apr 23, 25
    Lecture: Logistic regression for contingency tables (Module 20)
    Lab: In-class exercise
    Reading: Supplemental text
    Assignment: NA

Week 15 | Apr 30, May 2
    Lecture: Goodness of fit of logistic regression (Module 21)
    Lab: Model building
    Reading: Supplemental text
    Assignment: Analysis of data set by logistic regression (due in 2 weeks)

Week 16 | May 7, 9
    Lecture: Logistic regression for case-control studies (Modules 22 and 23)
    Lab: Review
    Reading: KKMN 23-5-2
    Assignment: NA

Week 17
    In-class final exam during finals week