Syllabus of Population Health 800
Quantitative Methods in Population Health I
Spring 2002
 
| Instructor: | Guan-Hua Huang, Ph.D. |
|   | Office: 703 WARF |
|   | Phone: 608-265-6176 |
|   | |
| Teaching assistant: | Rosanne Scholl |
|   | Office: |
|   | Phone: 608-286-1586 |
|   | Email: rmscholl@students.wisc.edu |
| Class meetings: | Lecture: Tuesday and Thursday 10:00-10:50 am at 758 WARF |
|   | Lab: Thursday 12:30-2:00 pm at 758 WARF |
| Office hours: | Instructor: Tuesday 4:00-5:00 pm |
|   | TA: Wednesday 3:00-4:00 pm |
| Course website: | |
 
The goals of this course are to introduce regression analysis for continuous
and discrete data, along with data analyses that integrate the methods learned
in Stat 541 and PM 650 sec. 2. Topics include measures of association, simple
and multiple linear regression, inference for regression coefficients,
confounding and interaction, regression diagnostics, logistic regression, and
conditional logistic regression.
 
The course consists of lectures and laboratory sessions. Lectures are given on
Tuesday and Thursday mornings and will primarily review and reinforce major
issues. A laboratory session is held on Thursday afternoon. The laboratory
exercise will be distributed before each class, and students are expected to
read it at home beforehand. Each student will be assigned to a lab group and
will discuss the exercise with group members during the lab. Each lab ends with
a seminar-type discussion. Each group is required to hand in a write-up of the
laboratory problems.
 
The course uses the SAS software for statistical computing. Students are
expected to be familiar with using it.
 
 
Handouts corresponding to each lecture will be available on the course website
before each class. The required textbook for this course is:
 
Kleinbaum DG, Kupper LL, 
Muller KE and Nizam A: "Applied Regression Analysis and Other Multivariable 
Methods" 3rd Edition, Duxbury Press, 1998.
 
 
The course grade will be based on homework (25%), write-ups of lab problems
(20%), one midterm exam (25%), and one final exam (30%). The midterm exam will
be held on March 14 (12:30-2:00 pm), and the final exam will be held during
finals week. Both exams are open book.
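
As a rough illustration of how the stated weights combine, the sketch below computes a weighted course score. The syllabus gives only the percentages; the 0-100 scale for each component, the simple weighted sum, and the example scores are assumptions made here for illustration.

```python
# Component weights as stated in the syllabus (fractions of the course grade).
WEIGHTS = {
    "homework": 0.25,
    "lab_writeups": 0.20,
    "midterm": 0.25,
    "final": 0.30,
}


def course_score(scores):
    """Return the weighted average of component scores.

    Assumption: every component is scored on the same 0-100 scale and the
    components are combined as a plain weighted sum.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90, "lab_writeups": 85, "midterm": 80, "final": 88}
print(round(course_score(example), 2))
```

Because the weights sum to 1, the result stays on the same scale as the component scores.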

OUTLINE OF MODULES
 
Readings refer to 
Kleinbaum, Kupper, Muller and Nizam: “Applied Regression Analysis and Other 
Multivariable Methods” 3rd Edition, Duxbury Press, 1998 
(KKMN).
 
| Module 0 | Revisiting means and review of fundamentals (KKMN Chapter 3) |
|   | Parameter versus sample estimator; Sampling distributions and their importance; Confidence interval and p-value; Some sampling distributions and their relationships |
|   |   | 
| Module 1 | Measures of association with emphasis on the difference of means |
|   | Typical procedure for analyzing measures of association |
|   |   | 
| Module 2 | Basics of linear regression analysis (KKMN Chapter 5 except 5-10) |
|   | The slope as a measure of difference in means; Interpretation of straight line; Least squares principle; Variance, inference on β1; Assumptions, model checking; How assumptions influence estimators |
|   |   | 
| Module 3 | Correlation (KKMN Chapter 6 except 6-3 and 6-7) |
|   | Pearson correlation; Interpretation of correlation squared; Significance tests and confidence intervals (Fisher's z-transformation); Different names for correlation coefficients depending on data type; Spearman correlation; Equivalence of test for 0 correlation and t-test for 2 means (χ² for 2 x 2 table) |
|   |   | 
| Module 4 | The Analysis of Variance (ANOVA) Table (KKMN Chapter 7) |
|   | Definition of model and error sums of squares and mean squares; F-test and its relationship to t-test |
|   |   | 
| Module 5 | Multiple regression (KKMN Chapter 8 and 10-3) |
|   | Interpretation of multiple regression coefficients as "adjusted"; Interpretation of multiple regression coefficients as averages of simple regression coefficients; Concept of statistical efficiency in estimation of multiple regression coefficients; R² |
|   |   | 
| Supplement | Direct standardization |
|   | Connection to adjustment for confounding; Comparison with multiple regression; Seeing multiple regression as the statistically most efficient form of direct standardization |
|   |   | 
| Module 6 | Testing hypotheses in multiple regression (KKMN Chapter 9) |
|   | t-tests and F-tests including partial F-test |
|   |   | 
| Module 7 | Polynomial regression (KKMN 13-1 through 13-6) |
|   | Adding quadratic terms to model; Interpretation; Hierarchical principle; Significance tests |
|   |   | 
| Module 8 | Dummy variables (KKMN 14-1 through 14-3) |
|   | Coding k categories with k-1 variables; Interpretation of coefficients as differences of means; Testing singly and jointly |
|   |   | 
| Module 9 | Confounding and Interaction (KKMN Chapter 11, 14-4 to 14-9) |
|   | Confounding, definition, detection; Understanding how confounding arises; Interaction, definition, detection; Interpreting interaction as a regression of the slope; Extracting regression coefficients for specific levels and groups; Difference between confounding and interaction |
|   |   | 
| Module 10 | Regression diagnostics (KKMN Chapter 12) |
|   | Leverage, influence and outliers; Residuals, studentization; Residual plots; Remedies for assumption violations (transformation) |
|   |   | 
| Module 11 | Model selection for investigation of associations (KKMN 16-5-3 to 16-5-4) |
|   | Risk factor analysis versus prediction; Structuring the variable selection |
|   |   | 
| Module 12 | Rates and risks |
|   | Differences between risk and rate; Confidence intervals for risk and rate; Instantaneous rate, cumulative rate |
|   |   | 
| Module 13 | Some properties of the odds ratio and the relative risk |
|   | Difference versus ratio as a measure of association; Some special properties of the odds ratio; Cornfield’s properties of the relative risk |
|   |   | 
| Module 14 | Significance testing in 2x2 tables |
|   | χ² tests (Pearson, Mantel-Haenszel, continuity correction) (Review connection to correlation); Fisher's exact test; Review concepts of confounding and interaction; Breslow-Day test of interaction; Mantel-Haenszel and logit odds ratios for combined relative risk; Foundation of logit estimator as approximately efficiently weighted Mantel-Haenszel stratified χ² |
|   |   | 
| Module 15 | Confidence intervals for the odds ratio and the relative risk |
|   | Logit (=Woolf); Test based; Cornfield; Exact |
|   |   | 
| Module 16 | Introduction to logistic regression (KKMN pages 656-660) |
|   | Transformation to expand range of p; Interpretation of coefficients as odds ratios; Solving for the risk; Assumptions |
|   |   | 
| Module 17 | Maximum likelihood estimation (KKMN Chapter 22) |
|   | Principle of maximum likelihood; Some common estimators seen as maximum likelihood based; Likelihood ratio, score and Wald χ²; Standard errors, confidence intervals; Correspondence of tests to ordinary regression |
|   |   | 
| Module 18 | Control of confounding with logistic regression analysis (KKMN page 660) |
|   | Comparison of regression coefficients with and without controlling for potential confounders; Choice of continuous confounder or indicator variables |
|   |   | 
| Module 19 | Modeling interaction effects with logistic regression (KKMN pages 661-671) |
|   | Constructing interactions; Testing for interactions; Obtaining odds ratios for subgroups |
|   |   | 
| Module 20 | Logistic regression for contingency tables |
|   | Model for 2x2 table; Constructing models with and without interaction; Saturated models and restrictions; Options and interpretation for dummy variables, continuous and ordinal; Comparison with contingency table based approach |
|   |   | 
| Module 21 | Goodness-of-fit for logistic regression |
|   | Comparing observed and expected numbers; Likelihood ratio and Pearson χ²; Hosmer and Lemeshow test; Residual plots |
|   |   | 
| Module 22 | Logistic regression of case-control data |
|   | Setup and interpretation; Change in intercept based on sampling fractions of cases and controls |
|   |   | 
| Module 23 | Conditional logistic regression |
|   | Conditional likelihood; Interpretations; Advantages and disadvantages |
|   |   | 
| Final Review | Overview of data analysis |
|   |   | 
 
WEEK-BY-WEEK OUTLINE
 
Homework data sets are 
handed out on Thursdays. Guidelines for assignment preparation should be 
followed.
 
| Week 1 | Lecture | Review (Modules 0 and 1) |
| Jan 22, 24 | Lab | Lab group assignment |
|   | Reading | KKMN Chapter 3 |
|   | Assignment | NA | 
|   |   |   | 
| Week 2 | Lecture | Basics of linear regression (Module 2) |
| Jan 29, 31 | Lab | Basic statistics |
|   | Reading | KKMN Chapter 5 except 5-10 |
|   | Assignment | Data set to analyze with t-test and regression (due in 1 week) |
|   |   |   | 
| Week 3 | Lecture | Correlation (Module 3) |
| Feb 5, 7 | Lab | Simple linear regression |
|   | Reading | KKMN Chapter 6 except 6-3 and 6-7 |
|   | Assignment | Data set to analyze with correlation analysis (due in 1 week) |
|   |   |   | 
| Week 4 | Lecture | The ANOVA table, multiple regression (Modules 4 and 5) |
| Feb 12, 14 | Lab | Correlation and linear regression |
|   | Reading | KKMN Chapter 7, KKMN Chapter 8 |
|   | Assignment | Data set to analyze by multiple regression (due in 2 weeks) |
|   |   |   | 
| Week 5 | Lecture | Partial F-test (Module 6), polynomial regression (Module 7), indicator variables (Module 8) |
| Feb 19, 21 | Lab | Multiple linear regression and direct standardization |
|   | Reading | KKMN Chapter 9, 13-1 through 13-6 and 14-1 through 14-3 |
|   | Assignment | NA | 
|   |   |   | 
| Week 6 | Lecture | Confounding and interaction (Module 9) |
| Feb 26, 28 | Lab | Partial F-test, polynomial regression and indicator variables |
|   | Reading | KKMN Chapter 11, 14-4 through 14-9 |
|   | Assignment | Model building (due in 2 weeks) |
|   |   |   | 
| Week 7 | Lecture | Regression diagnostics (Module 10) |
| Mar 5, 7 | Lab | Interaction and confounding |
|   | Reading | KKMN Chapter 12 |
|   | Assignment | NA | 
|   |   |   | 
| Week 8 | Lecture | Review |
| Mar 12, 14 | Lab | Midterm exam |
|   | Reading | NA | 
|   | Assignment | NA | 
|   |   |   | 
| Week 9 | Lecture | Review of exam, properties of relative risk and odds ratio (Module 13) |
| Mar 19, 21 | Lab | Model selection |
|   | Reading | Supplemental text |
|   | Assignment | NA | 
|   |   |   | 
| Week 10 |   | Spring break |
| Mar 26, 28 |   |   |
|   |   |   | 
| Week 11 | Lecture | Significance testing in 2x2 and 2xk tables (Modules 12 and 14), confidence intervals for odds ratio (Module 15) |
| Apr 2, 4 | Lab | In-class exercise on contingency tables |
|   | Reading | Supplemental text |
|   | Assignment | Analysis of contingency tables with SAS (due in 2 weeks) |
|   |   |   | 
| Week 12 | Lecture | Introduction to logistic regression (Module 16), maximum likelihood estimation (Module 17) |
| Apr 9, 11 | Lab | In-class exercise on contingency tables and logistic regression |
|   | Reading | KKMN pages 656-660, KKMN Chapter 22 |
|   | Assignment | NA | 
|   |   |   | 
| Week 13 | Lecture | Control of confounding with logistic regression (Module 18), interaction effects in logistic regression (Module 19) |
| Apr 16, 18 | Lab | In-class exercise on logistic regression |
|   | Reading | KKMN pages 660-671, supplemental text |
|   | Assignment | Analysis of data set by logistic regression (due in 2 weeks) |
|   |   |   | 
| Week 14 | Lecture | Logistic regression for contingency tables (Module 20) |
| Apr 23, 25 | Lab | In-class exercise |
|   | Reading | Supplemental text |
|   | Assignment | NA | 
|   |   |   | 
| Week 15 | Lecture | Goodness of fit of logistic regression (Module 21) |
| Apr 30, May 2 | Lab | Model building |
|   | Reading | Supplemental text |
|   | Assignment | Analysis of data set by logistic regression (due in 2 weeks) |
|   |   |   | 
| Week 16 | Lecture | Logistic regression for case-control studies (Modules 22 and 23) |
| May 7, 9 | Lab | Review |
|   | Reading | KKMN 23-5-2 |
|   | Assignment | NA | 
|   |   |   | 
| Week 17 |   | In-class final exam during finals week |
|   |   |   |