NATIONAL CHIAO TUNG UNIVERSITY
INSTITUTE OF STATISTICS
MULTIVARIATE ANALYSIS
SPRING 2018
Instructor: Guan-Hua Huang, Ph.D.
Office: 423 Joint Education Hall
Phone: 03-513-1334
Email: ghuang@stat.nctu.edu.tw
Class meetings: Monday 9:00-12:00 at A203 Joint Education Hall
Office hours: By appointment
Class website:
Credit: Three (3) credits
The aims of this course are:
- To illustrate extensions of univariate statistical methodology to multivariate data.
- To introduce students to some of the distinctive statistical methodologies that arise only with multivariate data.
- To introduce students to some of the computational techniques for multivariate analysis that are available in standard statistical packages.
Topics include: multivariate techniques and analyses, multivariate analysis of variance, principal component analysis and factor analysis, canonical correlation analysis, cluster analysis, discrimination and classification, machine learning.
The course uses the R software for statistical computing. Students are expected to be familiar with its use.
Handouts corresponding to each lecture will be available on the class website before each class. The required textbook for this course is:
Johnson, R.A. and Wichern, D.W., 2007. Applied Multivariate Statistical Analysis (6th Edition). Prentice Hall, Upper Saddle River, NJ.
Reading assignments will be drawn primarily from this book.
Students are expected to have a background in undergraduate linear algebra, probability, mathematical statistics, and linear regression. Programming knowledge of R and/or C/C++ is required.
The course grade will be based on 5 homework assignments (50%), 1 midterm exam (20%), and 1 final exam (30%).
COURSE OUTLINE
Readings refer to: Johnson, R.A. and Wichern, D.W., 2007. Applied Multivariate Statistical Analysis (6th Edition). (AMSA)
Module | Topic | Reading (pages)
1 | Aspects of multivariate analysis: introduction; review of linear algebra and matrices | 1-30, 49-110
2 | Random vectors and random sampling: random vectors/matrices; distance; the sample; random sampling of the sample mean vector and covariance matrix; generalized variance; matrix operations of sample values | 30-37, 60-78, 111-148
3 | Multivariate normal distribution: density and properties; sampling from multivariate normal and MLE; sampling distribution and large sample behavior of X̄ and S; assessing the assumption of normality; transformation to near normality | 149-200
4 | Inferences about a mean vector: inference for a normal population mean; Hotelling's T^2 and likelihood ratio test; confidence regions and simultaneous comparisons of component means; large sample inferences about a population mean vector | 210-238
5 | Comparisons of several multivariate means: paired comparisons and repeated measures design; comparing mean vectors from two populations; comparing several multivariate population means (one-way MANOVA) | 273-312
6 | Principal components: introduction; population principal components; summarizing sample variation by principal components; large sample inferences | 430-459
7 | Factor analysis: introduction; orthogonal factor model; methods of estimation; factor rotation; factor scores | 481-526
8 | Canonical correlation analysis: introduction; population and sample canonical variates and canonical correlations; sample descriptive measures of goodness | 539-563
9 | Clustering: introduction; similarity measures; hierarchical clustering methods; k-means clustering methods; multidimensional scaling | 671-715
10 | Discrimination and classification: introduction; separation and classification for two populations; classification with two multivariate normal populations; evaluating classification functions; Fisher's discriminant function; classification with several populations | 575-644
11 | Machine learning: support vector machines; neural networks; classification and regression trees |