NATIONAL YANG MING CHIAO TUNG UNIVERSITY
MULTIVARIATE ANALYSIS
SPRING 2024
Instructor: Guan-Hua Huang, Ph.D.
Office: A423 Joint Education Hall
Phone: 03-513-1334
Email: ghuang@nycu.edu.tw
Class meetings: Wednesday 9:00-12:00 at A406 Joint Education Hall
Office hours: By appointment
Class website:
Credit: Three (3) credits
The aims of this course are:
- To illustrate extensions of univariate statistical methodology to multivariate data.
- To introduce students to some of the distinctive statistical methodologies that arise only with multivariate data.
- To introduce students to some of the computational techniques required for multivariate analysis that are available in standard statistical packages.
Topics include multivariate techniques and analyses, multivariate analysis of variance, principal component analysis and factor analysis, canonical correlation analysis, cluster analysis, discrimination and classification, and machine learning.
The course uses R for statistical computing. Students are expected to be familiar with R.
Handouts corresponding to each lecture will be available on the class website before each class. The required textbook for this course is:
Johnson, R.A. and Wichern, D.W., 2007. Applied Multivariate Statistical Analysis (6th Edition). Prentice Hall, Upper Saddle River, NJ, USA.
The following book is recommended for further reading:
Hastie, T., Tibshirani, R. and Friedman, J., 2009. The Elements of Statistical Learning (2nd Edition). Springer, New York, NY, USA.
Reading assignments will be drawn primarily from these two books.
Students are expected to have a background in undergraduate linear algebra, probability, mathematical statistics, and linear regression. Programming experience in R and/or C/C++ is required.
The course grade will be based on 5 homework assignments (50%), 1 midterm exam (20%), and 1 final exam (30%).
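As a rough illustration of how these weights combine, the sketch below computes a weighted course grade. It assumes that every score is on a 0-100 scale and that the five homework assignments count equally; neither detail is stated above, and the function name and example scores are hypothetical.

```python
# Weights taken from the grading policy: homework 50%, midterm 20%, final 30%.
WEIGHTS = {"homework": 0.50, "midterm": 0.20, "final": 0.30}


def course_grade(homework_scores, midterm, final):
    """Return the weighted course grade on a 0-100 scale.

    Assumptions (not stated in the syllabus): every score is out of 100
    and the homework assignments are weighted equally.
    """
    homework_avg = sum(homework_scores) / len(homework_scores)
    return (WEIGHTS["homework"] * homework_avg
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)


# Hypothetical scores, for illustration only.
example = course_grade([90, 85, 80, 95, 100], midterm=70, final=80)
print(round(example, 2))
```

With these made-up scores the homework average is 90, so the components contribute 45, 14, and 24 points respectively.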
COURSE OUTLINE
AMSA: Johnson, R.A. and Wichern, D.W., 2007. Applied Multivariate Statistical Analysis (6th Edition).
ESL: Hastie, T., Tibshirani, R. and Friedman, J., 2009. The Elements of Statistical Learning (2nd Edition).
Module | Topic | Reading
1 | Aspects of multivariate analysis: introduction; review of linear algebra and matrices | AMSA: 1-30, 49-110
2 | Random vectors and random sampling: random vectors/matrices; distance; the sample; random sampling of the sample mean vector and covariance matrix; generalized variance; matrix operations of sample values | AMSA: 30-37, 60-78, 111-148
3 | Multivariate normal distribution: density and properties; sampling from multivariate normal and MLE; sampling distribution and large sample behavior of X̄ and S; assessing the assumption of normality; transformation to near normality | AMSA: 149-200
4 | Inferences about a mean vector: inference for a normal population mean; Hotelling's T² and likelihood ratio test; confidence regions and simultaneous comparisons of component means; large sample inferences about a population mean vector | AMSA: 210-238
5 | Comparisons of several multivariate means: paired comparisons and repeated measures design; comparing mean vectors from two populations; comparing several multivariate population means (one-way MANOVA) | AMSA: 273-312
6 | Principal components: introduction; population principal components; summarizing sample variation by principal components; large sample inferences | AMSA: 430-459
7 | Factor analysis: introduction; orthogonal factor model; methods of estimation; factor rotation; factor scores | AMSA: 481-526
8 | Canonical correlation analysis: introduction; population and sample canonical variates and canonical correlations; sample descriptive measures of goodness | AMSA: 539-563
9 | Clustering: introduction; similarity measures; hierarchical clustering methods; k-means clustering methods; multidimensional scaling | AMSA: 671-715
10 | Discrimination and classification: introduction; separation and classification for two populations; classification with two multivariate normal populations; evaluating classification functions; Fisher discriminant function; classification with several populations | AMSA: 575-644
11 | Machine learning: classification and regression tree; neural networks; support vector machine; ensemble learning | ESL: 305-317, 587-603 (classification and regression tree); 389-409 (neural networks); 129-135, 417-438 (support vector machine); Section 8.7, Chapter 10 (ensemble learning)