ANALYSIS OF HIGH-THROUGHPUT GENOMIC DATA: EXPRESSION AND SNP
SPRING 2009
Instructor: Guan-Hua Huang, Ph.D.
Office: 423 Joint Education Hall
Phone: 03-513-1334
Email: ghuang@stat.nctu.edu.tw
Class meetings: Thursday 9:00 - 12:00 at 406 Joint Education Hall
Office hours: By appointment
Class website:
Credit: Three (3) credits
Novel statistical methodology can improve our understanding of how multiple
genes and environmental factors interact in a complex disease. The massive
volume of high-throughput genomic data makes the development of advanced
statistical and computational data mining tools a major challenge. In this
course, we will cover effective statistical methods for analyzing such data,
focusing on two types of high-throughput data: gene expression microarrays
and single nucleotide polymorphism (SNP) markers.
Topics include:
Gene expression:
- Technology and measurement
- Quality assessment
- Preprocessing Affymetrix GeneChip: background adjustment, normalization and summarization
- Differential expression
- Clustering and prediction
- Gene set enrichment analysis
SNP markers:
- Preliminary analyses: Hardy-Weinberg equilibrium, haplotype and genotype data, measures of linkage disequilibrium, estimates of recombination rates, SNP tagging
- Population-based association studies: case-control and family studies
- Candidate-gene and genome-wide association studies
- Population stratification
- Tests of association: single and multiple SNPs
- Epistatic effects and gene-environment interactions
- Multiple testing
Handouts for each lecture will be available on the class website before
class. There is no required textbook for this course.
The following books are recommended for further reading:
Gentleman R, Carey VJ, Huber W, Irizarry RA, Dudoit S (Editors) (2005). Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer.
Thomas DC (2004). Statistical Methods in Genetic Epidemiology.
Students are expected to be familiar with the R language and the Bioconductor
software. A background in probability and mathematical statistics is required.
The course grade will be based on attendance, participation, and a final project.