ANALYSIS OF HIGH-THROUGHPUT GENOMIC DATA: EXPRESSION AND SNP
SPRING 2008
Instructor: Guan-Hua Huang, Ph.D.
Office: 423 Joint Education Hall
Phone: 03-513-1334
Email: ghuang@stat.nctu.edu.tw
Class meetings: Thursday 9:00 - 12:00 at 407 Joint Education Hall
Office hours: By appointment
Class website: http://www.stat.nctu.edu.tw/subhtml/source/teachers/ghuang/course/expsnp08/
Credit: Three (3) credits
Novel statistical methodology can improve our understanding of how multiple
genes and environmental factors interact in a complex disease. The massive
amount of high-throughput genomic data poses a major challenge: developing
advanced statistical and computational data mining tools. In this course, we will
go through some effective statistical methods for analyzing these high-throughput data.
The course focuses on two types of high-throughput data: gene expression
microarrays and single nucleotide polymorphism (SNP) markers.
Topics include:
Gene expression:
- Technology and measurement
- Quality assessment
- Preprocessing Affymetrix GeneChip: background adjustment, normalization and summarization
- Differential expression
- Clustering and prediction
- Gene set enrichment analysis
SNP markers:
- Preliminary analyses: Hardy-Weinberg equilibrium, haplotype and genotype data, measures of linkage disequilibrium, estimates of recombination rates, SNP tagging
- Population-based association study: case-control and family study
- Candidate-gene and genome-wide association studies
- Population stratification
- Tests of association: single and multiple SNPs
- Epistatic effects and gene-environment interactions
- Multiple testing
Handouts corresponding to each lecture will be available on the class website
before each class. There is no required textbook for this course. The following
books are recommended for further reading:
Gentleman R, Carey VJ, Huber W, Irizarry RA, Dudoit S (Editors) (2005). Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer.
Thomas DC (2004). Statistical Methods in Genetic Epidemiology.
Students are expected to be familiar with the R language and the Bioconductor
packages. A background in probability and mathematical statistics is required.
The course grade will be based on homework
assignments and a final project.