Normalization: Methods for removing systematic variation from microarray data
- Global methods
- Intensity dependent methods
- Fold Change
Although simple and intuitive, fold change does not address the reproducibility of the observed difference and cannot be used to determine statistical significance.
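As a rough illustration of that limitation, the sketch below computes a log2 fold change for two genes. The function name and the intensity values are hypothetical and chosen only for this example. Both genes get the same fold change even though one has tight replicates and the other does not.

```python
import math

def log2_fold_change(control, treated):
    """Return log2(mean(treated) / mean(control)) for one gene."""
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2(mean_treated / mean_control)

# Hypothetical intensities, three replicates per group (not real data).
tight_control, tight_treated = [100.0, 102.0, 98.0], [200.0, 204.0, 196.0]
noisy_control, noisy_treated = [20.0, 100.0, 180.0], [40.0, 500.0, 60.0]

tight_fc = log2_fold_change(tight_control, tight_treated)
noisy_fc = log2_fold_change(noisy_control, noisy_treated)

# Both genes show a two-fold increase on average, so fold change alone
# cannot tell the reproducible gene from the noisy one.
print(tight_fc, noisy_fc)
```

The fold change uses only the group means, so the spread of the replicates never enters the calculation.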
- Comparison statistics use the replicates to assign a confidence level as to whether the gene is differentially expressed.
- Parametric
- T-test
- Welch’s t-test
- ANOVA (See GeneSifter's July 20 2006 Webinar on this)
- Non-parametric
- Wilcoxon Rank Sum, Kruskal-Wallis, Permutation t-test
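To make the role of replicates concrete, here is a minimal sketch that combines two of the tests listed above: it computes Welch's t statistic and then derives a p-value by exact permutation of the group labels. The function names and the data are assumptions made for illustration, and this is one possible way to set up a permutation t-test, not necessarily the procedure any particular package uses.

```python
import math
from itertools import combinations
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic, which does not assume equal variances."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def permutation_p_value(a, b):
    """Exact two-sided permutation p-value using |Welch t| as the statistic.

    Assumes the values are distinct enough that no relabelled group has
    zero variance.
    """
    pooled = list(a) + list(b)
    observed = abs(welch_t(a, b))
    n = len(pooled)
    hits = total = 0
    # Enumerate every way of assigning len(a) of the pooled values to group a.
    for idx in combinations(range(n), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        if abs(welch_t(group_a, group_b)) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical replicate intensities (not real data).
p_tight = permutation_p_value([100.0, 102.0, 98.0], [200.0, 204.0, 196.0])
p_noisy = permutation_p_value([20.0, 100.0, 180.0], [40.0, 500.0, 60.0])
print(p_tight, p_noisy)
```

With three replicates per group there are only 20 possible relabellings, so the smallest attainable two-sided p-value is 2/20. This is one reason more replicates give a comparison test more power.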
Comparison tests require replicates and use the variability within the replicates to assign a confidence level as to whether differences in gene expression are significant or due to chance.
Correction for multiple testing: Methods for adjusting the p-value from a comparison test based on the number of tests performed. These adjustments help to reduce the number of false positives in an experiment.
- FWER – Adjusts the p-value so that it reflects the chance of at least 1 false positive being found in the list.
- Bonferroni, Holm, Westfall and Young maxT
- FDR – Adjusts the p-value so that it reflects the frequency of false positives in the list.
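The adjustments above can be sketched in a few lines. The example below implements Bonferroni and Holm for FWER. For FDR it uses the Benjamini-Hochberg step-up procedure, which is an assumption here because the text does not name a specific FDR method. The function names and p-values are hypothetical.

```python
def bonferroni(pvals):
    """FWER: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    """FWER: step-down Holm adjustment, less conservative than Bonferroni."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p-values
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted

def benjamini_hochberg(pvals):
    """FDR: Benjamini-Hochberg step-up adjustment (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(raw))
print(holm(raw))
print(benjamini_hochberg(raw))
```

For any input, the Benjamini-Hochberg values are no larger than the Holm values, which in turn are no larger than the Bonferroni values, so at a fixed cutoff the FDR adjustment keeps at least as many genes as either FWER adjustment.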
FWER is more conservative, but the false discovery rate (FDR) is usually acceptable for “discovery” experiments.
Cluster Analysis: Clustering methods are descriptive or exploratory tools that can be used to identify groups within complex datasets. They can be used to identify patterns of gene expression in microarray datasets.
- Visualization: Methods such as hierarchical clustering can help identify patterns in a large dataset.
- Partitioning: This type of cluster analysis can be used to separate data into discrete groups.
- K-means
- PAM
- Silhouettes
Cluster analysis can be used to identify patterns within large datasets and to partition genes based on these patterns.
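As an illustration of partitioning, the following is a minimal k-means sketch that groups genes by their expression profiles. It assumes Euclidean distance and uses the first k profiles as starting centroids, a simplification made so that the example is deterministic. Real analyses usually use random or smarter initialization. The function name and the profiles are hypothetical.

```python
import math

def kmeans(profiles, k, iterations=20):
    """Partition expression profiles into k groups with basic k-means."""
    # Simplifying assumption: start from the first k profiles.
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical log-ratio profiles for four genes across three conditions.
genes = [
    [1.0, 2.0, 3.0],     # rising
    [-1.0, -2.0, -3.0],  # falling
    [1.1, 2.1, 2.9],     # rising
    [-0.9, -2.2, -3.1],  # falling
]
labels, centroids = kmeans(genes, k=2)
print(labels)
```

The number of groups k has to be chosen in advance. Measures such as silhouettes are one way to judge how well a given partition separates the data.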







