Limma analysis
Linear models for microarray analysis
Linear models for microarray analysis (Limma) is a R and Bioconductor package for organising and analysing cDNA and Affymetrix microarray data. It is written by Gordon Smyth at WEHI.
Algorithm details
For a p * n matrix of expression intensities, Limma is fitting p linear models (one for each row). The lmFit function does this by calling functions such as lm.series which use lm.fit in Package:Stats. For cDNA/oligo two spotted technologies the matrix of expression intensities is usually the marix M values with respect to treatments. For Affymetrix single channel arrays the expression intensities are directly analysed comparing two treatments.
y = Xβ + ε
The linear model reduces to effectively estimating average M values using a categorical design matrix. If correlation between rows (spots) is estimated, then the function duplicateCorrelation is called. This fits a reml model on all genes to estimate a rho correlation matrix. A fisher transformation (identical to atanh(x)) is then applied to the rho matrix and an average rho calculating an mean correlation with trim=0.15 by default, which is then backtransformed to give a consensus correlation. This correlation can be utilised in lmFit by calling gls.series which fits a generalized least squares model.
List elements from lmFit
> names(fit) [1] "coefficients" "rank" "assign" "qr" [5] "df.residual" "sigma" "cov.coefficients" "stdev.unscaled" [9] "pivot" "method" "design" "Amean" [13] "genes"
|
Empirical Bayes using conjugate priors is used to calculate moderated t-statistics and B statistics. The wrapper function eBayes for ebayes calculates these statistics for each row (spot).