Revision as of 23:27, 19 July 2006

Linear models for microarray analysis

Linear models for microarray analysis (Limma) is an R and Bioconductor package for organising and analysing cDNA and Affymetrix microarray data. It is written by Gordon Smyth at WEHI.

Algorithm details

For a p × n matrix of expression intensities, Limma fits p linear models, one for each row. The lmFit function does this by calling functions such as lm.series, which use lm.fit from the stats package. For two-colour spotted technologies (cDNA or oligo), the matrix of expression intensities is usually the matrix of M values (log-ratios) with respect to treatments. For Affymetrix single-channel arrays, the expression intensities are analysed directly, comparing two treatments.

y = Xβ + ε
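As a minimal sketch of the row-wise fitting described above, the following base R code fits the same design matrix to every row of a small expression matrix with lm.fit. The data, the gene names and the two-group design are made up for illustration; lmFit itself does more (weights, missing values, bookkeeping) than this loop.

```r
# Toy single-channel data: p = 3 genes (rows), n = 4 arrays (columns).
# Arrays 1-2 are control, arrays 3-4 are treated (hypothetical layout).
E <- rbind(
  geneA = c(7.0, 7.2, 9.0, 9.2),
  geneB = c(5.0, 5.4, 5.1, 5.3),
  geneC = c(10.0, 9.6, 8.0, 8.4)
)

# Design matrix X: an intercept (control mean) and a treatment effect.
X <- cbind(Intercept = 1, Treat = c(0, 0, 1, 1))

# Fit y = X beta + epsilon separately for each row of E.
coefs <- t(apply(E, 1, function(y) stats::lm.fit(X, y)$coefficients))

print(coefs)
```

Each row of coefs holds the estimated β for one gene: the control mean and the treated-minus-control difference.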

With a categorical design matrix, the linear model effectively reduces to estimating average M values. If the correlation between rows (spots) is to be estimated, the function duplicateCorrelation is called. It fits a REML model to every gene to estimate a correlation rho for each one. A Fisher transformation (identical to atanh(x)) is applied to these rho values, their trimmed mean is taken (trim=0.15 by default), and the result is back-transformed to give a consensus correlation. This correlation can be passed to lmFit, which then calls gls.series to fit a generalized least squares model.
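The averaging step can be sketched in base R as follows. The per-gene correlations here are invented numbers, not output from duplicateCorrelation, and the REML fitting that would produce them is not shown.

```r
# Hypothetical per-gene correlation estimates (one rho per gene).
rho <- c(0.62, 0.55, 0.70, -0.10, 0.95, 0.58, 0.66, 0.60, 0.49, 0.64)

# Fisher transformation of each correlation.
z <- atanh(rho)

# Trimmed mean on the transformed scale; trim = 0.15 drops the lowest and
# highest 15% of values (here one value at each end) before averaging.
z.mean <- mean(z, trim = 0.15)

# Back-transform to the correlation scale to get the consensus value.
consensus <- tanh(z.mean)

print(consensus)
```

Averaging on the atanh scale keeps the result inside (-1, 1), and the trimming limits the influence of outlying genes such as the -0.10 and 0.95 entries above.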

List elements from lmFit

> names(fit)
[1] "coefficients"     "rank"             "assign"           "qr"              
[5] "df.residual"      "sigma"            "cov.coefficients" "stdev.unscaled"  
[9] "pivot"            "method"           "design"           "Amean"           
[13] "genes"

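coefficients = estimated M values (one per column of the design matrix)
qr = QR decomposition of the design matrix
df.residual = residual degrees of freedom for each gene
Amean = unweighted average A value (mean log-intensity) for each gene
stdev.unscaled = unscaled standard deviations of the coefficients; multiplied by sigma they give the standard errors
method = model fitting method used
design = the design matrix X used
genes = list of gene names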
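To show how some of these elements relate to each other, here is a small base R sketch for a single gene. It computes quantities analogous to coefficients, df.residual, sigma, cov.coefficients and stdev.unscaled by hand; the data and design are made up, and this is an illustration of the algebra rather than limma's own code.

```r
# One gene measured on 4 arrays (toy values) and a two-group design.
y <- c(7.0, 7.2, 9.0, 9.2)
X <- cbind(Intercept = 1, Treat = c(0, 0, 1, 1))

f <- stats::lm.fit(X, y)

coefficients <- f$coefficients
df.residual  <- f$df.residual                       # n minus rank of X
sigma        <- sqrt(sum(f$residuals^2) / df.residual)

# Unscaled covariance of the coefficients, (X'X)^{-1}.
cov.coefficients <- solve(crossprod(X))
stdev.unscaled   <- sqrt(diag(cov.coefficients))

# Ordinary standard errors and t-statistics follow from these pieces.
std.error  <- stdev.unscaled * sigma
t.ordinary <- coefficients / std.error

print(rbind(coefficients, stdev.unscaled, std.error, t.ordinary))
```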

Empirical Bayes with conjugate priors is used to calculate moderated t-statistics and B statistics. The function eBayes, a wrapper for ebayes, calculates these statistics for each row (spot).

List elements from ebayes

> names(ebayes(fit))
[1] "df.prior"  "s2.prior"  "s2.post"   "t"         "p.value"   "var.prior"
[7] "lods"

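df.prior = d_0, the prior degrees of freedom
s2.prior = (s_0)², the prior variance
s2.post = posterior variance for each gene, which shrinks the sample variance (s_g)² towards (s_0)²
t = moderated t-statistics
p.value = p-values for the moderated t-statistics
lods = log-odds of differential expression (B statistics)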
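The relationship between these quantities can be sketched in base R. The prior values df.prior and s2.prior below are hypothetical constants chosen for illustration (limma estimates them from all genes), and the per-gene inputs are toy numbers. The B statistic is not reproduced here.

```r
# Per-gene results as they might come from a row-wise fit (toy numbers).
coefficient    <- c(2.0, 0.0, -1.6)
sigma          <- c(0.14, 0.28, 0.28)   # residual standard deviations
df.residual    <- c(2, 2, 2)
stdev.unscaled <- c(1, 1, 1)

# Hypothetical prior values; in limma these are estimated from the data.
df.prior <- 4
s2.prior <- 0.05

# Posterior variance: a weighted average of prior and sample variances.
s2.post <- (df.prior * s2.prior + df.residual * sigma^2) /
  (df.prior + df.residual)

# Moderated t-statistic: s2.post replaces the sample variance sigma^2.
t.mod <- coefficient / (stdev.unscaled * sqrt(s2.post))

# Two-sided p-values on df.prior + df.residual degrees of freedom.
p.value <- 2 * stats::pt(-abs(t.mod), df = df.prior + df.residual)

print(cbind(s2.post, t.mod, p.value))
```

Because s2.post always lies between the sample variance and the prior variance, genes with very small sample variances no longer produce extreme t-statistics by chance alone.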

