Difference between revisions of "Limma analysis"

From Organic Design wiki

Revision as of 23:27, 19 July 2006


Linear models for microarray analysis

Linear models for microarray analysis (Limma) is an R and Bioconductor package for organising and analysing cDNA and Affymetrix microarray data. It is written by Gordon Smyth at WEHI.

Algorithm details

For a p × n matrix of expression intensities, Limma fits p linear models, one for each row. The lmFit function does this by calling functions such as lm.series, which use lm.fit from the stats package. For two-colour spotted cDNA/oligo technologies, the matrix of expression intensities is usually the matrix of M values (log-ratios) with respect to treatments. For Affymetrix single-channel arrays, the expression intensities are analysed directly, comparing two treatments.

y = Xβ + ε
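The row-wise fitting described above can be sketched in base R. This is only an illustration on simulated data, not Limma's own code: the matrix y, the two-group design and the helper fit_row are all made up for the example, and only lm.fit from the stats package is taken from the text.

```r
# Simulated expression matrix: p genes (rows) by n arrays (columns).
# All values are made up for illustration.
set.seed(1)
p <- 5
n <- 6
design <- cbind(Intercept = 1, Treatment = rep(c(0, 1), each = 3))
y <- matrix(rnorm(p * n), nrow = p, ncol = n)

# Hypothetical helper: fit y_i = X beta + epsilon for a single row
# using stats::lm.fit and keep the quantities of interest.
fit_row <- function(yi) {
  fit <- lm.fit(x = design, y = yi)
  c(fit$coefficients,
    df.residual = fit$df.residual,
    sigma = sqrt(sum(fit$residuals^2) / fit$df.residual))
}

# One linear model per row; the result has one row per gene.
row_fits <- t(apply(y, 1, fit_row))
print(row_fits)
```

With this design the Treatment coefficient of each row is simply the difference between the two group means, which is the sense in which the model estimates average values per treatment.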

The linear model effectively reduces to estimating average M values using a categorical design matrix. If the correlation between rows (spots) is to be estimated, the function duplicateCorrelation is called. This fits a REML model to all genes to estimate the correlations rho. A Fisher transformation (identical to atanh(x)) is then applied to the rho values, their trimmed mean is calculated (trim=0.15 by default), and the result is back-transformed to give a consensus correlation. This correlation can be used in lmFit, which then calls gls.series to fit a generalized least squares model.
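The transform, trimmed mean and back-transform steps can be sketched in base R as follows. The rho values are hypothetical and stand in for the correlations that the REML fits would produce; this is not a call to duplicateCorrelation itself.

```r
# Hypothetical correlations, one per gene (made-up values).
rho <- c(0.10, 0.35, 0.42, 0.38, 0.90, -0.20, 0.40)

# Fisher transformation, identical to atanh(x).
z <- atanh(rho)

# Trimmed mean on the transformed scale, then back-transform with tanh.
consensus <- tanh(mean(z, trim = 0.15))
print(consensus)
```

Averaging on the atanh scale and trimming the extremes makes the consensus value less sensitive to a few genes with very high or negative estimated correlations.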

List elements from lmFit

> names(fit)
[1] "coefficients"     "rank"             "assign"           "qr"              
[5] "df.residual"      "sigma"            "cov.coefficients" "stdev.unscaled"  
[9] "pivot"            "method"           "design"           "Amean"           
[13] "genes" 
  • coefficients = estimated M values
  • qr = QR decomposition of the design matrix
  • df.residual = residual degrees of freedom for each gene
  • Amean = unweighted average A value for each gene
  • stdev.unscaled = unscaled standard deviations of the coefficients (multiply by sigma to obtain standard errors)
  • method = model fitting method used
  • design = the design matrix X used
  • genes = list of gene names
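How sigma and stdev.unscaled combine into an ordinary standard error can be checked for a single gene in base R. This is a sketch under assumptions: the data and design are simulated, and it covers an unweighted least squares fit only.

```r
# Made-up data for one gene on six arrays, two groups of three.
set.seed(2)
design <- cbind(Intercept = 1, Treatment = rep(c(0, 1), each = 3))
yi <- rnorm(6)

fit <- lm.fit(x = design, y = yi)

# Residual standard deviation for this gene.
sigma <- sqrt(sum(fit$residuals^2) / fit$df.residual)

# Unscaled standard deviations: square roots of the diagonal of (X'X)^-1.
stdev_unscaled <- sqrt(diag(solve(crossprod(design))))

# Ordinary standard errors and t-statistics.
std_error <- stdev_unscaled * sigma
t_ordinary <- fit$coefficients / std_error

# Cross-check against the coefficient table from lm().
reference <- summary(lm(yi ~ 0 + design))$coefficients
print(cbind(std_error, reference[, "Std. Error"]))
```

Because stdev.unscaled depends only on the design, the variance estimate can be replaced by a different one without refitting the model.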

Empirical Bayes with conjugate priors is used to calculate moderated t-statistics and B statistics. The function eBayes, a wrapper for ebayes, calculates these statistics for each row (spot).

List elements from ebayes

> names(ebayes(fit))
[1] "df.prior"  "s2.prior"  "s2.post"   "t"         "p.value"   "var.prior"
[7] "lods"
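  • df.prior = d<sub>0</sub> (prior degrees of freedom)
  • s2.prior = (s<sub>0</sub>)<sup>2</sup> (prior variance)
  • s2.post = (s<sub>g</sub>)<sup>2</sup> (posterior variance for each gene)
  • t = moderated t-statistics
  • p.value = p-values for the moderated t-statistics
  • lods = log-odds B statistics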
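The relationship between these elements can be sketched in base R. All numbers below are hypothetical; in particular df_prior and s2_prior are fixed by hand here, whereas Limma estimates them from the data across all genes. The B statistics (lods) are not computed in this sketch.

```r
# Hypothetical per-gene quantities, as a linear model fit might provide them.
coef           <- c(1.2, -0.3, 0.8)    # estimated coefficients (M values)
stdev_unscaled <- rep(sqrt(2 / 3), 3)  # unscaled standard deviations
sigma          <- c(0.5, 0.9, 0.2)     # residual standard deviations
df_residual    <- rep(4, 3)            # residual degrees of freedom

# Hypothetical prior values (assumed, not estimated here).
df_prior <- 3
s2_prior <- 0.25

# Posterior variance: weighted average of the prior and the sample variance.
s2_post <- (df_prior * s2_prior + df_residual * sigma^2) /
  (df_prior + df_residual)

# Moderated t-statistics and two-sided p-values on the augmented
# degrees of freedom.
t_mod   <- coef / (stdev_unscaled * sqrt(s2_post))
p_value <- 2 * pt(-abs(t_mod), df = df_prior + df_residual)

print(data.frame(s2_post, t_mod, p_value))
```

Genes with a very small sample variance have it pulled up towards the prior value, so their moderated t-statistics are less extreme than the ordinary ones.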