Difference between revisions of "Limma analysis"

From Organic Design wiki

Revision as of 23:08, 19 July 2006


Linear models for microarray analysis

Linear models for microarray analysis (Limma) is an R and Bioconductor package for organising and analysing cDNA and Affymetrix microarray data. It was written by Gordon Smyth at WEHI.

Algorithm details

For a p × n matrix of expression intensities (p rows of genes or spots, n columns of arrays), Limma fits p linear models, one for each row. The lmFit function does this by calling functions such as lm.series, which in turn use lm.fit from the stats package. For two-channel spotted technologies (cDNA/oligo), the matrix of expression intensities is usually the matrix of M values (log-ratios) with respect to treatments. For Affymetrix single-channel arrays, the expression intensities are analysed directly, for example to compare two treatments.

y = Xβ + ε
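
As a rough illustration of fitting one model per row, the following base R sketch applies lm.fit to each row of a simulated matrix. The data, the two-treatment layout and the names M, X, coefs and group_means are assumptions made for this example; it is not the lm.series source code.

```r
# Simulated p x n matrix of M values (rows = genes, columns = arrays).
set.seed(1)
p <- 5
n <- 6
M <- matrix(rnorm(p * n), nrow = p, ncol = n)

# Hypothetical categorical design: two treatments, three arrays each.
treatment <- factor(rep(c("A", "B"), each = 3))
X <- model.matrix(~ 0 + treatment)

# Fit one linear model per row with stats::lm.fit.
coefs <- t(apply(M, 1, function(y) lm.fit(X, y)$coefficients))

# With this design, each coefficient is the mean M value of its group.
group_means <- cbind(rowMeans(M[, 1:3]), rowMeans(M[, 4:6]))
all.equal(unname(coefs), group_means)
```

The final comparison checks that, for a purely categorical design without an intercept, least squares returns the per-treatment average of each row.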

With a categorical design matrix, the linear model effectively reduces to estimating average M values. If the correlation between duplicate rows (spots) is to be estimated, the function duplicateCorrelation is called. This fits a REML model to each gene to obtain a per-gene estimate of the correlation rho. A Fisher transformation (identical to atanh(x)) is applied to the rho estimates, a trimmed mean is taken on the transformed scale (trim=0.15 by default), and the result is back-transformed to give a consensus correlation. This correlation can be used in lmFit, which then calls gls.series to fit a generalized least squares model.
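
The averaging step can be sketched in base R as follows. The rho values are made up for illustration, and the variable names are assumptions; only the transform, trimmed mean and back-transform follow the description above.

```r
# Hypothetical per-gene correlation estimates (made-up values).
rho <- c(0.62, 0.55, 0.71, -0.10, 0.95, 0.58, 0.66, 0.49, 0.60, 0.64)

# Fisher transformation; atanh(x) equals 0.5 * log((1 + x) / (1 - x)).
z <- atanh(rho)

# Trimmed mean on the transformed scale (15% dropped from each end),
# then back-transform to the correlation scale.
consensus <- tanh(mean(z, trim = 0.15, na.rm = TRUE))
consensus
```

Averaging on the atanh scale and trimming the tails keeps a few extreme per-gene estimates (here -0.10 and 0.95) from dominating the consensus value.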

List elements from lmFit

> names(fit)

[1] "coefficients"     "rank"             "assign"           "qr"              
[5] "df.residual"      "sigma"            "cov.coefficients" "stdev.unscaled"  
[9] "pivot"            "method"           "design"           "Amean"           
[13] "genes" 
  • coefficients = estimated M values
  • Amean = estimated unweighted A values
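
To make the two elements above concrete, here is a small base R sketch using the standard two-colour definitions M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2. The intensities are simulated, and treating the coefficient as a row mean assumes an intercept-only design; this is an illustration under those assumptions, not the lmFit implementation.

```r
# Hypothetical red (R) and green (G) intensities: 4 spots on 3 arrays.
set.seed(2)
R <- matrix(2^runif(12, 6, 14), nrow = 4)
G <- matrix(2^runif(12, 6, 14), nrow = 4)

# Log-ratio (M) and average log-intensity (A) for every spot.
M <- log2(R) - log2(G)
A <- (log2(R) + log2(G)) / 2

# With an intercept-only design, the fitted coefficient is the row mean of M;
# an unweighted A summary is the row mean of A.
coefficients <- rowMeans(M)
Amean <- rowMeans(A, na.rm = TRUE)
data.frame(coefficients, Amean)
```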