Linear models for Microarray analysis

From Organic Design wiki
Latest revision as of 21:53, 11 November 2007


Overview of Limma package for R

  • Fits a linear model for each spot (gene)
  • An open source software package for the R programming environment
  • Focus on normalization and statistical analysis of cDNA microarray gene expression data
  • OOP environment for handling information in a microarray experiment
  • The statistical analysis approach can also be used for Affymetrix microarray experiments

Origin

  • Written and maintained by Gordon Smyth, with contributions from WEHI, Melbourne, Australia
  • Software made public at the Australian Genstat Conference, Perth, in Dec 2002
  • Became available in the Bioconductor open source bioinformatics project in April 2003
  • Limma integrates with other Bioconductor software packages, such as affy and marray, using the convert package
  • Active development cycle

File:Limma versions.tiff


Statistical approach

  • Parallel inference for each gene
  • Computationally fast/robust
  • Handles missing information and user-defined flag information
  • Linear models are essentially t-statistics for each spot/gene (signal/noise)
  • Also makes use of between gene information (moderated t-statistics)
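
The difference between an ordinary and a moderated t-statistic can be sketched for a single gene as follows. This is a minimal illustration in Python, not limma code (limma is an R package). The function names are hypothetical, and the prior variance s0_sq and prior degrees of freedom d0 are supplied by hand here, whereas limma estimates them from all genes together. The moderated variance is a weighted average of the prior variance and the gene's own sample variance.

```python
import math
from statistics import mean, variance

def ordinary_t(m_values):
    """One-sample t-statistic for one gene's log-ratios (signal / noise)."""
    n = len(m_values)
    s2 = variance(m_values)  # sample variance with n - 1 degrees of freedom
    return mean(m_values) / math.sqrt(s2 / n)

def moderated_t(m_values, s0_sq, d0):
    """t-statistic whose variance is shrunk towards a prior value s0_sq.

    s0_sq and d0 are assumed inputs in this sketch; they stand for the
    between-gene information that is shared across all spots.
    """
    n = len(m_values)
    d = n - 1
    s2 = variance(m_values)
    # Weighted average of prior variance and this gene's sample variance
    s2_tilde = (d0 * s0_sq + d * s2) / (d0 + d)
    return mean(m_values) / math.sqrt(s2_tilde / n)

# Hypothetical log-ratios for one gene with an unusually small variance
m_values = [1.0, 1.1, 0.9, 1.0]
print(ordinary_t(m_values))
print(moderated_t(m_values, s0_sq=0.05, d0=4))
```

With these made-up numbers the gene's own variance is much smaller than the prior variance, so the moderated statistic comes out smaller than the ordinary one. This is how borrowing information between genes protects against spuriously large t-statistics from genes that happen to have tiny sample variances.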

Object orientated programming environment

File:OOP.png

  • Reading data into R automatically populates the elements of an RGList object
    • R (Red foreground)
    • G (Green foreground)
  • Foreground intensities range ~ 1 → 65535
    • Rb (Red background)
    • Gb (Green background)
  • Background intensities range ~ 1 → 1000
    • genes (Spot annotation list)
    • weights (prior weights given to each spot)
  • MAList data transformation
    • M = log2(R) - log2(G) (minus)
    • A = (log2(R) + log2(G))/2 (add - abundance)
  • Backtransforming to Normalized R', G' values
    • log2(R') = A + M/2
    • log2(G') = A - M/2
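
The M/A transformation and its back-transformation above can be checked with a few lines of arithmetic. The sketch below is plain Python rather than limma code; the function names to_ma and from_ma and the example intensities are made up for illustration.

```python
import math

def to_ma(r, g):
    """Convert red/green intensities to (M, A) on the log2 scale."""
    m = math.log2(r) - math.log2(g)        # log-ratio ("minus")
    a = (math.log2(r) + math.log2(g)) / 2  # average log-intensity ("add")
    return m, a

def from_ma(m, a):
    """Back-transform (M, A) to intensities (R', G')."""
    r = 2 ** (a + m / 2)  # log2(R') = A + M/2
    g = 2 ** (a - m / 2)  # log2(G') = A - M/2
    return r, g

# Hypothetical spot: red intensity 4096, green intensity 1024
m, a = to_ma(4096.0, 1024.0)
print(m, a)           # 2.0 11.0
print(from_ma(m, a))  # (4096.0, 1024.0)
```

An M value of 2 means the red channel is four times as bright as the green channel, and back-transforming unnormalized M and A values recovers the original intensities.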

Advantages using Limma

  • Nice organisational framework for handling cDNA expression data using object orientated programming
  • Flexible methods to handle weighting of poor quality spots
  • Incorporates cDNA normalization routines with a proven track record
  • Robust statistical analysis approach
    • Can analyze cDNA microarray slides possessing large amounts of missing information
  • Analysis methods able to incorporate duplicate spots from either technical or biological sources

Limitations

  • Experiments with different spotting templates cannot easily be combined for analysis
  • Statistical analysis cannot pool information together when there are variable numbers of the same replicated spots
    • Must analyze spot information about the same transcript independently
  • Linear models cannot incorporate error model structures from time series designs

Microarray workshop experiment

  • Dye swap experiment
  • Directed graph

File:Dyeswap.png

FileName  SlideNumber  Cy3    Cy5    Design
BE34.gpr  34           Leaf   Fruit  -1
BE35.gpr  35           Fruit  Leaf    1
BE36.gpr  36           Fruit  Leaf    1
BE37.gpr  37           Leaf   Fruit  -1
  • For the Fruit versus Leaf comparison, the M values of the four slides are multiplied by -1, 1, 1, -1 (the Design column)
  • Determining design questions of interest is the hardest part
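
How the Design multipliers combine the four slides can be sketched for a single gene. The Python below is an illustration only, not limma code: the function name and the M values are invented, and it assumes the simplest model M_i = design_i * beta with one coefficient per gene. If the red channel is Cy5 and the green channel is Cy3 (an assumption not stated above), the coefficient as coded in the table corresponds to log2(Leaf/Fruit).

```python
# Design column for slides 34, 35, 36, 37
design = [-1, 1, 1, -1]

def dye_swap_estimate(m_values, design):
    """Least-squares estimate of beta in M_i = design_i * beta for one gene."""
    num = sum(d * m for d, m in zip(design, m_values))
    den = sum(d * d for d in design)
    return num / den

# Hypothetical M values for one gene on the four slides
m_values = [-2.0, 2.5, 1.5, -2.0]
print(dye_swap_estimate(m_values, design))  # 2.0

# A constant dye bias added to every slide cancels in a balanced dye swap
biased = [m + 0.3 for m in m_values]
print(dye_swap_estimate(biased, design))
```

Because the multipliers sum to zero, any bias that affects the log-ratio equally on every slide drops out of the estimate, which is the reason for running each comparison in both dye orientations.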