Linear models for Microarray analysis
From Organic Design wiki
Latest revision as of 21:53, 11 November 2007
Overview of Limma package for R
- Fits a linear model for each spot (gene)
- An open source software package for the R programming environment
- Focus on normalization and statistical analysis of cDNA microarray gene expression data
- OOP environment for handling information in a microarray experiment
- Statistical analysis approach can be used for Affymetrix microarray experiments
Origin
- Written and maintained by Gordon Smyth with contributions from WEHI, Melbourne, Australia
- Software made public at the Australian Genstat Conference, Perth, in Dec 2002
- Became available in the Bioconductor open source bioinformatics project April 2003
- Limma integrates with other Bioconductor software packages (affy, marray) using the convert package
- Active development cycle
Statistical approach
- Parallel inference for each gene
- Computationally fast/robust
- Handles missing information and user-defined flag information
- Linear models are essentially t-statistics for each spot/gene (signal/noise)
- Also makes use of between gene information (moderated t-statistics)
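The last two points can be illustrated with a small sketch. The ordinary t-statistic divides a gene's estimated log-ratio by its own standard error; the moderated version first shrinks the gene's variance towards a value shared across genes. Limma estimates that shared (prior) variance and its degrees of freedom from all genes by empirical Bayes. In this sketch they are simply passed in as assumed values, and the function names and numbers are illustrative, not part of limma.

```python
import math

def ordinary_t(coef, s2, stdev_unscaled):
    """Ordinary t-statistic: coefficient divided by its standard error."""
    return coef / (math.sqrt(s2) * stdev_unscaled)

def moderated_t(coef, s2, df, s2_prior, df_prior, stdev_unscaled):
    """Moderated t-statistic: the gene-wise variance s2 is replaced by a
    weighted average of s2 and a prior variance shared across genes."""
    s2_post = (df_prior * s2_prior + df * s2) / (df_prior + df)
    return coef / (math.sqrt(s2_post) * stdev_unscaled)

# Hypothetical gene measured on four arrays (all values are made up).
coef = 1.0            # estimated log2 ratio
s2 = 0.04             # gene-wise residual variance
df = 3                # residual degrees of freedom
stdev_unscaled = 0.5  # 1/sqrt(4) for a mean of four arrays

t_ord = ordinary_t(coef, s2, stdev_unscaled)
# Assumed prior values; limma would estimate these from the whole data set.
t_mod = moderated_t(coef, s2, df, s2_prior=0.1, df_prior=4.0,
                    stdev_unscaled=stdev_unscaled)
print(t_ord, t_mod)
```

A gene with an unusually small variance gets a less extreme statistic after shrinkage, which is why the moderated version is more stable when there are only a few arrays.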
Object orientated programming environment
- Reading data into R automatically populates the elements of an RGList
  - R (Red foreground)
  - G (Green foreground)
  - Foreground intensities range ~ 1 → 65535
  - Rb (Red background)
  - Gb (Green background)
  - Background intensities range ~ 1 → 1000
  - genes (Spot annotation list)
  - weights (prior weights given to each spot)
- MAList data transformation
  - M = log2(R) - log2(G) (minus)
  - A = (log2(R) + log2(G))/2 (add, abundance)
- Backtransforming to normalized R', G' values
  - log2(R') = A + M/2
  - log2(G') = A - M/2
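The M/A transformation and its inverse can be checked with a short worked example. This is a minimal sketch in plain Python rather than limma code; the function names and the intensities are illustrative assumptions.

```python
import math

def rg_to_ma(r, g):
    """Convert red/green intensities to M (log ratio) and A (mean log intensity)."""
    m = math.log2(r) - math.log2(g)
    a = (math.log2(r) + math.log2(g)) / 2
    return m, a

def ma_to_rg(m, a):
    """Back-transform M and A to intensities on the original scale."""
    r = 2 ** (a + m / 2)
    g = 2 ** (a - m / 2)
    return r, g

# Hypothetical spot: red is four times brighter than green.
m, a = rg_to_ma(1024.0, 256.0)   # M = 10 - 8 = 2.0, A = (10 + 8) / 2 = 9.0
r_back, g_back = ma_to_rg(m, a)  # recovers 1024.0 and 256.0
print(m, a, r_back, g_back)
```

Here M is left unchanged between the two steps, so the round trip recovers the input. After normalization M would differ, and the back-transformed R' and G' would be the normalized intensities.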
Advantages using Limma
- Nice organisational framework for handling cDNA expression data using object orientated programming
- Flexible methods to handle weighting of poor quality spots
- Incorporates cDNA normalization routines with a proven track record
- Robust statistical analysis approach
- Can analyze cDNA microarray slides possessing large amounts of missing information
- Analysis methods can incorporate duplicate spots from either technical or biological sources
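The points on spot weighting and missing information can be sketched for the simplest case, an average log-ratio for one gene. This is an illustration only: limma's weighted least squares handles general designs, and the function name and values below are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean of log-ratios, skipping missing values (None)
    and spots whose weight is zero or negative."""
    num = 0.0
    den = 0.0
    for v, w in zip(values, weights):
        if v is None or w <= 0:
            continue
        num += w * v
        den += w
    return num / den if den > 0 else None

# Hypothetical M values for one gene on four slides:
# the second spot is missing and the fourth was flagged as poor quality.
m_values = [1.0, None, 3.0, 8.0]
weights = [1.0, 1.0, 1.0, 0.0]
estimate = weighted_mean(m_values, weights)  # (1.0 + 3.0) / 2 = 2.0
print(estimate)
```

A weight between 0 and 1 would down-weight a doubtful spot instead of discarding it.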
Limitations
- Experiments with different spotting templates cannot easily be combined for analysis
- Statistical analysis cannot pool information together when there are variable numbers of the same replicated spots
- Must analyze spot information about the same transcript independently
- Linear models cannot incorporate error model structures from time series designs
Microarray workshop experiment
- Dye swap experiment
- Directed graph
FileName  SlideNumber  Cy3    Cy5    Design
BE34.gpr  34           Leaf   Fruit  -1
BE35.gpr  35           Fruit  Leaf    1
BE36.gpr  36           Fruit  Leaf    1
BE37.gpr  37           Leaf   Fruit  -1
- Fruit versus Leaf comparison: the M values of the four slides are multiplied by -1, 1, 1, -1 (the Design column)
- Determining design questions of interest is the hardest part
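For a single gene, the dye-swap design above amounts to a one-coefficient linear model, M_i = design_i × beta + error. Multiplying each M value by its design entry aligns the signs so that all four slides measure the same contrast. The sketch below fits that model by least squares in plain Python; the M values are made up for illustration and the function name is not part of limma.

```python
import math

def fit_single_coefficient(m_values, design):
    """Least-squares fit of M_i = design_i * beta for one gene.
    Returns the estimate and its standard error."""
    n = len(m_values)
    sum_d2 = sum(d * d for d in design)
    beta = sum(d * m for d, m in zip(design, m_values)) / sum_d2
    residuals = [m - d * beta for d, m in zip(design, m_values)]
    s2 = sum(e * e for e in residuals) / (n - 1)  # residual variance
    se = math.sqrt(s2 / sum_d2)
    return beta, se

design = [-1, 1, 1, -1]            # slides BE34, BE35, BE36, BE37
m_values = [-1.9, 2.1, 2.0, -2.0]  # hypothetical M values for one gene
beta, se = fit_single_coefficient(m_values, design)
print(beta, se)  # beta is about 2.0: a roughly four-fold difference
```

Because every design entry is 1 or -1, the estimate is simply the average of the sign-corrected M values, and the dye swap cancels any dye bias that adds the same amount to M on every slide.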