== See also ==
*[http://www.brc.dcs.gla.ac.uk/~rb106x/publications/RankProducts_FEBS.pdf Publication]
*[http://www.bioconductor.org/packages/1.9/bioc/html/RankProd.html rankProd BioC package]
*[http://www.bioconductor.org/packages/1.9/bioc/vignettes/RankProd/inst/doc/RankProd.pdf rankProd documentation]
*[[RankProduct.R]]
*[[RankProductVignette.R]]
[[Category:Maths]]
Latest revision as of 18:45, 20 December 2010
Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments
Rainer Breitling, Patrick Armengaud, Anna Amtmann, Pawel Herzyk
Overview
Rank Product is a non-parametric statistical method for analysing microarray data. It assumes that, under the null hypothesis, the order of all items is random, so the probability of finding a specific item among the top r of n items in a list is p = r/n.
Multiplying these probabilities over the replicates i leads to the rank product definition:
[math]RP = \prod_i \frac{r_i}{n_i}[/math]
where r_i is the rank of the item in the i-th list and n_i is the total number of items in the i-th list. The smaller the RP value, the smaller the probability that the observed ranks of the item are due to chance. The BioConductor RankProd package produces lists of up- and down-regulated genes, each with a probability measure analogous to false discovery rate control (pfp, the estimated percentage of false positive predictions).
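The definition above can be worked through by hand in base R, without the RankProd package. The following is only a minimal sketch: the matrix M and its values are invented for illustration, and rank 1 is assumed to mean the most strongly up-regulated gene in a replicate.

```r
# Toy matrix of log ratios (M values): 5 genes x 3 replicate arrays.
# The values are invented for illustration only.
M <- matrix(c( 2.1,  1.8,  2.5,
               0.1, -0.2,  0.3,
              -1.5, -1.1, -1.9,
               0.4,  0.9,  0.2,
               1.0,  1.2,  0.8),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("gene", 1:5), paste0("rep", 1:3)))

# Rank the genes within each replicate; rank 1 = largest M value.
ranks <- apply(-M, 2, rank)

# Number of items in each list (the same for every replicate here).
n <- nrow(M)

# RP = prod_i (r_i / n_i), taken across the replicates of each gene.
rp <- apply(ranks / n, 1, prod)

# Smallest rank product first.
sort(rp)
```

In this toy data gene1 is ranked first in all three replicates, so its rank product is (1/5)^3 = 0.008, the smallest possible value for five genes and three replicates.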
Usage
This statistical method can analyse single-channel and two-channel cDNA microarray data. One-class input is assumed to be a matrix of expression ratios (M values); two-class input is assumed to be a matrix of single-channel values. The class labels are given as a vector, e.g.
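    cl <- rep(1, 5)            # one class: five arrays of expression ratios
    cl <- rep(0:1, c(4, 5))    # two classes: four arrays of class 0, five of class 1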
The origin of the data can also be specified: single origin means that all data come from the same laboratory, multiple origin means that they come from several different laboratories, e.g.
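    origin <- rep(1, 5)           # single origin: all five arrays from one laboratory
    origin <- c(1, 1, 1, 2, 2)    # multiple origin: three arrays from laboratory 1, two from laboratory 2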
This is analogous to the inputs into SAM.
Alternative input data for RankProd can be any numeric information, such as M values standardized by each gene's standard deviation, or moderated t-statistics from a limma analysis.
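As an illustration of the first alternative, M values can be standardized by each gene's standard deviation in base R. This is a sketch under assumptions: the matrix M is randomly generated, and the names gene.sd and M.std are introduced here only for the example.

```r
# Toy matrix of M values (genes in rows, arrays in columns); random numbers.
set.seed(1)
M <- matrix(rnorm(20), nrow = 5,
            dimnames = list(paste0("gene", 1:5), NULL))

# Standard deviation of each gene across the arrays.
gene.sd <- apply(M, 1, sd)

# Divide every row (gene) by its own standard deviation.
M.std <- sweep(M, 1, gene.sd, "/")

# Each gene now has unit standard deviation across arrays.
apply(M.std, 1, sd)
```

A matrix such as M.std would then take the place of the raw M values as one-class input, with a class vector such as cl <- rep(1, ncol(M.std)).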
Example
    library(RankProd)
    data(arab)
    # Single origin analysis
    arab.sub <- arab[, which(arab.origin == 1)]
    arab.cl.sub <- arab.cl[which(arab.origin == 1)]
    arab.origin.sub <- arab.origin[which(arab.origin == 1)]
    RP.out <- RP(arab.sub, arab.cl.sub, num.perm = 100, logged = TRUE,
                 na.rm = FALSE, plot = FALSE, rand = 123)
The list elements of RP.out are:
    pfp     = estimated percentage of false positive predictions
    pval    = raw p-values
    RPs     = rank products
    RPrank  = rank of the rank products
    Orirank = list of up/down ranks ("class1 < class2" "class1 > class 2")
    AveFC   = average fold change
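Genes are typically selected by applying a cutoff to pfp. The sketch below does not call RankProd: it uses a mock list, mock.out, with invented values, and it assumes that pfp is a matrix with one column per direction, named as in the Orirank description above. The layout of a real RP.out should be checked before reusing it.

```r
# Mock object imitating the layout described above. All values are invented,
# and the two-column shape of pfp is an assumption, not taken from the package.
mock.out <- list(
  pfp = cbind("class1 < class2"  = c(0.01, 0.40, 0.90),
              "class1 > class 2" = c(0.95, 0.30, 0.02))
)
rownames(mock.out$pfp) <- c("geneA", "geneB", "geneC")

# Keep genes whose estimated percentage of false positives is below the cutoff.
cutoff <- 0.05
lower.in.class1  <- rownames(mock.out$pfp)[mock.out$pfp[, "class1 < class2"]  < cutoff]
higher.in.class1 <- rownames(mock.out$pfp)[mock.out$pfp[, "class1 > class 2"] < cutoff]

lower.in.class1
higher.in.class1
```

With these invented values, geneA is selected in the "class1 < class2" direction and geneC in the opposite direction, while geneB passes neither cutoff.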
Note: RankSum and RankProd2 are very similar functions.