Difference between revisions of "Rank Product analysis"

From Organic Design wiki

Revision as of 00:58, 6 March 2007

Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments

Rainer Breitling, Patrick Armengaud, Anna Amtmann, Pawel Herzyk

Overview

Rank Product is a non-parametric statistical method for analysing microarray data. It assumes that, under the null hypothesis, the order of all items is random, so the probability of finding a specific item among the top r of n items in a list is p = r/n.

Multiplying these probabilities over i replicates leads to the rank product definition:

[math]RP = \prod_i \frac{r_i}{n_i}[/math]

where [math]r_i[/math] is the rank of the item in the i-th list and [math]n_i[/math] is the total number of items in the i-th list. The smaller the RP value, the less likely it is that the item's observed ranks are due to chance. The Bioconductor RankProd package produces lists of up- and down-regulated genes, each with an estimated percentage of false positives (pfp), a measure analogous to the false discovery rate.
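
The definition above can be checked by hand on a toy data set. The following is a minimal base R sketch, assuming a made-up matrix of log ratios (5 genes by 3 replicates) and ranking so that rank 1 is the most strongly up-regulated gene. It only illustrates the formula and is not how the RankProd package computes its output.

```r
# Toy data (made-up values): 5 genes (rows) x 3 replicates (columns) of log ratios.
m <- matrix(c( 2.1,  1.8,  2.5,
               0.1, -0.2,  0.3,
              -1.5, -1.1, -1.9,
               0.9,  1.2,  0.4,
              -0.3,  0.2, -0.1),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("gene", 1:5), paste0("rep", 1:3)))

# Rank each replicate separately; negating m makes rank 1 the largest value.
r <- apply(-m, 2, rank)
n <- nrow(m)

# Rank product: product over replicates of r_i / n_i.
rp <- apply(r / n, 1, prod)
names(rp) <- rownames(m)

# Genes sorted by increasing rank product (most consistently up-regulated first).
sort(rp)
```

A gene ranked first in every replicate gets RP = (1/5)^3, the smallest possible value for this design, while a gene ranked last in every replicate gets RP = 1.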

Usage

The method can analyse both single-channel and two-channel cDNA microarray data. If one-class data are provided as input, they are assumed to be a matrix of expression ratios (M values). If two-class data are provided, they are assumed to be a matrix of single-channel values. The class labels are given as a vector, e.g.

cl <- rep(1,5)          # one class: five arrays, all labelled 1
cl <- rep(0:1, c(4,5))  # two classes: four arrays labelled 0, then five labelled 1

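The origin of the data can also be specified: single-origin data come from the same laboratory, whereas multiple-origin data come from several different laboratories, e.g.

origin <- rep(1,5)      # single origin: all five arrays from one laboratory
origin <- c(1,1,1,2,2)  # multiple origins: three arrays from laboratory 1, two from laboratory 2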

These inputs are analogous to those used by SAM (Significance Analysis of Microarrays).
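
As a quick sanity check before running an analysis, the class and origin vectors can be cross-tabulated. This is a small sketch in base R, assuming a hypothetical design of nine arrays (four in class 0, five in class 1) collected in two laboratories. The design itself is invented for illustration.

```r
# Hypothetical design: 4 arrays in class 0 followed by 5 arrays in class 1.
cl <- rep(0:1, c(4, 5))

# Hypothetical laboratory of origin for each of the nine arrays.
origin <- c(1, 1, 2, 2, 1, 1, 1, 2, 2)

# Both vectors need one entry per array (one per column of the data matrix).
stopifnot(length(cl) == length(origin))

# Cross-tabulate to confirm that every origin contains both classes.
design <- table(class = cl, origin = origin)
design
```

An origin that contains only one class would show up as a zero in the table.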

Example

library(RankProd)
data(arab)
# Single origin analysis
arab.sub <- arab[, which(arab.origin == 1)] 
arab.cl.sub <- arab.cl[which(arab.origin == 1)] 
arab.origin.sub <- arab.origin[which(arab.origin == 1)] 
RP.out <- RP(arab.sub, arab.cl.sub, num.perm = 100, logged = TRUE,
            na.rm = FALSE, plot = FALSE, rand = 123)

The list elements of RP.out are:

pfp  = estimated percentage of false positive predictions
pval = raw p-values
RPs = Rank Products
RPrank = rank of the Rank Products
Orirank = list of up/down ranks ("class1 < class2"  "class1 > class 2")
AveFC = Average fold change
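
To give a feel for where a pfp-style estimate comes from, the sketch below applies the permutation idea from the rank product paper to simulated data in base R: each replicate is shuffled independently, the rank products are recomputed, and the average number of permuted values at or below each observed RP is divided by that gene's position in the sorted list. The simulated data, the helper rank.product and the variable names are assumptions made for illustration, and the RankProd package's own implementation may differ in detail.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Simulated log ratios: 200 genes x 4 replicates; the first 10 genes are shifted up.
n.genes <- 200
n.reps  <- 4
m <- matrix(rnorm(n.genes * n.reps), nrow = n.genes)
m[1:10, ] <- m[1:10, ] + 3

# Hypothetical helper: rank product for up-regulation (rank 1 = largest value).
rank.product <- function(x) {
  r <- apply(-x, 2, rank)
  apply(r / nrow(x), 1, prod)
}
rp.obs <- rank.product(m)

# Permute each replicate independently and count, for every gene, how many
# permuted rank products are less than or equal to its observed rank product.
n.perm <- 50
count <- numeric(n.genes)
for (k in seq_len(n.perm)) {
  rp.perm <- rank.product(apply(m, 2, sample))
  count <- count + findInterval(rp.obs, sort(rp.perm))
}

# Average count per permutation, divided by the gene's position in the list
# sorted by increasing RP, gives an estimated proportion of false positives.
exp.count <- count / n.perm
pfp.est <- exp.count / rank(rp.obs)

# Indices of the ten genes with the smallest observed rank products.
head(order(rp.obs), 10)
```

Genes with small estimated values are those whose rank products are rarely matched by the shuffled data. With only 50 permutations the estimates are coarse, so this is a teaching aid rather than a substitute for the package output.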


See also