Difference between revisions of "Microarray diagnostics"

From Organic Design wiki

Revision as of 01:47, 6 September 2006


Summary statistics

Raw data should be on the 2^16 scale, with intensities in the range 0 to 65,535. Statistics of interest include the min, max, range, summary, number of NA's and number of saturated spots for each slide, or for each block within slides. For each experiment, the number of Empty spots or positive/negative controls may also be of interest; these can be obtained from the annotation information.

Examples using apply

 # Ranges
 apply(RG$R, 2, range, na.rm=TRUE)
 apply(RG$G, 2, range, na.rm=TRUE)
 # Maximums
 apply(RG$R, 2, max, na.rm=TRUE)
 apply(RG$Rb, 2, max, na.rm=TRUE)
 apply(RG$G, 2, max, na.rm=TRUE)
 apply(RG$Gb, 2, max, na.rm=TRUE)
 # Examining backgrounds that are higher than foreground
 apply(RG$R < RG$Rb, 2, sum, na.rm=TRUE)
 apply(RG$G < RG$Gb, 2, sum, na.rm=TRUE)
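
The other per-slide statistics mentioned above (number of NA's, number of saturated spots, per-block summaries) can be counted in the same way. The following is a minimal sketch in base R. The small simulated RG list, the block vector and the saturation value of 65535 are assumptions for illustration only, standing in for a real RGList and its print layout.

```r
# Simulated stand-in for RG$R: 8 spots x 3 slides (made-up values, not real data)
set.seed(1)
RG <- list(R = matrix(sample(0:60000, 24, replace = TRUE), nrow = 8,
                      dimnames = list(NULL, c("slide1", "slide2", "slide3"))))
RG$R[1, 1] <- NA        # one missing value on slide1
RG$R[2:3, 2] <- 65535   # two saturated spots on slide2

# Number of missing values per slide
n.na <- apply(is.na(RG$R), 2, sum)

# Number of saturated spots per slide (assumes saturation at 2^16 - 1)
n.sat <- apply(RG$R >= 65535, 2, sum, na.rm = TRUE)

# Maximum per block within each slide; 'block' is a hypothetical block index
block <- rep(1:2, each = 4)
block.max <- apply(RG$R, 2, function(x) tapply(x, block, max, na.rm = TRUE))
```

Here block.max has one row per block and one column per slide, so an unusually bright or dim block stands out when reading across a row.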

The data ranges and the number of introduced missing values (NA's) can be checked at several points during background correction and normalization.

Background Correction

 # Number of missing values per array after background correction
 RGb <- backgroundCorrect(RG, method="subtract")
 apply(is.na(RGb$R), 2, sum)
 apply(is.na(RGb$G), 2, sum)
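
One way missing values get introduced is that subtracting a background that exceeds the foreground gives a negative intensity, and the log2 of a negative number is NaN in R, which is.na() counts as missing. A minimal sketch in base R, using made-up intensities (the vectors fg and bg are illustrative and are not output from any package):

```r
# Made-up foreground and background intensities for four spots (illustrative)
fg <- c(500, 120, 80, 3000)
bg <- c(100, 150, 80, 200)

# Subtractive background correction: the second spot goes negative, the third to zero
corrected <- fg - bg

# log2 of a negative value is NaN (with a warning); log2(0) is -Inf, which is not NA
logged <- suppressWarnings(log2(corrected))

n.missing <- sum(is.na(logged))    # spots that became missing after the log
n.nonpos  <- sum(corrected <= 0)   # spots with non-positive corrected intensity
```

Counting non-positive corrected intensities as well as NA's catches zero-intensity spots, which turn into -Inf rather than NA after the log transformation.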

Comparing channels

Differences between the Red and Green channels can be examined by plotting summary statistics against each other. For example, the code below plots, for each array, the number of spots where the background is higher than the foreground in the Green channel versus the same count in the Red channel, labelling each point with its array index.

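 nR <- apply(RG$R < RG$Rb, 2, sum, na.rm=TRUE)  # Red: spots with background > foreground
 nG <- apply(RG$G < RG$Gb, 2, sum, na.rm=TRUE)  # Green: spots with background > foreground
 plot(nR, nG, type="n", xlab="Red: background > foreground", ylab="Green: background > foreground")
 text(nR, nG, seq_along(nR))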