Difference between revisions of "Microarray diagnostics"
Revision as of 20:21, 3 November 2006
Summary statistics
Raw data should be on the 2^16 scale, with values ranging from 0 to 65,535. Statistics of interest include the min, max, range, summary, number of NAs, and number of saturated spots for each slide, or for each block within slides. For each experiment, the number of Empty spots or positive/negative controls, taken from the annotation information, may also be of interest.
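As a minimal sketch of these per-slide statistics, the base R code below uses a small simulated intensity matrix (spots in rows, slides in columns) in place of a real channel matrix such as RG$R. The matrix `R`, the helper `slide_stats`, the `block` vector, and the saturation threshold of 65535 are all assumptions made for illustration.

```r
# Simulated foreground intensities: 6 spots (rows) x 2 slides (columns).
# Hypothetical data standing in for a channel matrix such as RG$R.
R <- matrix(c(120, 65535, 300, NA, 45, 65535,
              80, 900, NA, NA, 65535, 10),
            ncol = 2, dimnames = list(NULL, c("slide1", "slide2")))

# Hypothetical helper: summary statistics for one slide (one column).
# The saturation threshold assumes a 16-bit scanner (maximum 65535).
slide_stats <- function(x, saturated = 65535) {
  c(min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE),
    n_na = sum(is.na(x)),
    n_saturated = sum(x >= saturated, na.rm = TRUE))
}

# One column of results per slide.
stats <- apply(R, 2, slide_stats)
print(stats)

# Per-block maximum within one slide, assuming spots 1-3 belong to
# block 1 and spots 4-6 to block 2 (hypothetical layout).
block <- rep(1:2, each = 3)
block_max <- tapply(R[, "slide1"], block, max, na.rm = TRUE)
print(block_max)
```

With real data, the same helper could be applied to each of the foreground and background matrices in turn.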
Statistical measures
Diagnostic measures include the five number summary and quartiles, as well as skewness and kurtosis.
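These measures can be computed in base R, as the sketch below illustrates on a small simulated vector. The data `x` are made up, and the skewness and kurtosis are written out by hand as simple moment-based estimates rather than taken from any particular package, so other definitions (for example bias-corrected ones) will give slightly different values.

```r
# Simulated intensities for one slide (hypothetical data).
x <- c(2, 4, 4, 5, 7, 9, 30)

# Tukey's five number summary: min, lower hinge, median, upper hinge, max.
five <- fivenum(x)

# Quartiles via quantile(), using its default interpolation (type 7).
q <- quantile(x, probs = c(0.25, 0.5, 0.75))

# Moment-based skewness and kurtosis.
# This kurtosis is not "excess" kurtosis: a normal distribution gives about 3.
m <- mean(x)
m2 <- mean((x - m)^2)
skewness <- mean((x - m)^3) / m2^1.5
kurtosis <- mean((x - m)^4) / m2^2

print(five)
print(q)
print(c(skewness = skewness, kurtosis = kurtosis))
```

A clearly positive skewness, as the single large value produces here, is typical of raw intensity data and is one reason such data are usually examined on a log scale.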
Examples using apply
# Ranges
apply(RG$R, 2, range, na.rm=TRUE)
apply(RG$G, 2, range, na.rm=TRUE)

# Maximums
apply(RG$R, 2, max, na.rm=TRUE)
apply(RG$Rb, 2, max, na.rm=TRUE)
apply(RG$G, 2, max, na.rm=TRUE)
apply(RG$Gb, 2, max, na.rm=TRUE)

# Examining backgrounds that are higher than foreground
apply(RG$R < RG$Rb, 2, sum, na.rm=TRUE)
apply(RG$G < RG$Gb, 2, sum, na.rm=TRUE)
The data ranges, and the number of missing values (NAs) introduced, can be checked at several points during background correction and normalization.
Background Correction
# Number of missing values per slide after background subtraction
RGb <- backgroundCorrect(RG, method="subtract")
apply(is.na(RGb$R), 2, sum)
apply(is.na(RGb$G), 2, sum)
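One common way missing values get introduced is that subtracting the background leaves zero or negative intensities, which have no finite logarithm. The base R sketch below illustrates this on simulated matrices; `fg` and `bg` are hypothetical stand-ins for a foreground and background channel, and the plain subtraction is only an approximation of what a background correction step does, not limma's own code.

```r
# Simulated foreground and background: 3 spots (rows) x 2 slides (columns).
# Hypothetical data standing in for matrices such as RG$R and RG$Rb.
fg <- matrix(c(500, 40, 300, 120, 60, 90), ncol = 2,
             dimnames = list(NULL, c("slide1", "slide2")))
bg <- matrix(c(100, 55, 300, 100, 80, 30), ncol = 2)

# Plain background subtraction; some results are zero or negative.
corrected <- fg - bg

# log2 of a non-positive value is NaN or -Inf; treat both as missing.
logged <- suppressWarnings(log2(corrected))
logged[!is.finite(logged)] <- NA

# Number of missing values introduced per slide.
n_missing <- colSums(is.na(logged))
print(n_missing)
```

Comparing these counts before and after each processing step shows where the missing values enter.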
Comparing channels
Differences between the Red and Green channels can be examined by plotting summary statistics for one channel against the other. For example, the code below plots, for each slide, the number of spots in the Green channel against the number in the Red channel where the background is higher than the foreground.
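# Number of spots per slide where background exceeds foreground
r.count <- apply(RG$R < RG$Rb, 2, sum, na.rm=TRUE)
g.count <- apply(RG$G < RG$Gb, 2, sum, na.rm=TRUE)

# Set up an empty plot, then label each point with its slide index
plot(r.count, g.count, type="n")
text(r.count, g.count, seq_along(r.count))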