Difference between revisions of "Introduction to Microarray analysis"

From Organic Design wiki

Revision as of 22:13, 14 March 2006

Overview of experimental process

File:Expt2.tiff

  • Competitive hybridization to spotted oligo/cDNA transcripts
  • Interested in genes that change between treatment conditions
differential expression versus equivalent expression

Statistical analysis process

File:Overview2.tiff

  • Raw data (GPR file format)
http://www.moleculardevices.com/pages/software/gn_gpr_format_history.html
  • Each GPR intensity file is typically >8 megabytes
  • Each TIFF image file is typically >30 megabytes
  • A microarray experiment consists of several to many slides
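
As a rough illustration of working with the raw data, the sketch below reads a GPR-style file in Python. It assumes the tab-delimited ATF layout commonly described for GPR files: a signature line, a line giving the number of header records and data columns, quoted `key=value` header records, a column-title row, then one row per spot. The function name `read_gpr`, the sample text and its values are made up for illustration and are not taken from a real slide.

```python
import csv
import io


def read_gpr(handle):
    """Parse an ATF-style GPR stream into (header dict, list of row dicts).

    Assumes the layout described above; real files may differ.
    """
    reader = csv.reader(handle, delimiter="\t")
    signature = next(reader)
    if signature[0].strip() != "ATF":
        raise ValueError("not an ATF-style file")
    # Second line: number of header records, number of data columns
    n_header, n_columns = (int(x) for x in next(reader)[:2])
    header = {}
    for _ in range(n_header):
        key, _sep, value = next(reader)[0].partition("=")
        header[key] = value
    columns = next(reader)
    if len(columns) != n_columns:
        raise ValueError("column count does not match the file header")
    # One dict per spot, keyed by column title; values are left as strings
    rows = [dict(zip(columns, row)) for row in reader if row]
    return header, rows


# Hypothetical two-spot file, for illustration only
sample = (
    'ATF\t1.0\n'
    '2\t4\n'
    '"Type=GenePix Results 3"\n'
    '"Note=made-up example"\n'
    '"Name"\t"ID"\t"F635 Median"\t"F532 Median"\n'
    '"spot1"\t"g001"\t1200\t300\n'
    '"spot2"\t"g002"\t450\t460\n'
)
header, rows = read_gpr(io.StringIO(sample))
```

Because each real file is several megabytes and an experiment has many slides, in practice one would probably stream rows rather than build a full list per file.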

Statistical issues

  • Classical statistical methods were developed for n >> p
n observations, p variables
  • Gene expression data have n << p
Thousands of measured genes (p)
Small number of biological replicate slides (n)
  • Gene expression data can be highly correlated
groups of genes are regulated in the same way
  • Data are not normally distributed
intensity data are highly skewed, so they are log transformed
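
The log transform mentioned above can be sketched in a few lines of Python. This assumes two-channel (red/green) intensities that are already positive, uses base 2 (a common but not mandatory choice), and also reports the average log intensity. The function name `log_ratio` and the intensity values are hypothetical.

```python
import math


def log_ratio(red, green):
    """Return (log2 ratio, mean log2 intensity) for one spot."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive before taking logs")
    log_r = math.log2(red)
    log_g = math.log2(green)
    # Difference of logs = log of the red/green ratio
    return log_r - log_g, 0.5 * (log_r + log_g)


# Hypothetical (red, green) intensities for three spots
spots = [(1024.0, 256.0), (512.0, 512.0), (128.0, 2048.0)]
ma = [log_ratio(r, g) for r, g in spots]
```

On the log scale, a four-fold increase and a four-fold decrease become +2 and -2, so up- and down-regulation are treated symmetrically.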

File:Graph channels.tiff


Analysis wish list

  • Ideally would like unambiguous interpretation of results
  • Large amounts of data to analyse can be overwhelming and make interpretation subjective
  • Independent reproducibility of results by another colleague
Keep a record (log) of what was done

Analysis aim

  • Easier to rank genes in order of evidence of differential expression than it is to select a specific cutoff
  • If we do select a cutoff, a False Discovery Rate (FDR) cutoff is usually used
The FDR threshold is the expected proportion of genes in the selected list that are false positives
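
One way to combine ranking with an FDR cutoff is the Benjamini-Hochberg procedure. The page does not say which FDR method is intended, so treat this as an assumed example rather than the method used here. The gene names, p-values and the 0.05 threshold below are made up.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, smallest first
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values
pvals = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039,
         "geneD": 0.041, "geneE": 0.6}

# Rank genes by evidence of differential expression (smallest p first)
names = sorted(pvals, key=pvals.get)
adj = bh_adjust([pvals[g] for g in names])

# Keep genes whose adjusted p-value is at or below the chosen FDR cutoff
selected = [g for g, q in zip(names, adj) if q <= 0.05]
```

The ranking in `names` does not depend on the cutoff; only the length of `selected` changes when the threshold is moved.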

ADD PICTURE