BioConductor/R framework

From Organic Design wiki

Latest revision as of 21:31, 12 November 2006


What is BioConductor? (http://www.bioconductor.org)

  • BioConductor is an open source software development project
  • Provides tools for analysis and comprehension of genomic data
  • Used extensively for Affymetrix and cDNA microarray technologies
  • The project started in the autumn of 2001
  • Includes 23 core collaborating developers
  • Project growth:
    • v1.0: May 2nd, 2002, 15 packages
    • v1.1: November 18th, 2002, 20 packages
    • v1.2: May 28th, 2003, 30 packages
    • v1.3: October 29th, 2003, 49 packages
    • v1.4: May 18th, 2004, 82 packages
    • v1.5: November 1st, 2004, 98 packages

BioConductor Goals

The broad goals of the project are:

  • To enable sound and powerful statistical analyses in genomics
  • To provide a computing platform that allows the rapid design and deployment of high-quality software
  • To develop a computing environment for both biologists and statisticians
  • To promote high-quality dynamic documentation and reproducible research
  • Using LaTeX, the Sweave system and Tcl/Tk to deliver interactive step-by-step PDF tutorials, e.g.
library(tkWidgets)   # Load the tkWidgets package
vExplorer()          # Open the interactive vignette explorer
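
To illustrate what "dynamic documentation" means here, the following is a minimal sketch of the Sweave step, assuming a tiny made-up document (the file contents and variable names are illustrative, not taken from any BioConductor tutorial). Sweave runs the R code chunk embedded in the LaTeX source and writes a .tex file containing both the code and its output.

```r
# Write a tiny, hypothetical Sweave source file: LaTeX text plus one R code chunk.
rnw <- tempfile(fileext = ".Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of three values:",
  "<<>>=",
  "x <- c(1, 2, 3)",
  "mean(x)",
  "@",
  "\\end{document}"
), rnw)

# Sweave writes its .tex output into the working directory,
# so work in the temporary directory and switch back afterwards.
old_wd <- setwd(tempdir())
utils::Sweave(rnw, quiet = TRUE)
setwd(old_wd)

# The output file has the same base name with a .tex extension.
tex_file  <- file.path(tempdir(), sub("\\.Rnw$", ".tex", basename(rnw)))
tex_lines <- readLines(tex_file)
print(tex_lines)
```

The resulting .tex file can then be compiled to PDF with LaTeX. Because the output is regenerated from the code each time, the document and the analysis cannot drift apart.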

File:Sweave2.png


Object-oriented class/method design (OOP)

  • Organized approach to handling large amounts of experimental data
  • Class structure encapsulates the data required for microarray analysis → object
  • Allows efficient representation and manipulation (including subsetting) of data from many microarray slides in an experiment
  • A method is a function that performs an action on data (objects) throughout analysis
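
The points above can be sketched with R's S4 class system. This is only an illustration under stated assumptions: the class ToyArraySet, its slots, and the generic slideMeans are hypothetical names invented for this example and are not BioConductor classes.

```r
library(methods)  # S4 classes and methods

# Hypothetical class: an expression matrix (genes x slides) plus one label per slide.
setClass("ToyArraySet",
         representation(exprs = "matrix", slides = "character"))

# A method is a function that performs an action on the object.
setGeneric("slideMeans", function(object) standardGeneric("slideMeans"))
setMethod("slideMeans", "ToyArraySet", function(object) {
  colMeans(object@exprs)
})

# Subsetting method: keeps the data and the slide labels in step.
setMethod("[", "ToyArraySet", function(x, i, j, ..., drop = FALSE) {
  if (missing(i)) i <- seq_len(nrow(x@exprs))  # default: all genes
  if (missing(j)) j <- seq_len(ncol(x@exprs))  # default: all slides
  new("ToyArraySet",
      exprs  = x@exprs[i, j, drop = FALSE],
      slides = x@slides[j])
})

# Made-up data: three genes measured on two slides.
toy <- new("ToyArraySet",
           exprs  = matrix(c(1, 2, 3, 4, 5, 6), nrow = 3,
                           dimnames = list(c("g1", "g2", "g3"), c("s1", "s2"))),
           slides = c("control", "treated"))

print(slideMeans(toy))     # mean expression per slide
firstSlide <- toy[, 1]     # subset to the first slide only
print(firstSlide@slides)   # the label follows the data
```

The design choice worth noting is that subsetting the object subsets the data and its annotation together, which is the main practical benefit of encapsulating an experiment in one class.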

Advantages

  • Newest cutting-edge statistical methods available
  • Modern programming language
  • Powerful graphical tools available
  • It's freely available

Disadvantages

  • Steep learning curve
  • Need to have experience programming in the R programming environment (http://www.r-project.org)
  • The dreaded command line
  • Like all software, it has bugs

Accessing BioConductor

  • BioConductor tools are accessed using the R programming language
  • R is a programming environment for statistical computing and graphics
  • Initially written by Robert Gentleman and Ross Ihaka (Auckland University)
  • Download R from a Comprehensive R Archive Network (CRAN) mirror (http://cran.stat.auckland.ac.nz)
  • Install R (available for Unix, Windows, and Mac OS X)
  • R version 2.2.1 was released on 2005-12-20
  • R is the environment used to design and distribute software, which can be installed from:
    • Locally downloaded files
    • Via the internet e.g. commands in R:
Sys.putenv("http_proxy"="http://proxy.hort.net.nz:8080")   # Setting proxy variable
source("http://www.bioconductor.org/getBioC.R")            # Downloading installation script
getBioC()                                                  # Running script
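
Before downloading anything it can help to confirm that the proxy variable was actually set. The sketch below assumes a later R release, where the equivalent of Sys.putenv is named Sys.setenv. The proxy address is a hypothetical placeholder, not a real server.

```r
# Hypothetical proxy address; replace it with the one for your own network.
proxy <- "http://proxy.example.org:8080"

old <- Sys.getenv("http_proxy", unset = NA)   # remember any existing setting
Sys.setenv(http_proxy = proxy)                # set the proxy for this R session

current <- Sys.getenv("http_proxy")           # read the value back to confirm it
print(current)

# Restore the previous state so the example leaves no trace.
if (is.na(old)) Sys.unsetenv("http_proxy") else Sys.setenv(http_proxy = old)
```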

Installing vs loading packages

  • Packages only need to be installed once onto a computer
  • Packages must be loaded with each new R session
  • The R function library() is used to load packages, e.g. the command in R:
library(limma)     # Loads the limma package (it must already be installed)
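
The difference between the two states can be checked from within R. This is a small sketch that uses the stats package, which ships with R, as a stand-in so that it runs without any BioConductor package being present. The variable names are illustrative.

```r
pkg <- "stats"   # a package shipped with R, used so the sketch runs anywhere

# Installed: the package files exist in one of the library directories on disk.
is_installed <- pkg %in% rownames(installed.packages())

# Loaded (attached): the package is on the search path of the current session.
library(pkg, character.only = TRUE)
is_attached <- paste("package", pkg, sep = ":") %in% search()

print(c(installed = is_installed, attached = is_attached))
```

A package that is installed but not attached stays on disk between sessions; only the library() call has to be repeated each time R is started.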

Documentation and Help

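  • R manuals (http://cran.stat.auckland.ac.nz/manuals.html) and tutorials (http://cran.stat.auckland.ac.nz/other-docs.html) are available from the R website (http://cran.stat.auckland.ac.nz) or on-line in an R session
  • The R on-line help system provides detailed, high-quality documentation, available in text, HTML, PDF, and LaTeX formats
  • Frequently asked questions for BioConductor / R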
  • Email news lists for BioConductor / R
Note: Read the posting guides before submitting questions
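
As a brief sketch of the help system from inside an R session, the calls below use only functions from base R and the utils package. The topics looked up are arbitrary examples.

```r
# Locate the help page for a function; printing the result displays the page.
topic <- help("mean")
print(class(topic))

# Find objects on the search path by name pattern,
# here everything whose name starts with "read."
matches <- apropos("^read\\.")
print(matches)

# List the vignettes (longer tutorial-style documents) that a package provides.
vignette(package = "utils")
```

In an interactive session, typing ?mean is a shortcut for help("mean"), and the same vignette() call works for any installed BioConductor package.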