Difference between revisions of "BioConductor/R framework"
From Organic Design wiki
Latest revision as of 21:31, 12 November 2006
What is BioConductor? (http://www.bioconductor.org)
- BioConductor is an open source software development project
- Provides tools for analysis and comprehension of genomic data
- Used extensively for Affymetrix and cDNA microarray technologies
- The project started in the autumn of 2001
- Includes 23 core collaborating developers
- Project Growth
- v1.0: May 2nd, 2002, 15 packages
- v1.1: November 18th, 2002, 20 packages
- v1.2: May 28th, 2003, 30 packages
- v1.3: October 29th, 2003, 49 packages
- v1.4: May 18th, 2004, 82 packages
- v1.5: November 1st, 2004, 98 packages
BioConductor Goals
The broad goals of the project are:
- To enable sound and powerful statistical analyses in genomics
- To provide a computing platform that allows the rapid design and deployment of high-quality software
- To develop a computing environment for both biologists and statisticians
- Promote high-quality dynamic documentation and reproducible research
- Using LaTeX, the Sweave system and Tcl/Tk to deliver interactive, step-by-step PDF tutorials, e.g.
    library(tkWidgets)
    vExplorer()
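The dynamic documentation goal can be illustrated with Sweave, which runs the R code embedded in a LaTeX source file and weaves the results into the output. What follows is only a minimal sketch: the file name example.Rnw and its contents are made up for illustration, and only the Sweave() function that ships with R is used.

```r
# A tiny Sweave source: LaTeX text plus one inline expression and one code chunk.
# The file name and contents are hypothetical.
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of 1 to 10 is \\Sexpr{mean(1:10)}.",
  "<<summary-chunk, echo=TRUE>>=",
  "summary(c(2.1, 3.4, 5.6))",
  "@",
  "\\end{document}"
)

work_dir <- tempdir()
rnw_path <- file.path(work_dir, "example.Rnw")
writeLines(rnw_lines, rnw_path)

# Sweave writes example.tex into the current working directory,
# so switch to the temporary directory while weaving.
old_wd <- setwd(work_dir)
utils::Sweave(rnw_path, quiet = TRUE)
setwd(old_wd)

# The woven LaTeX file now contains the evaluated results.
tex_path <- file.path(work_dir, "example.tex")
tex_lines <- readLines(tex_path)
```

Running LaTeX on the resulting .tex file would then produce the PDF; that step is left out here because it needs a LaTeX installation.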
Object oriented class method design (OOP)
- Organized approach to handling large amounts of experimental data
- Class structure encapsulates the data required for microarray analysis → object
- Allows efficient representation and manipulation (including subsetting) of data from many microarray slides in an experiment
- A method is a function that performs an action on data (objects) throughout analysis
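The class and method idea above can be sketched with R's S4 system. This is an illustration only, not the design of any BioConductor package: the class name MicroarrayData, its slots, and the slideMeans generic are all hypothetical.

```r
library(methods)  # S4 classes and generics

# Hypothetical class: one object keeps the intensities and slide labels together.
setClass(
  "MicroarrayData",
  representation(
    intensities = "matrix",     # rows = genes, columns = slides
    slide_names = "character"
  ),
  validity = function(object) {
    if (ncol(object@intensities) != length(object@slide_names)) {
      return("need exactly one slide name per column of 'intensities'")
    }
    TRUE
  }
)

# A generic and its method: a function that performs an action on the object.
setGeneric("slideMeans", function(object) standardGeneric("slideMeans"))
setMethod("slideMeans", "MicroarrayData", function(object) {
  means <- colMeans(object@intensities)
  names(means) <- object@slide_names
  means
})

# Subsetting method: genes (i) and slides (j) are subset together,
# so the slide labels always stay in step with the data.
# 'drop' is accepted to match the generic but is ignored in this sketch.
setMethod("[", "MicroarrayData", function(x, i, j, ..., drop = TRUE) {
  if (missing(i)) i <- seq_len(nrow(x@intensities))
  if (missing(j)) j <- seq_len(ncol(x@intensities))
  new("MicroarrayData",
      intensities = x@intensities[i, j, drop = FALSE],
      slide_names = x@slide_names[j])
})

# Two genes measured on three slides (made-up numbers).
arrays <- new("MicroarrayData",
              intensities = matrix(c(1, 2, 3, 4, 5, 6), nrow = 2),
              slide_names = c("slide1", "slide2", "slide3"))

first_two <- arrays[, 1:2]            # keep the first two slides
first_two_means <- slideMeans(first_two)  # slide1 = 1.5, slide2 = 3.5
```

Because the validity check runs whenever an object is created, a mismatch between data and labels is caught at construction rather than later in the analysis.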
Advantages
- Newest cutting edge statistical methods available
- Modern programming language
- Powerful graphical tools available
- It is freely available
Disadvantages
- Steep learning curve
- Need to have experience programming in the R programming environment (http://www.r-project.org)
- The dreaded command line
- Like all software there are bugs
Accessing BioConductor
- BioConductor tools are accessed using the R programming language
- R is a programming environment for statistical computing and graphics
- Initially written by Robert Gentleman and Ross Ihaka (University of Auckland)
- Download R from a Comprehensive R Archive Network (CRAN) mirror (http://cran.stat.auckland.ac.nz)
- Install R (available for Unix, Windows, and Mac OS X)
- R version 2.2.1 was released on 2005-12-20
- R is the environment used to design and distribute software:
- Locally downloaded files
- Via the internet e.g. commands in R:
    Sys.putenv("http_proxy" = "http://proxy.hort.net.nz:8080")  # Set the proxy variable (later R releases use Sys.setenv())
    source("http://www.bioconductor.org/getBioC.R")             # Download the installation script
    getBioC()                                                   # Run the script
Installing vs loading packages
- Packages only need to be installed once onto a computer
- Packages must be loaded with each new R session
- The R function library() is used to load packages, e.g. this command in R:
    library(limma) # Loads the limma package (it must already be installed)
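The install-once, load-every-session distinction can be made explicit with a small check before loading. The helper name load_if_installed is hypothetical, and the sketch assumes a version of R recent enough to provide requireNamespace().

```r
# Hypothetical helper: attach a package only if it is already installed.
load_if_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; ",
            "install it once, then load it in each new session.")
    return(FALSE)
  }
  # character.only = TRUE lets library() take the name from a variable.
  library(pkg, character.only = TRUE)
  TRUE
}

stats_loaded <- load_if_installed("stats")  # 'stats' ships with R
limma_loaded <- load_if_installed("limma")  # TRUE only if limma was installed beforehand
```

The helper never installs anything itself; it only reports whether the one-off installation step has already been done.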
Documentation and Help
- R manuals and tutorials are available from the R website or on-line in an R session
- The R on-line help system provides detailed, high-quality documentation in text, HTML, PDF, and LaTeX formats
- Frequently asked questions (FAQs) for BioConductor and R
- Email news lists for BioConductor and R
- Note: Read the posting guides before submitting questions
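Within an R session, the on-line help can be reached with a few standard commands. The sketch below uses only functions that ship with R; the search pattern is just an example.

```r
# Look up the help page for a known function (the same as typing ?library).
library_help <- help("library")

# Find functions by name pattern when the exact name is not known.
reader_functions <- apropos("^read\\.")

if (interactive()) {
  print(library_help)  # display the help page
  help.start()         # open the HTML version of the manuals and help pages
}
```

The display steps are wrapped in interactive() so that the sketch can also be run as a script without opening a pager or browser.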