Difference between revisions of "R tutorial"

From Organic Design wiki

Revision as of 03:12, 15 March 2006

TODO

Tasks available from the web which utilize example data available in R: object assignment, subsetting, plotting, mathematical functions, sorting, etc. (20 tasks?)
  • Usage/interaction within environment
  • Bioconductor resources/vignettes(including downloading)
  • Bioconductor basics (any resources for limma out there?)
Twenty tasks to do in R
  • Use a contributed guide as a template for a list of twenty tasks to do. It must cover:
    • Memory based, can write to → .RData file
  • File handling
  • Object creation
  • Different types of data
Contributed material
  • usingR-2.pdf Chapter 1 → Starting up.
  • MGEDI → installing R/Bioconductor p30 - p34 (documentation/help)


Resources

Some Contributed guides for the beginner


Obtaining help in R

help.start()           # Browser based help documentation
help()                 # Help on a topic (note: help pages have a set format)
? ls                   # alternative help method on ls function
apropos("mean")        # Find objects by (partial) name; the name must be quoted
example(mean)          # Run an Examples Section from the Online Help
demo()                 # Demonstrations of R Functionality
demo(graphics)         # Demonstration of graphics functionality
These objects are functions; to run them you must put () after the function name.
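
A minimal sketch of the difference between naming a function and calling it, using only functions from the list above. The object names list_fn, workspace_names and mean_matches are made up for illustration.

```r
# A function name on its own refers to the function object; nothing is run.
list_fn <- ls
is.function(list_fn)            # TRUE

# Adding () calls the function and returns its result.
workspace_names <- ls()
is.character(workspace_names)   # TRUE: ls() returns object names as strings

# apropos() takes a quoted string and returns the matching object names.
mean_matches <- apropos("mean")
"mean" %in% mean_matches        # TRUE: the base function mean() is found
```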

Useful commands in the R environment

search()              # Give Search Path for R Objects
searchpaths()         # Give Full Search Path for R Objects
ls()                  # List objects
objects()             # alternate function to list objects
data()                # Publicly available datasets
rm(list=ls())         # Remove all objects from the current workspace
save.image()          # Save the workspace to the file .RData
q()                   # Terminate an R Session →  prompted to Save workspace image? [y/n/c]:
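
Because rm(list=ls()) clears the whole workspace, it can be safer to remove objects one at a time while learning. The sketch below assumes two throwaway objects, demo_count and demo_label, which are hypothetical names used only here.

```r
# Create two throwaway objects (hypothetical names, for illustration only)
demo_count <- 3
demo_label <- "probe"
c("demo_count", "demo_label") %in% ls()   # both are now listed by ls()

# Remove a single object by name
rm(demo_count)
count_exists <- exists("demo_count", inherits = FALSE)

# Remove objects named in a character vector (the form rm(list=ls()) relies on)
rm(list = "demo_label")
label_exists <- exists("demo_label", inherits = FALSE)

count_exists   # FALSE
label_exists   # FALSE
```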

Command prompt

  • Type commands after the prompt (>) e.g.
> x <- 1:10        # assignment of 1 to 10 to an object called 'x'
> x                # Returning the x object to the screen
 [1]  1  2  3  4  5  6  7  8  9 10
  • Continuation of commands is expected after the plus symbol (+) e.g.
> x <- 1:          # partial command → parser is expecting more information
+  10
> x
 [1]  1  2  3  4  5  6  7  8  9 10

Text following a '#' is commented out


Basic data types

  • Logical
T                 # TRUE  (T is a shorthand that can be overwritten; TRUE is safer)
F                 # FALSE (likewise, FALSE is safer than F)
  • Numeric
3.141592654       # Any number [0-9\.]
  • Character
"Putative ATPase"  # Any character [A-Za-z] must be single or double quoted 
  • Missing values
NA                # Way to handle missing information
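
As a rough illustration of the four basic types above, class() reports the type of a value and is.na() tests for missing values. The vector expression_values is an invented example, not data from this tutorial.

```r
class(TRUE)                # "logical"
class(3.141592654)         # "numeric"
class("Putative ATPase")   # "character"
is.na(NA)                  # TRUE

# A hypothetical vector with one missing measurement
expression_values <- c(1, NA, 3)
na_flags <- is.na(expression_values)          # FALSE TRUE FALSE

# Most summary functions return NA unless told to drop missing values
mean_with_na <- mean(expression_values)                    # NA
mean_without_na <- mean(expression_values, na.rm = TRUE)   # 2
```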

Assignment of objects

  • Object names must start with a letter [A-Za-z] (or a dot that is not followed by a digit)
  • "<-" The arrow assigns information to the object on the left
x <- 42                # Assignment to the left
x
x = 42                 # Equivalent assignment (not recommended)
x 
42 -> x                # Assignment to the right
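
One commonly given reason for preferring "<-" is that "=" means something different inside a function call, where it names an argument instead of creating an object. The sketch below illustrates this with median(); the names v1, m_arrow and m_equals are chosen for illustration only.

```r
x <- 42

# "<-" inside a call creates v1 in the workspace, then passes it to median()
m_arrow <- median(v1 <- 1:10)
v1                      # 1 2 3 4 5 6 7 8 9 10

# "=" inside a call only names median()'s argument 'x';
# the workspace object x is left untouched
m_equals <- median(x = 1:10)
x                       # still 42

m_arrow                 # 5.5
m_equals                # 5.5
```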

Saving objects

getwd()                        # Returns the current directory where R is running
setwd("C:/DATA/Microarray")    # Set the working directory to another location
getwd()                        # Check the directory has changed
x <- 42
save.image()                   # Saves a snapshot of objects to file .RData
y <- x * 2                     # Make a new object called 'y'
y                              # Return value of 'y'
q()                            # quit R

Restart R by double-clicking on the file .RData in C:/DATA/Microarray

x              # Returns 'x' as it was saved to .RData
y              # 'y' should not exist
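
The same behaviour can be tried without quitting R. This sketch swaps save.image() for save() and load() on a temporary file so that it does not touch the working directory; save.image() is documented as a shortcut for saving every workspace object to .RData. The names saved_val, derived_val and demo.RData are assumptions made for this example.

```r
saved_val <- 42

# Write the object to a temporary file instead of .RData
rdata_file <- file.path(tempdir(), "demo.RData")
save(saved_val, file = rdata_file)

# This object is created after the snapshot, so it is not in the file
derived_val <- saved_val * 2

# Simulate a fresh session by removing both objects, then restore the snapshot
rm(saved_val, derived_val)
load(rdata_file)

saved_restored <- exists("saved_val", inherits = FALSE)      # TRUE
derived_restored <- exists("derived_val", inherits = FALSE)  # FALSE
```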

Object data types

  • Create a scalar (vector of length 1)
a <- 3.14            # Assign a single numeric value
length(a)            # The scalar is actually a vector of length 1 
pi                   # R already has a built-in constant for pi
search()             # Print the search path for all objects
find("pi")           # "pi" is located in package:base
  • Create a vector
x <- c(2,3,5,2,7,1)  # Numbers put into a vector using the 'c' (concatenate) function
x
y <- c(10,15,12) 
y
z <- c(x,y)
z
  • Create a matrix
zmat <- cbind(x,y)   # cbind joins vectors together by column       
zmat 
What's going on in the second column? → the shorter vector 'y' is recycled to match the length of 'x'
mat <- matrix(1:20, nrow=5, ncol=4)                   # Constructing a matrix
mat
colnames(mat) <- c("Col1","Col2", "Col3", "Col4")     # Adding column names
mat
  • Create a list
mylist <- list(1:4,c("a","b","c"), LETTERS[1:10]) 
mylist
mylist <- list("element 1" = 1:4,"second vector" = c("a","b","c"), "capitals" = LETTERS[1:10]) 
mylist
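
The recycling seen in the cbind example can be checked directly. The sketch below repeats the definitions of x and y from above so that it runs on its own; second_column and summed are names introduced for illustration.

```r
x <- c(2,3,5,2,7,1)    # length 6
y <- c(10,15,12)       # length 3

zmat <- cbind(x, y)
dim(zmat)              # 6 2

# The three values of 'y' are repeated to fill six rows
second_column <- unname(zmat[, "y"])
second_column          # 10 15 12 10 15 12

# Arithmetic recycles the shorter vector in the same way
summed <- x + y
summed                 # 12 18 17 12 22 13
```

Here the length of 'x' is an exact multiple of the length of 'y', so no warning is given; when it is not, R recycles anyway and issues a warning.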

Indexing

  • Subsetting a vector
x[c(1,2,3)]                     # Selecting the first three elements of 'x'
x[1:3]                          # Same subset using ':' sequence generation → see help(":")
  • Subsetting a matrix
mat[,1:2]                       # Selecting the first two columns of 'mat'
mat[1:2, 2:4]                   # Selecting a subset matrix of 'mat'
  • Subsetting a list
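mylist[[1]]                     # First element of the list, by position
mylist$"element 1"              # Same element, selected by name
mylist$capitals[1:5]            # First five values of the 'capitals' element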
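
A few further indexing forms, sketched with the same objects as above (redefined here so the block runs on its own). The result names such as first_as_list are made up for illustration.

```r
x <- c(2,3,5,2,7,1)
mat <- matrix(1:20, nrow=5, ncol=4)
mylist <- list("element 1" = 1:4, "second vector" = c("a","b","c"), "capitals" = LETTERS[1:10])

# Vectors: negative indices drop elements, logical tests keep matching ones
without_first <- x[-1]            # 3 5 2 7 1
larger_than_two <- x[x > 2]       # 3 5 7

# Matrices: [row, column]; a single column drops to a vector unless drop = FALSE
one_cell <- mat[2, 3]             # 12
column_matrix <- mat[, 1, drop = FALSE]
dim(column_matrix)                # 5 1

# Lists: [ ] returns a shorter list, [[ ]] returns the element itself
first_as_list <- mylist[1]
first_element <- mylist[[1]]      # 1 2 3 4
by_name <- mylist[["second vector"]]
```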
  1. Subsetting/Indexing
  2. Indexing from a vector
  3. Indexing from a matrix
  4. Indexing from a list
  5. Plotting data
  6. Labelling/colours etc
  7. Ab lines?
  8. How to read help
  9. Apropos/example/demo/find etc
  10. Help example
  11. ls/rm
  12. quit