Difference between revisions of "Rosaceae"
From Organic Design wiki
m  | 
				m (formatting, fix embeds)  | 
				||
| (25 intermediate revisions by 2 users not shown) | |||
| Line 1: | Line 1: | ||
| − | =Microarray analysis workshop=  | + | [[Category:Sven/Rosaceae]]  | 
| + | == Microarray analysis workshop ==    | ||
Time schedule: 8:30 - 10:30am, 11-12:30am (3.5 hours)  | Time schedule: 8:30 - 10:30am, 11-12:30am (3.5 hours)  | ||
| + | ----   | ||
| + | |||
| + | == Workflow ==  | ||
| + | ;#[[Introduction to Microarray analysis]] (''15 minutes'')   | ||
| + | ;# Normalization (''15 minutes'')   | ||
| + | ;#[[BioConductor/R framework]]  (''15 minutes'')  | ||
| + | ;# [[R tutorial]] (''60 minutes'')  | ||
| + | ;# [[Linear models for Microarray analysis]] (''15 minutes'')  | ||
| + | ;# [[BioConductor analysis tutorial]] ("60 minutes")   | ||
----  | ----  | ||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
<table class=document-code><tr><td>  | <table class=document-code><tr><td>  | ||
| − | + | {{:Introduction to Microarray analysis}}  | |
| + | </td></tr></table>  | ||
| + | |||
| + | <table class=document-code><tr><td>  | ||
| + | {{:BioConductor/R framework}}  | ||
</td></tr></table>  | </td></tr></table>  | ||
<table class=document-code><tr><td>  | <table class=document-code><tr><td>  | ||
| − | + | {{:R tutorial}}  | |
</td></tr></table>  | </td></tr></table>  | ||
| + | <table class=document-code><tr><td>  | ||
| + | {{:Linear models for Microarray analysis}}  | ||
| + | </td></tr></table>  | ||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
<table class=document-code><tr><td>  | <table class=document-code><tr><td>  | ||
| − | + | {{:BioConductor analysis tutorial}}  | |
</td></tr></table>  | </td></tr></table>  | ||
----  | ----  | ||
| − | |||
Latest revision as of 00:38, 15 January 2009
Microarray analysis workshop
Time schedule: 8:30 - 10:30am, 11:00am - 12:30pm (3.5 hours)
Workflow
- Introduction to Microarray analysis (15 minutes)
- Normalization (15 minutes)
- BioConductor/R framework (15 minutes)
- R tutorial (60 minutes)
- Linear models for Microarray analysis (15 minutes)
- BioConductor analysis tutorial (60 minutes)
 
 Overview of experimental process
 
 
 Statistical analysis process
 
 Statistical issues
 
 
 
 
 
 
 
 Analysis wish list
 
 Analysis aim
       Block Row Column            ID   Name         M        A         t      P.Value        B
 10396    20  15     23 171121_390_49 171121  5.035364 13.25087  49.62425 3.220044e-05 11.27486
 4517      9  13      9  20264_118_53  20264  4.396719 11.11976  47.06004 3.220044e-05 11.05671
 16881    32  21     22 165415_634_53 165415  4.645384 12.65872  43.40359 3.220044e-05 10.70650
 16086    31  10      9 185903_436_49 185903  5.146504 11.36911  42.75724 3.220044e-05 10.63926
 6508     13   7     22 197386_457_55 197386  4.621024 13.20426  42.09902 3.220044e-05 10.56899
 5471     11   8     20 142178_355_53 142178  4.795734 12.07427  41.23346 3.220044e-05 10.47374
 8395     16  20     23   251706_1_53 251706 -5.003475 13.04571 -38.61325 3.220044e-05 10.16421
 4330      9   5      6 297409_340_47 297409  4.421922 12.27208  38.52215 3.220044e-05 10.15284
 12479    24  14     13 163360_396_47 163360  4.367943 11.10478  38.21662 3.220044e-05 10.11439
 15024    29  10      5 149243_674_53 149243  4.372419 11.36572  37.86362 3.220044e-05 10.06935
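The M and A columns in the table above can be illustrated with a minimal sketch. It assumes the usual two-colour convention (M is the log2 ratio of the red to the green channel, A is the average log2 intensity); the intensities `R` and `G` are invented for illustration and are not taken from the workshop data.

```r
# Hypothetical background-corrected intensities for three spots
# (invented values, chosen so the results are round numbers)
R <- c(1024, 4096, 512)    # red channel (Cy5)
G <- c(256, 4096, 2048)    # green channel (Cy3)

M <- log2(R / G)           # log-ratio: positive when red is brighter
A <- 0.5 * log2(R * G)     # average log-intensity of the spot

data.frame(R, G, M, A)     # M is 2, 0, -2 and A is 9, 12, 10
```

A spot with M = 2 is four times brighter in the red channel than in the green one, and M = 0 means no difference between the channels.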
 
 
 
 What is BioConductor? (http://www.bioconductor.org)
 
 BioConductor Goals
 The broad goals of the project are:
 library(tkWidgets)
 vExplorer()
 Object oriented class method design (OOP)
 Advantages
 Disadvantages
 Accessing BioConductor
 Sys.setenv(http_proxy="http://proxy.hort.net.nz:8080")     # Setting proxy variable (Sys.putenv is deprecated)
source("http://www.bioconductor.org/getBioC.R")            # Downloading installation script
getBioC()                                                  # Running script
Installing vs loading packages
 library(limma)                 # Loads the limma package (it must already be installed)
Documentation and Help
 
Resources
 
 
 
 Obtaining help in R
 help.start()                   # Browser based help documentation
 help()                         # Help on a topic (note: help pages have a set format)
 ? ls                           # Alternative help method on the ls function
 apropos("mean")                # Find objects by (partial) name; the name must be quoted
 example(mean)                  # Run the Examples section from the online help
 demo()                         # Demonstrations of R functionality
 demo(graphics)                 # Demonstration of graphics functionality
 RSiteSearch()                  # Searches web mailing list archives and retrieves results using http
 Useful commands in the R environment
 search()                       # Give search path for R objects
 searchpaths()                  # Give full search path for R objects
 ls()                           # List objects
 objects()                      # Alternate function to list objects
 data()                         # Publicly available datasets
 rm()                           # Remove objects from a specified environment
 save.image()                   # Save R objects
 q()                            # Terminate an R session → prompted to Save workspace image? [y/n/c]:
 Command prompt
 > x <- 1:10                    # Assignment of 1 to 10 to an object called 'x'
 > x                            # Returning the x object to the screen
 [1] 1 2 3 4 5 6 7 8 9 10
 > x <- 1:                      # Partial command → parser is expecting more information
 + 10
 > x
 [1] 1 2 3 4 5 6 7 8 9 10
 Basic (atomic) data types
 T # TRUE F # FALSE 
 3.141592654 # Any number [0-9\.] 
 "Putative ATPase" # Any character [A-Za-z] must be single or double quoted 
 NA # Label for missing information in datasets 
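The four atomic types listed above can be checked from the prompt. This is a minimal sketch using only base R; the object names (`flag`, `number`, `label`, `gap`) are made up for illustration.

```r
flag   <- TRUE                 # logical (prefer TRUE/FALSE: T and F are ordinary
                               # variables that can be overwritten)
number <- 3.141592654          # numeric
label  <- "Putative ATPase"    # character
gap    <- NA                   # missing value (logical unless coerced)

types <- c(class(flag), class(number), class(label), class(gap))
types                          # "logical" "numeric" "character" "logical"

is.na(gap)                     # Test for missing values with is.na(), not with ==
```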
 Assignment of objects
 x <- 42                        # Assignment to the left
 x
 x = 42                         # Equivalent assignment (not recommended)
 x
 42 -> x                        # Assignment to the right
 x
Saving objects
 getwd()                        # Returns the current directory where R is running
setwd("C:/DATA/Microarray")    # Set the working directory to another location
getwd()                        # Check the directory has changed
x <- 42
save.image()                   # Saves a snapshot of objects to file .RData
y <- x * 2                     # Make a new object called 'y'
y                              # Return value of 'y'
q()                            # quit R
Restart R by double clicking on the file .RData in C:/DATA/Microarray
 x                              # Returns 'x' as it was saved to .RData in "C:/DATA/Microarray"
 y                              # 'y' should not exist
Object data types
 a <- 3.14            # Assign an approximation of pi to object 'a'
length(a)            # The scalar is actually a vector of length 1 
pi                   # R already has a built-in object for pi 
search()             # Print the search path for all objects
find("pi")           # "pi" is located in package:base
 x <- c(2,3,5,2,7,1)  # Numbers put into a vector using 'c' function concatenate
x
y <- c(10,15,12) 
y
names(y) <- c("first","second","third")    # Elements can be given names
z <- c(y,x)
z
 zmat <- cbind(x,y)                                    # cbind joins vectors together by column (the shorter 'y' is recycled)
zmat
 mat <- matrix(1:20, nrow=5, ncol=4)                   # Constructing a matrix
mat
colnames(mat) <- c("Col1","Col2", "Col3", "Col4")     # Adding column names
mat
 mylist <- list(1:4,c("a","b","c"), LETTERS[1:10]) 
mylist
mylist <- list("element 1" = 1:4,"second vector" = c("a","b","c"), "Capitals" = LETTERS[1:10]) 
mylist
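Once objects like these exist, a few base R functions report their size and structure. The sketch below rebuilds small versions of the vector, matrix and list from this section so that it runs on its own.

```r
x <- c(2, 3, 5, 2, 7, 1)
mat <- matrix(1:20, nrow = 5, ncol = 4)
mylist <- list("element 1" = 1:4,
               "second vector" = c("a", "b", "c"),
               "Capitals" = LETTERS[1:10])

length(x)                             # Number of elements in the vector: 6
dim(mat)                              # Rows and columns of the matrix: 5 4
names(mylist)                         # Names of the list elements
lens <- sapply(mylist, length)        # Length of each list element: 4 3 10
lens
str(mylist)                           # Compact summary of the whole structure
```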
Indexing
 x[c(1,2,3)]                     # Selecting the first three elements of 'x'
x[1:3]                          # Same subset using ':' sequence generation → see help(":")
y[2]                            # Selecting the second element of 'y'
y["second"]                     # Selecting the second element of 'y' (by name)
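Vectors can also be indexed by a logical condition or by negative positions. A minimal sketch, redefining 'x' and 'y' as above so it is self-contained (the result names are made up for illustration):

```r
x <- c(2, 3, 5, 2, 7, 1)
y <- c(first = 10, second = 15, third = 12)

big       <- x[x > 2]                 # Logical index: keep elements greater than 2
no_first  <- x[-1]                    # Negative index: drop the first element
which_big <- which(x > 2)             # Positions where the condition holds
y_sub     <- y[c("first", "third")]   # Several names at once

big                                   # 3 5 7
no_first                              # 3 5 2 7 1
which_big                             # 2 3 5
y_sub
```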
 mat[,1:2]                       # Selecting the first two columns of 'mat'
 mat[1:2, 2:4]                   # Selecting a subset matrix of 'mat'
 mylist[[1]]                     # Subsetting list 'mylist' by index
 mylist[["element 1"]]           # Subsetting list 'mylist' by name 'element 1'
 mylist$"element 1"              # Alternate way of subsetting
 mylist$Capitals[1:5]            # Selecting the first five elements of 'Capitals' in 'mylist' (case sensitive)
Plotting data
 help(plot)
 help(par)
 example(plot)
 par(ask=TRUE)                   # Set the printing device to prompt user before displaying next graph
 example(hist)
Reading / writing files
 help(scan)
 help(read.table)
 dataDir <- "C:/DATA/Microarray/GPR"
 mydata <- scan(file.path(dataDir, "BE34.gpr"), what="", nlines=29)          # Get first 29 rows of data
 mydata
 colClasses <- rep("NULL", 82)
colClasses[c(1:5, 9,12, 18, 21)] <- NA                                      # Set colClasses to ignore unwanted columns
mydata <- read.table(file.path(dataDir,"BE34.gpr"), header=T,  sep="\t", 
                                  nrows=20, skip=31, colClasses=colClasses) # Get first 20 lines of data after 31st row
mydata
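The colClasses trick used above can be tried without the workshop files. This sketch writes a small tab-delimited file to a temporary location and reads back only two of its four columns; the file contents and column names are invented for illustration.

```r
# Build a tiny tab-delimited file (hypothetical columns, not real GPR data)
toy <- data.frame(Block = 1:3,
                  Name  = c("g1", "g2", "g3"),
                  F635  = c(100, 250, 75),
                  Flags = c(0, -50, 0))
tmp <- tempfile(fileext = ".txt")
write.table(toy, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

cc <- rep("NULL", 4)            # "NULL" tells read.table to skip a column
cc[c(1, 3)] <- NA               # NA lets read.table guess the column type
kept <- read.table(tmp, header = TRUE, sep = "\t", colClasses = cc)
kept                            # Only 'Block' and 'F635' are read

unlink(tmp)                     # Remove the temporary file
```

Skipping unwanted columns this way keeps memory use down when a file has many columns, as GPR files do.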
 help(write)
 help(write.table)
 User defined functions
 myfun <- function( arglist ){ body }
 myfun <- function(x){x}        # Creating identity function
myfun("foo")                   # Running the function
myfun()                        # Fails: no input argument provided
 square <- function(x){x * x}         # Square the input number
square(10)                  # Returns 10 squared
square(1:4)              # Underlying arithmetic is vectorized
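Arguments can be given default values, and a failing call can be caught instead of stopping the session. A minimal sketch; the function `power` and its `exponent` argument are made up for illustration.

```r
power <- function(x, exponent = 2) {   # 'exponent' defaults to 2
  x ^ exponent
}

power(10)                  # Uses the default: 100
power(1:4)                 # Vectorized, like square(): 1 4 9 16
power(2, exponent = 3)     # Overriding the default by name: 8

# 'x' has no default, so calling power() with no arguments is an error.
# tryCatch() turns the error into a value we can inspect.
res <- tryCatch(power(), error = function(e) "missing argument")
res
```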
 
 "biasVar" <- function(df1=4,  df2=15,  N = 100,  seed=1)
{
 set.seed(seed)
 # 1) Data setup
 ylim <- c(-2,2)
 xlim <- c(-3,3)
 par(mfrow=c(2,2), mar=c(5,4,4-2,2)+0.1,mgp=c(2,.5,0) )
 x <- rnorm(80, 0, 1)
 y <- sin(x) + rnorm(80, 0, 1/9)
 xno   <- 500
 sim <- matrix(NA, nc=N, nr=xno)
 xseq <- seq(min(x),max(x), length=xno)
 plot(x, y, main=paste("df=",df1,sep=""), xlim=xlim, ylim=ylim)    # Using Splines
 truex <- seq(min(x), max(x), length = 80)
 lines(truex, sin(truex), lty = 5)
 splineobj <- smooth.spline(x, y, df = df1)
 lines(splineobj, lty = 1)
 plot(x, y, main=paste("df=",df2,sep=""), xlim=xlim, ylim=ylim)    # Using Splines
 truex <- seq(min(x), max(x), length = 80)
 lines(truex, sin(truex), lty = 5)
 splineobj <- smooth.spline(x, y, df = df2)
 lines(splineobj, lty = 1)
 plot(x, y, main=paste("Bias-Variance tradeoff, df=",df1, sep=""), type="n", xlim=xlim, ylim=ylim)
 for(i in seq(N))
   {
     x <- rnorm(80, 0, 1)
     y <- sin(x) + rnorm(80, 0, 1/9)
     splineobj <- smooth.spline(x, y, df = df1)      
     sim[,i] <- predict(splineobj,xseq)$y
   }
 ci <- qt(0.975, N) * sqrt(apply(sim,1, var))
 bias <- apply(sim,1, mean)
 rect(xseq,bias-ci,xseq,bias+ci, border="grey")
 rect(xseq,sin(xseq),xseq,bias, border="black")
 lines(truex, sin(truex))
 plot(x, y, main=paste("Bias-Variance tradeoff, df=",df2,sep=""), type="n", xlim=xlim, ylim=ylim)  
 for(i in seq(N))
   {
     x <- rnorm(80, 0, 1)
     y <- sin(x) + rnorm(80, 0, 1/9)
     splineobj <- smooth.spline(x, y, df = df2)      
     sim[,i] <- predict(splineobj,xseq)$y
   }
 ci <- qt(0.975,N) * sqrt(apply(sim,1, var))
 bias <- apply(sim,1, mean)
 rect(xseq,bias-ci,xseq,bias+ci, border="grey")
 rect(xseq,sin(xseq),xseq,bias, border="black")
 lines(truex, sin(truex))
}
 biasVar()                      # Generates data from a sine curve, looking at the bias-variance tradeoff
 biasVar(df1=2, df2=30)         # Let's change the smoothing parameters in the function arguments
Quitting R
 rm(list=ls())                  # Cleaning up: remove all objects from the workspace
 q()
Links
 
 Overview of Limma package for R
 Origin
 Statistical approach
 Object orientated programming environment
 
 
 Advantages using Limma
 
 
 Limitations
 
 
 Microarray workshop experiment
 FileName  SlideNumber  Cy3    Cy5    Design
 BE34.gpr  34           Leaf   Fruit  -1
 BE35.gpr  35           Fruit  Leaf    1
 BE36.gpr  36           Fruit  Leaf    1
 BE37.gpr  37           Leaf   Fruit  -1
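The Design column records the dye orientation of each slide. The sketch below shows, in base R only, how multiplying each slide's log-ratios by its design value puts all four slides in the same direction before averaging. It assumes M = log2(Cy5/Cy3), so a positive oriented value means higher expression in Leaf than in Fruit; the log-ratios for `geneA` and `geneB` are invented, and the plain t-statistic here is not the moderated statistic that limma reports.

```r
# Design column from the targets table (slides BE34, BE35, BE36, BE37)
design <- c(-1, 1, 1, -1)

# Hypothetical log-ratios: two genes (rows) on four slides (columns)
M <- rbind(geneA = c(-2.1,  1.9, 2.0, -2.0),
           geneB = c( 0.1, -0.1, 0.0,  0.2))

# Multiply each slide's column by its design value to undo the dye swap
oriented <- sweep(M, 2, design, `*`)

avgM  <- rowMeans(oriented)            # Average log-ratio per gene
tstat <- apply(oriented, 1,            # Ordinary one-sample t-statistic per gene
               function(m) mean(m) / (sd(m) / sqrt(length(m))))

avgM                                   # geneA 2.0, geneB -0.1
tstat
```

Without the sign flip, the log-ratios of a dye-swapped gene such as `geneA` would cancel out to roughly zero.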
 Analysis of dyeswap experiment
       Block Row Column            ID   Name         M        A         t      P.Value        B
 10396    20  15     23 171121_390_49 171121  5.035364 13.25087  49.62425 3.220044e-05 11.27486
 4517      9  13      9  20264_118_53  20264  4.396719 11.11976  47.06004 3.220044e-05 11.05671
 16881    32  21     22 165415_634_53 165415  4.645384 12.65872  43.40359 3.220044e-05 10.70650
 16086    31  10      9 185903_436_49 185903  5.146504 11.36911  42.75724 3.220044e-05 10.63926
 6508     13   7     22 197386_457_55 197386  4.621024 13.20426  42.09902 3.220044e-05 10.56899
 5471     11   8     20 142178_355_53 142178  4.795734 12.07427  41.23346 3.220044e-05 10.47374
 8395     16  20     23   251706_1_53 251706 -5.003475 13.04571 -38.61325 3.220044e-05 10.16421
 4330      9   5      6 297409_340_47 297409  4.421922 12.27208  38.52215 3.220044e-05 10.15284
 12479    24  14     13 163360_396_47 163360  4.367943 11.10478  38.21662 3.220044e-05 10.11439
 15024    29  10      5 149243_674_53 149243  4.372419 11.36572  37.86362 3.220044e-05 10.06935
 See also




