1 Version Info

R version: 4.1.1 (2021-08-10)
Bioconductor version: 3.14
Package version: 1.20.0

2 Sample Workflow

The following code illustrates a typical R / Bioconductor session. It uses RMA from the affy package to pre-process Affymetrix arrays, and the limma package for assessing differential expression.

## Load packages
library(affy)   # Affymetrix pre-processing
library(limma)  # two-color pre-processing; differential expression
                
## import "phenotype" data, describing the experimental design
phenoData <- 
    read.AnnotatedDataFrame(system.file("extdata", "pdata.txt",
    package="arrays"))

## RMA normalization
celfiles <- system.file("extdata", package="arrays")
eset <- justRMA(phenoData=phenoData,
    celfile.path=celfiles)
## Warning: replacing previous import 'AnnotationDbi::tail' by 'utils::tail' when
## loading 'hgfocuscdf'
## Warning: replacing previous import 'AnnotationDbi::head' by 'utils::head' when
## loading 'hgfocuscdf'

## differential expression
combn <- factor(paste(pData(phenoData)[,1],
    pData(phenoData)[,2], sep = "_"))
design <- model.matrix(~combn) # describe model to be fit

fit <- lmFit(eset, design)  # fit each probeset to model
efit <- eBayes(fit)        # empirical Bayes adjustment
topTable(efit, coef=2)      # table of differentially expressed probesets
##                 logFC   AveExpr         t      P.Value    adj.P.Val        B
## 204582_s_at  3.468416 10.150533  39.03471 1.969915e-14 1.732146e-10 19.86082
## 211548_s_at -2.325670  7.178610 -22.73165 1.541158e-11 6.775701e-08 15.88709
## 216598_s_at  1.936306  7.692822  21.73818 2.658881e-11 7.793180e-08 15.48223
## 211110_s_at  3.157766  7.909391  21.19204 3.625216e-11 7.969130e-08 15.24728
## 206001_at   -1.590732 12.402722 -18.64398 1.715422e-10 3.016740e-07 14.01955
## 202409_at    3.274118  6.704989  17.72512 3.156709e-10 4.626157e-07 13.51659
## 221019_s_at  2.251730  7.104012  16.34552 8.353283e-10 1.049292e-06 12.69145
## 204688_at    1.813001  7.125307  14.75281 2.834343e-09 3.115297e-06 11.61959
## 205489_at    1.240713  7.552260  13.62265 7.264649e-09 7.097562e-06 10.76948
## 209288_s_at -1.226421  7.603917 -13.32681 9.401074e-09 7.784531e-06 10.53327
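
To see which comparison coef=2 selects, it helps to inspect the columns of the design matrix. The sketch below uses base R only and a hypothetical phenotype table: the treatment and time columns and their values are assumptions for illustration, not the contents of pdata.txt.

```r
# Hypothetical phenotype table with two replicates per group; the real
# columns in pdata.txt may be named and coded differently.
pd <- data.frame(
    treatment = rep(c("ctrl", "drug"), each = 4),
    time      = rep(c("0h", "0h", "24h", "24h"), times = 2)
)

# Same construction as in the workflow: one factor level per group
combn <- factor(paste(pd[, 1], pd[, 2], sep = "_"))
design <- model.matrix(~combn)

# Column 1 is the intercept (the first factor level, i.e. the reference
# group); each later column is the difference between one group and the
# reference group.
print(levels(combn))
print(colnames(design))
```

With these made-up labels the second column is combnctrl_24h, so coef=2 would test the ctrl_24h group against the reference level ctrl_0h. In a real analysis, check colnames(design) before choosing coef.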

A top table resulting from a more complete analysis, described in Chapter 7 of Bioconductor Case Studies, is shown below. The table lists the Affymetrix probe set identifiers (ID), the log2-fold change between two experimental groups (logFC), the average expression across all samples (AveExpr), the moderated t-statistic for differential expression (t), the unadjusted and adjusted p-values (P.Value and adj.P.Val; adjusted to control the false discovery rate in this case), and the log-odds that the probe set is differentially expressed (B). These results can be used in further analysis and annotation.

      ID logFC AveExpr    t  P.Value adj.P.Val     B
636_g_at  1.10    9.20 9.03 4.88e-14  1.23e-10 21.29
39730_at  1.15    9.00 8.59 3.88e-13  4.89e-10 19.34
 1635_at  1.20    7.90 7.34 1.23e-10  1.03e-07 13.91
 1674_at  1.43    5.00 7.05 4.55e-10  2.87e-07 12.67
40504_at  1.18    4.24 6.66 2.57e-09  1.30e-06 11.03
40202_at  1.78    8.62 6.39 8.62e-09  3.63e-06  9.89
37015_at  1.03    4.33 6.24 1.66e-08  6.00e-06  9.27
32434_at  1.68    4.47 5.97 5.38e-08  1.70e-05  8.16
37027_at  1.35    8.44 5.81 1.10e-07  3.08e-05  7.49
37403_at  1.12    5.09 5.48 4.27e-07  1.08e-04  6.21
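
The relationship between the P.Value and adj.P.Val columns can be explored in base R. topTable adjusts p-values with the Benjamini-Hochberg method by default, which is also available through p.adjust(). The following is a minimal sketch with made-up p-values; they are not taken from the table above, whose adjusted values depend on every probe set in the analysis.

```r
# Made-up p-values, already sorted in ascending order
p <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Benjamini-Hochberg adjustment as implemented in base R
adj <- p.adjust(p, method = "BH")

# The same calculation by hand: scale each p-value by n / rank, then
# enforce monotonicity from the largest p-value downwards and cap at 1
n <- length(p)
manual <- pmin(1, rev(cummin(rev(p * n / seq_along(p)))))

print(data.frame(P.Value = p, adj.P.Val = adj, manual = manual))
```

Filtering on adj.P.Val < 0.05 rather than on P.Value controls the expected proportion of false positives among the selected probe sets.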

3 Installation and Use

Follow installation instructions to start using these packages. You can install affy and limma as follows:

if (!requireNamespace("BiocManager", quietly = TRUE))
     install.packages("BiocManager")
BiocManager::install(c("affy", "limma"), dependencies=TRUE)

To install additional packages, such as the annotation package for the Affymetrix Human Genome U95A 2.0 array, use

BiocManager::install("hgu95av2.db", dependencies=TRUE)
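
Once the annotation package is installed, probe set identifiers can be mapped to gene symbols. The sketch below is one possible approach using AnnotationDbi::select(); it assumes that the three probe set IDs, copied from the top table in the Sample Workflow section, belong to this array type. The code is guarded so that it only prints a message when hgu95av2.db is not available.

```r
# Probe set IDs taken from the example top table (assumed to be hgu95av2 IDs)
probe_ids <- c("636_g_at", "39730_at", "1635_at")

if (requireNamespace("hgu95av2.db", quietly = TRUE)) {
    suppressPackageStartupMessages(library(hgu95av2.db))
    # Look up gene symbol and gene name for each probe set ID
    annot <- AnnotationDbi::select(
        hgu95av2.db,
        keys    = probe_ids,
        columns = c("SYMBOL", "GENENAME"),
        keytype = "PROBEID"
    )
    print(annot)
} else {
    message("hgu95av2.db is not installed; install it with BiocManager first")
}
```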

Package installation is required only once per R installation. View a full list of available packages.

To use the affy and limma packages, evaluate the commands

library("affy")
library("limma")

These commands are required once in each R session.

4 Exploring Package Content

Packages have extensive help pages, and include vignettes highlighting common use cases. The help pages and vignettes are available from within R. After loading a package, use syntax like

help(package="limma")
?topTable

to obtain an overview of the help available for the limma package and for the topTable function, and

browseVignettes(package="limma")

to view vignettes (providing a more comprehensive introduction to package functionality) in the limma package. Use

help.start()

to open a web page containing comprehensive help resources.
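
Beyond the help pages, it can be useful to search a package's exported names directly. The helpers below, list_exports() and find_exports(), are hypothetical names introduced for this sketch; they rely only on base R's getNamespaceExports(). The stats package is used as a stand-in because it ships with every R installation.

```r
# List the names a package exports (hypothetical helper)
list_exports <- function(pkg) {
    sort(getNamespaceExports(pkg))
}

# Search the exported names by regular expression (hypothetical helper)
find_exports <- function(pkg, pattern) {
    grep(pattern, list_exports(pkg), value = TRUE, ignore.case = TRUE)
}

# Example with a package that is always available
print(find_exports("stats", "adjust"))

# With limma installed, a similar call would be:
# find_exports("limma", "^top")
```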

5 Pre-Processing Resources

The following sections give a brief overview of packages useful for pre-processing. More comprehensive workflows can be found in the package documentation (available from the package descriptions) and in Bioconductor books and monographs.

5.1 Affymetrix 3’-biased Arrays

affy, gcrma, affyPLM

  • Require a cdf package, a probe package and an annotation package
  • All these packages are available from Bioconductor via BiocManager::install()
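
The cdf package that affy looks for is named after the chip type recorded in the CEL files. As a rough sketch, affy's cleancdfname() converts a chip type string into a package name. The chip type "HG-Focus" below is an assumption, chosen to match the hgfocuscdf package that appears in the warnings of the sample workflow.

```r
chip_type <- "HG-Focus"  # assumed chip type, for illustration only

if (requireNamespace("affy", quietly = TRUE)) {
    # Derive the name of the cdf package that affy would try to load
    cdf_pkg <- affy::cleancdfname(chip_type)
    print(cdf_pkg)
    # The package could then be installed with BiocManager::install(cdf_pkg)
} else {
    message("affy is not installed; see the Installation and Use section")
}
```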

xps

  • Requires installation of ROOT
  • Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly

5.2 Affymetrix Exon ST Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates cdf, probe and annotation data together
  • These packages are available from Bioconductor via BiocManager::install()
  • Most cases will require a 64-bit computer running Linux and >= 8 GB RAM

exonmap

  • Requires installation of MySQL and Ensembl core database tables
  • Requires specially modified cdf and affy packages
  • Requires a 64-bit computer running Linux and >= 8 GB RAM

xps

  • Requires installation of ROOT
  • Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly
  • Will run on conventional desktop computers

5.3 Affymetrix Gene ST Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates cdf, probe and annotation data together
  • These packages are available from Bioconductor via BiocManager::install()
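
For Gene ST arrays (and, with the same calls, Exon ST arrays), a typical oligo session reads the CEL files and summarizes them with RMA. The function below is a sketch only: preprocess_st() and its cel_dir argument are hypothetical names, and the code assumes that the matching pdInfoPackage is already installed so that read.celfiles() can find it.

```r
# Hypothetical helper: RMA pre-processing of Gene ST / Exon ST CEL files
preprocess_st <- function(cel_dir, target = "core") {
    if (!requireNamespace("oligo", quietly = TRUE))
        stop("oligo is not installed; use BiocManager::install('oligo')")

    # Collect the CEL files in the directory (case-insensitive extension)
    cel_files <- list.files(cel_dir, pattern = "\\.cel$",
                            ignore.case = TRUE, full.names = TRUE)
    if (length(cel_files) == 0)
        stop("no CEL files found in ", cel_dir)

    # Read raw intensities; this step needs the platform's pdInfoPackage
    raw <- oligo::read.celfiles(cel_files)

    # Background-correct, normalize and summarize; target = "core"
    # requests gene-level summaries
    oligo::rma(raw, target = target)
}

# Usage, with a placeholder path:
# eset <- preprocess_st("path/to/celfiles")
```

The resulting ExpressionSet can be passed to lmFit() in the same way as the output of justRMA() in the sample workflow.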

xps

  • Requires installation of ROOT
  • Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly

5.4 Affymetrix SNP Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates cdf, probe, annotation and HapMap data
  • These packages are available from Bioconductor via BiocManager::install()
  • Not yet capable of processing CNV regions in SNP5.0 and SNP6.0

5.5 Affymetrix Tiling Arrays

oligo

  • Requires a pdInfoPackage built using pdInfoBuilder
  • This package collates data from bpmap and cif files

5.6 Nimblegen Arrays

oligo

5.7 Illumina Expression Microarrays

lumi

  • Requires lumi-specific mapping and annotation packages (e.g., lumiHumanAll.db and lumiHumanIDMapping)

beadarray

  • Requires beadarray-specific mapping and annotation packages (e.g., illuminaHumanv1BeadID.db and illuminaHumanV1.db)

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] hgfocuscdf_2.18.0   affy_1.72.0         Biobase_2.54.0     
## [4] BiocGenerics_0.40.0 limma_3.50.0        arrays_1.20.0      
## [7] BiocStyle_2.22.0   
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.7             XVector_0.34.0         GenomeInfoDb_1.30.0   
##  [4] bslib_0.3.1            compiler_4.1.1         BiocManager_1.30.16   
##  [7] jquerylib_0.1.4        bitops_1.0-7           tools_4.1.1           
## [10] zlibbioc_1.40.0        digest_0.6.28          bit_4.0.4             
## [13] memoise_2.0.0          jsonlite_1.7.2         evaluate_0.14         
## [16] RSQLite_2.2.8          preprocessCore_1.56.0  png_0.1-7             
## [19] rlang_0.4.12           DBI_1.1.1              yaml_2.2.1            
## [22] xfun_0.27              fastmap_1.1.0          GenomeInfoDbData_1.2.7
## [25] httr_1.4.2             stringr_1.4.0          knitr_1.36            
## [28] Biostrings_2.62.0      IRanges_2.28.0         S4Vectors_0.32.0      
## [31] sass_0.4.0             vctrs_0.3.8            stats4_4.1.1          
## [34] bit64_4.0.5            R6_2.5.1               AnnotationDbi_1.56.0  
## [37] rmarkdown_2.11         bookdown_0.24          blob_1.2.2            
## [40] magrittr_2.0.1         htmltools_0.5.2        KEGGREST_1.34.0       
## [43] stringi_1.7.5          RCurl_1.98-1.5         cachem_1.0.6          
## [46] crayon_1.4.1           affyio_1.64.0
