
1 Background

The aim of mimager is to simplify the process of imaging microarrays and inspecting them for spatial artifacts by providing a single visualization function (mimage()) that works consistently with many of Bioconductor’s microarray object classes. Currently, mimager supports AffyBatch objects from the affy package, PLMset objects from the affyPLM package, ExpressionFeatureSet, ExonFeatureSet, GeneFeatureSet and SnpFeatureSet classes from the oligoClasses package, and the oligoPLM class from the oligo package.

2 Installation

You can install the latest release of mimager from Bioconductor:

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("mimager")

3 Basic visualization

In this vignette we’ll work with the Dilution data provided by the affydata package.

library(mimager)
library(affydata)
data("Dilution")

The most basic functionality of mimage() is to produce a visualization of the probe-level intensities arranged by their physical location on the chip:

mimage(Dilution, select = 1)

NOTE:

For microarray platforms with perfect-match (PM) and mismatch (MM) probes, such as the Affymetrix Human Genome U95Av2 chip used here, only PM probes are included by default. To select only MM probes or both probe types, set probes="mm" or probes="all", respectively. Excluding a probe type leaves many missing values, which can produce rasterization artifacts that make the microarray images less informative. To compensate, mimage() fills in empty rows with values from neighboring rows. By default, a row is considered empty if more than 60% of its values are missing. This threshold can be changed by setting empty.thresh, and row filling can be disabled altogether by changing empty.rows = "fill" to empty.rows = "ignore".
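
To make the row-filling idea concrete, here is a minimal base R sketch on a toy matrix. The helper fill_empty_rows() is hypothetical and is not mimager's implementation. It assumes that an "empty" row is simply overwritten with the nearest non-empty row, which may differ from what mimage() does internally.

```r
# Hypothetical helper (not part of mimager): replace mostly-missing rows
# with a copy of the nearest non-empty row.
fill_empty_rows <- function(x, empty.thresh = 0.6) {
  # A row counts as empty if more than empty.thresh of its values are NA
  empty <- rowMeans(is.na(x)) > empty.thresh
  if (all(empty) || !any(empty)) return(x)
  donors <- which(!empty)
  for (i in which(empty)) {
    # Nearest non-empty row; on a tie the row above is used
    nearest <- donors[which.min(abs(donors - i))]
    x[i, ] <- x[nearest, ]
  }
  x
}

# Toy 6 x 4 "chip" in which every second row belongs to an excluded probe type
chip <- matrix(as.numeric(1:24), nrow = 6)
chip[c(2, 4, 6), ] <- NA

filled <- fill_empty_rows(chip)
anyNA(filled)
```

In this toy example rows 2, 4 and 6 are filled from rows 1, 3 and 5, so the image no longer contains empty stripes.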

4 Visualizing multiple arrays

One of mimager’s nicest features is the ability to visualize multiple microarrays simultaneously. Simply passing the Dilution object to mimage() will produce a grid of images, one for each chip in the dataset.

mimage(Dilution)

The order of images is determined by the order of samples in the microarray data object. You can override this order or specify a subset of samples to include using the select argument, which accepts either a numeric vector of sample indices or a character vector of sample names.

mimage() will compute a sensible layout for the image grid based on the dimensions of the current graphics device, or you can specify it manually by setting nrow, ncol or both.

mimage(Dilution, select = c("10A", "20A", "10B", "20B"), nrow = 1)
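
The relationship between nrow, ncol and the number of images can be sketched in a few lines of base R. The helper grid_dims() is hypothetical and only illustrates the arithmetic: when neither dimension is given it falls back to a near-square grid, whereas mimage() is described above as taking the graphics device dimensions into account.

```r
# Hypothetical helper (not part of mimager): grid dimensions for k images.
grid_dims <- function(k, nrow = NULL, ncol = NULL) {
  if (is.null(nrow) && is.null(ncol)) {
    # Neither given: assume a near-square grid
    ncol <- ceiling(sqrt(k))
  }
  # Derive whichever dimension is still missing from the other one
  if (is.null(nrow)) nrow <- ceiling(k / ncol)
  if (is.null(ncol)) ncol <- ceiling(k / nrow)
  c(nrow = nrow, ncol = ncol)
}

grid_dims(4, nrow = 1)  # one row of four images
grid_dims(4)            # 2 x 2 grid
grid_dims(5, ncol = 2)  # three rows, last cell left empty
```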

5 Probe-level linear models

The fitPLM() function from the affyPLM package and the fitProbeLevelModel() function from the oligo package can both be used to fit probe-level linear models (PLMs). As demonstrated by Bolstad (2004), replacing raw probe intensities with the residuals or weights from PLMs can reveal spatial artifacts that might otherwise go undetected. mimager supports visualizing PLM objects generated by either package. One notable difference when working with PLM objects (as opposed to microarray objects) is that the type argument refers to the model value type (i.e., residuals or weights) rather than the probe type.

library(affyPLM)
DilutionPLM <- fitPLM(Dilution)

mimage(DilutionPLM)
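
To see why residuals can expose artifacts that raw intensities hide, consider the following toy sketch in base R. It does not use affyPLM or oligo: median polish (stats::medpolish()) is swapped in as a simple robust stand-in for a probe-level model fit, and the probe and chip effects are made-up numbers chosen for illustration.

```r
# Toy log2 intensities for one probeset: 5 probes (rows) x 4 chips (columns),
# built as probe effect + chip effect so an additive model fits exactly.
probe_effect <- c(0, 1, -1, 0.5, 2)
chip_effect  <- c(8, 9, 10, 11)
y <- outer(probe_effect, chip_effect, "+")

# Simulate a localized artifact: one probe on chip 3 reads 3 log2 units high.
y[2, 3] <- y[2, 3] + 3

# Median polish as a robust stand-in for fitting a probe-level model.
fit <- medpolish(y, trace.iter = FALSE)
res <- fit$residuals

# Locate cells whose residual is large in absolute value.
which(abs(res) > 1, arr.ind = TRUE)
```

The contaminated value (14) falls within the range of the raw intensities (7 to 15), so it does not stand out on its own. Once the probe and chip effects are removed, it is the only cell with a large residual.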

6 Transformations

Prior to visualization, the supplied microarray object is converted to an \(m \times n \times k\) array, where \(m\) and \(n\) correspond to the physical dimensions of the microarray chip and \(k\) is the number of chips (or samples) in the dataset. The array values are then transformed by whatever function is passed to the transform argument. For example, the default for AffyBatch objects is transform = log2, which log-transforms the probe intensities.

mimager provides two transformation functions that have proven useful in identifying spatial biases on microarrays. The first, arank(), computes the rank of values within each matrix of a three-dimensional array, an approach recommended by the oligo package’s vignette. The second, arle(), computes relative log expression (RLE) values, where each value is compared with a “standard reference” constructed by taking, at each position on the chip, the median value across all samples. The utility of this approach for identifying spatial artifacts was first demonstrated by Reimers & Weinstein (2005), and it often produces results similar to those obtained with PLMs:

# use a divergent color palette for RLEs
div.colors <- scales::brewer_pal(palette = "RdBu")(9)

mimage(Dilution, transform = arle, legend.label = "RLE", colors = div.colors)
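
The RLE calculation described above can be sketched by hand in base R. This is an illustration on a toy array, not the source of arle(): the sketch takes log2 of the intensities explicitly before comparing them with the reference, and whether arle() handles the log step in the same way is an assumption.

```r
# Toy 2 x 2 x 3 array of intensities: three "chips" that agree everywhere
# except for one inflated position on the third chip.
x <- array(c(100, 200, 400, 800,
             100, 200, 400, 800,
             100, 200, 400, 3200), dim = c(2, 2, 3))
logx <- log2(x)

# Standard reference: median across chips at each physical position
ref <- apply(logx, c(1, 2), median)

# RLE: subtract the reference from every chip
rle_vals <- sweep(logx, c(1, 2), ref)

rle_vals[, , 3]
```

Every RLE value is 0 except position [2, 2] on the third chip, where the intensity is four times the reference and the RLE is therefore 2. Because the values are centered on zero, a divergent palette such as the one used above is a natural choice.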

Any function can be passed to transform provided that it expects and returns a three-dimensional array. The arank() function is necessary because base::rank() violates this assumption and returns a numeric vector. You can experiment with custom transformation functions by using marray() to convert a Bioconductor microarray object to a three-dimensional array.

DilutionArray <- marray(Dilution)
dim(DilutionArray)
## [1] 640 640   4
DilutionRank <- arank(DilutionArray)
dim(DilutionRank)
## [1] 640 640   4
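
If you want to experiment without loading any microarray data, the same contract can be tried out on a toy array in base R. Both functions below are hypothetical examples written for this sketch. rank_by_chip() shows one way to rank within each chip while keeping the array shape (it is not the source of arank()), and center_by_chip() is a made-up custom transformation.

```r
# Toy 2 x 2 x 2 array standing in for the output of marray()
a <- array(c(3, 1, 2, 4,
             40, 10, 30, 20), dim = c(2, 2, 2))

# base::rank() drops the dim attribute, so it cannot be used directly
is.null(dim(rank(a)))

# Hypothetical shape-preserving alternative: rank within each chip
rank_by_chip <- function(x) {
  array(apply(x, 3, rank), dim = dim(x))
}

# Hypothetical custom transformation: center each chip on its own median
center_by_chip <- function(x) {
  sweep(x, 3, apply(x, 3, median, na.rm = TRUE))
}

dim(rank_by_chip(a))
dim(center_by_chip(a))
```

Because both functions accept and return a three-dimensional array, a function written in this style should be usable as the transform argument, for example transform = center_by_chip.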

References

Bolstad, B.M. 2004. “Low Level Analysis of High-Density Oligonucleotide Data: Background, Normalization and Summarization.” PhD thesis, University of California, Berkeley.

Reimers, Mark, and John N Weinstein. 2005. “Quality Assessment of Microarrays: Visualization of Spatial Artifacts and Quantitation of Regional Biases.” BMC Bioinformatics 6 (July): 166. https://doi.org/10.1186/1471-2105-6-166.