Inferring Somatic Signatures from Single Nucleotide Variant Calls

Table of Contents

Julian Gehring (EMBL Heidelberg)

1 Motivation: The Concept Behind Mutational Signatures

Recent publications introduced the concept of identifying mutational signatures from cancer sequencing studies and linked them to potential mutation generation processes [6] [14] [11]. Conceptually, this relates somatically occurring single nucleotide variants (SNVs) to the surrounding sequence which will be referred to as mutational or somatic motifs in the following. Based on the frequency of the motifs occurring in multiple samples, these can be decomposed mathematically into so called mutational signatures. In case of the investigation of tumors, the term somatic signatures will be used here to distinguish them from germline mutations and their generating processes.

The SomaticSignatures package provides an efficient and user-friendly implementation for the extraction of somatic motifs based on a list of somatically mutated genomic sites and the estimation of somatic signatures with a number of statistical approaches.

2 Methodology: From Mutations to Somatic Signatures

The basic idea of somatic signatures is composed of two parts:

Firstly, each somatic mutation is described in relation of the sequence context in which it occurs. As an example, consider a SNV, resulting in the alteration from A in the normal to G in the tumor sample, that is embedded in the sequence context TAC. Thus, the somatic motif can be written as TAC>TGC or T.C A>G. The frequency of these motifs across multiple samples is then represented as a matrix \(M_{ij}\), where \(i\) counts over the motifs and \(j\) over the samples.

In a second step, the matrix \(M\) is numerically decomposed into two matrices \(W\) and \(H\)

$$M_{ij} = \sum_{k=1}^{R} W_{ik} H_{jk}$$

for a fixed number \(R\) of signatures. While \(W\) describes the composition of each signature in term of somatic motifs, \(H\) shows the contribution of the signature to the alterations present in each sample.

3 Workflow and Implementation: Analysis with the SomaticSignatures Package

The SomaticSignatures package offers a framework for inferring signatures of SNVs in a user-friendly and efficient manner for large-scale data sets. A tight integration with standard data representations of the Bioconductor project [1] was a major design goal. Further, it extends the selection of multivariate statistical methods for the matrix decomposition and allows a simple visualization of the results.

For a typical workflow, a set of variant calls and the reference sequence are needed. Ideally, the SNVs are represented as a VRanges object with the genomic location as well as reference and alternative allele defined. The reference sequence can be, for example, a FaFile object, representing an indexed FASTA file, a BSgenome object, or a GmapGenome object. Alternatively, we provide functions to extract the relevant information from other sources of inputs. At the moment, this covers the MuTect [12] variant caller and the h5vc package [15] [9].

Generally, the individual steps of the analysis can be summarized as:

  1. The somatic motifs for each variant are retrieved from the reference sequence with the mutationContext function and converted to a matrix representation with the motifMatrix function.
  2. Somatic signatures are estimated with a method of choice (currently nmfDecomposition and pcaDecomposition).
  3. The somatic signatures and their representation in the samples are assessed with a set of accessor and plotting functions.

To decompose \(M\), the SomaticSignatures package implements two methods:

Non-negative matrix factorization (NMF)
The NMF decomposes \(M\) with the constraint of positive components in \(W\) and \(H\) [4]. The method was used [6] for the identification of mutational signatures, and can be computationally expensive for large data sets.
Principal component analysis (PCA)
The PCA employs the eigenvalue decomposition and is therefore suitable for large data sets [2]. While this is related to the NMF, no constraint on the sign of the elements of \(W\) and \(H\) exists.

Other methods can be supplied through the decomposition argument of the identifySignatures function.

4 Use case: Estimating Somatic Signatures from TCGA WES Studies

In the following, the concept of somatic signatures and the steps for inferring these from an actual biological data set are shown. For the example, somatic variant calls from whole exome sequencing (WES) studies from The Cancer Genome Atlas (TCGA) project will be used, which are part of the SomaticCancerAlterations package.

library(SomaticSignatures)
library(ggplot2)
library(SomaticCancerAlterations)
library(BSgenome.Hsapiens.UCSC.hg19)

4.1 Data: Preproccessing of the TCGA WES Studies

The SomaticCancerAlterations package provides the somatic SNV calls for eight WES studies, each investigating a different cancer type. The metadata summarizes the biological and experimental settings of each study.

sca_metadata = scaMetadata()

print(sca_metadata)
             Cancer_Type        Center NCBI_Build Sequence_Source
   gbm_tcga          GBM broad.mit.edu         37             WXS
   hnsc_tcga        HNSC broad.mit.edu         37         Capture
   kirc_tcga        KIRC broad.mit.edu         37         Capture
   luad_tcga        LUAD broad.mit.edu         37             WXS
   lusc_tcga        LUSC broad.mit.edu         37             WXS
   ov_tcga            OV broad.mit.edu         37             WXS
   skcm_tcga        SKCM broad.mit.edu         37         Capture
   thca_tcga        THCA broad.mit.edu         37             WXS
             Sequencing_Phase      Sequencer Number_Samples
   gbm_tcga           Phase_I Illumina GAIIx            291
   hnsc_tcga          Phase_I Illumina GAIIx            319
   kirc_tcga          Phase_I Illumina GAIIx            297
   luad_tcga          Phase_I Illumina GAIIx            538
   lusc_tcga          Phase_I Illumina GAIIx            178
   ov_tcga            Phase_I Illumina GAIIx            142
   skcm_tcga          Phase_I Illumina GAIIx            266
   thca_tcga          Phase_I Illumina GAIIx            406
             Number_Patients                           Cancer_Name
   gbm_tcga              291               Glioblastoma multiforme
   hnsc_tcga             319 Head and Neck squamous cell carcinoma
   kirc_tcga             293                    Kidney Chromophobe
   luad_tcga             519                   Lung adenocarcinoma
   lusc_tcga             178          Lung squamous cell carcinoma
   ov_tcga               142     Ovarian serous cystadenocarcinoma
   skcm_tcga             264               Skin Cutaneous Melanoma
   thca_tcga             403                    Thyroid carcinoma

The starting point of the analysis is a VRanges object which describes the somatic variants in terms of their genomic locations as well as reference and alternative alleles. For more details about this class and how to construct it, please see the documentation of the VariantAnnotation package [5]. Since the genomic positions are given in the NCBI notation and the references used later are in UCSC notation, the functions ucsc and ncbi are used to easily switch between the two notations. In this example, all mutational calls of a study will be pooled together, in order to find signatures related to a specific cancer type.

sca_vr = scaSNVRanges()

head(sca_vr, 3)
   VRanges object with 3 ranges and 1 metadata column:
         seqnames           ranges strand         ref              alt
                        
     gbm     chr1 [887446, 887446]      +           G                A
     gbm     chr1 [909247, 909247]      +           C                T
     gbm     chr1 [978952, 978952]      +           C                T
             totalDepth       refDepth       altDepth   sampleNames
            
     gbm                                   TCGA-06-5858
     gbm                                   TCGA-32-1977
     gbm                                   TCGA-06-0237
         softFilterMatrix |    study
                  | 
     gbm                  |      gbm
     gbm                  |      gbm
     gbm                  |      gbm
     -------
     seqinfo: 22 sequences from an unspecified genome
     hardFilters: NULL

To get a first impression of the data, we count the number of somatic variants per study.

sort(table(sca_vr$study), decreasing = TRUE)
   
     luad   skcm   hnsc   lusc   kirc    gbm   thca     ov 
   208724 200589  67125  61485  24158  19938   6716   5872

4.2 Motifs: Extracting the Sequence Context of Somatic Variants

In a first step, the sequence motif for each variant is extracted based on the reference sequence. Here, the BSgenomes object for the human hg19 reference is used. However, all objects with a defined getSeq method can serve as the reference, e.g. an indexed FASTA file. Additionally, we transform all motifs to have a pyrimidine base (C or T) as a reference base [14].

sca_motifs = mutationContext(sca_vr, BSgenome.Hsapiens.UCSC.hg19, unify = TRUE)

To continue with the estimation of the somatic signatures, the matrix \(M\) of the form {motifs × studies} is constructed. The normalize argument specifies that frequencies rather than the actual counts are returned.

sca_mm = motifMatrix(sca_motifs, group = "study", normalize = TRUE)

head(round(sca_mm, 4))
             gbm   hnsc   kirc   luad   lusc     ov   skcm   thca
   CA A.A 0.0083 0.0098 0.0126 0.0200 0.0165 0.0126 0.0014 0.0077
   CA A.C 0.0093 0.0082 0.0121 0.0217 0.0156 0.0192 0.0009 0.0068
   CA A.G 0.0026 0.0061 0.0046 0.0144 0.0121 0.0060 0.0004 0.0048
   CA A.T 0.0057 0.0051 0.0070 0.0134 0.0100 0.0092 0.0007 0.0067
   CA C.A 0.0075 0.0143 0.0215 0.0414 0.0390 0.0128 0.0060 0.0112
   CA C.C 0.0075 0.0111 0.0138 0.0415 0.0275 0.0143 0.0018 0.0063

The observed occurrence of the motifs, also termed somatic spectrum, can be visualized across studies, which gives a first impression of the data. The distribution of the motifs clearly varies between the studies.

plotMutationSpectrum(sca_motifs, "study")
Mutation spectrum over studies

Mutation spectrum over studies

4.3 Decomposition: Inferring Somatic Signatures

The somatic signatures can be estimated with each of the statistical methods implemented in the package. Here, we will use the NMF and PCA, and compare the results. Prior to the estimation, the number \(R\) of signatures to obtain has to be fixed; in this example, the data will be decomposed into 5 signatures.

n_sigs = 5

sigs_nmf = identifySignatures(sca_mm, n_sigs, nmfDecomposition)

sigs_pca = identifySignatures(sca_mm, n_sigs, pcaDecomposition)

The results contains the decomposed matrices stored in a list and can be accessed using standard R accessor functions.

sigs_nmf
   MutationalSignatures:
     Samples (8): gbm, hnsc, ..., skcm, thca
     Signatures (5): S1, S2, S3, S4, S5
     Motifs (96): CA A.A, CA A.C, ..., TG T.G, TG T.T
sigs_pca
   MutationalSignatures:
     Samples (8): gbm, hnsc, ..., skcm, thca
     Signatures (5): S1, S2, S3, S4, S5
     Motifs (96): CA A.A, CA A.C, ..., TG T.G, TG T.T

4.4 Visualization: Exploration of Signatures and Samples

To explore the results for the TCGA data set, we will use the plotting functions. All figures are generated with the ggplot2 package, and thus, their properties and appearances can also be modified at a later stage.

Focusing on the results of the NMF first, the five somatic signatures (named S1 to S5) can be visualized either as a heatmap or as a barchart.

plotSignatureMap(sigs_nmf) + ggtitle("Somatic Signatures: NMF - Heatmap")
Composition of somatic signatures estimated with the NMF, represented as a heatmap.

Composition of somatic signatures estimated with the NMF, represented as a heatmap.

plotSignatures(sigs_nmf) + ggtitle("Somatic Signatures: NMF - Barchart")
Composition of somatic signatures estimated with the NMF, represented as a barchart.

Composition of somatic signatures estimated with the NMF, represented as a barchart.

plotObservedSpectrum(sigs_nmf)
plot of chunk unnamed-chunk-2
plotFittedSpectrum(sigs_nmf)
plot of chunk unnamed-chunk-3

Each signature represents different properties of the somatic spectrum observed in the data. While signature S1 is mainly characterized by selective C>T alterations, others as S4 and S5 show a broad distribution across the motifs.

In addition, the contribution of the signatures in each study can be represented with the same sets of plots. Signature S1 and S3 are strongly represented in the GBM and SKCM study, respectively. Other signatures show a weaker association with a single cancer type.

plotSampleMap(sigs_nmf)
Occurrence of signatures estimated with the NMF, represented as a heatmap.

Occurrence of signatures estimated with the NMF, represented as a heatmap.

plotSamples(sigs_nmf)
Occurrence of signatures estimated with the NMF, represented as a barchart.

Occurrence of signatures estimated with the NMF, represented as a barchart.

In the same way as before, the results of the PCA can be visualized. In contrast to the NMF, the signatures also contain negative values, indicating the depletion of a somatic motif.

Comparing the results of the two methods, we can see similar characteristics between the sets of signatures, for example S1 of the NMF and S2 of the PCA.

4.4.1 PCA

plotSignatureMap(sigs_pca) + ggtitle("Somatic Signatures: PCA - Heatmap")
Composition of somatic signatures estimated with the PCA, represented as a heatmap.

Composition of somatic signatures estimated with the PCA, represented as a heatmap.

plotSignatures(sigs_pca) + ggtitle("Somatic Signatures: PCA - Barchart")
Composition of somatic signatures estimated with the PCA, represented as a barchart.

Composition of somatic signatures estimated with the PCA, represented as a barchart.

plotObservedSpectrum(sigs_pca)
plot of chunk unnamed-chunk-4
plotFittedSpectrum(sigs_pca)
plot of chunk unnamed-chunk-5

4.5 Extensions: Normalization of Sequence Motif Frequencies and Batch Effects

When investigating somatic signatures between samples from different studies, corrections for technical confounding factors should be considered. In our use case of the TCGA WES studies, this is of minor influence due to similar sequencing technology and variant calling methods across the studies. Approaches for the identification of so termed batch effects have been proposed [3] [8] and could be adapted to the setting of somatic signatures with existing implementations (the sva and leapp packages). While this correction is not performed here, we exemplify the usage by taking the different sequencing technologies of the studies into account.

library(sva)

df = as(sca_metadata, "data.frame") ## sample x covariable
pheno = data.frame(s = unlist(df[ ,"Sequence_Source"]), c = unlist(df[ ,"Cancer_Type"]))
rownames(pheno) = gsub("(.*)_.*", "\\1", rownames(pheno))
mod = model.matrix(~ s + c, data = pheno)
mod0 = model.matrix(~ c, data = pheno)

sv = sva(sca_mm, mod, mod0, method = "irw")

If comparisons are performed across samples or studies with different capture targets, for example by comparing whole exome with whole genome sequencing, further corrections for the frequency of sequence motifs can be taken into account. The kmerFrequency function provides the basis for calculating the occurrence of k-mers over a set of ranges of a reference sequence.

As an example, we compute the frequency of 3-mers for the human chromosome 1, based on a sample of 100'000 locations. Analogously, the k-mer occurrence across the human exome can be obtained easily.

k = 3
n = 1e5
chrs = "chr1"

chr1_ranges = as(seqinfo(BSgenome.Hsapiens.UCSC.hg19), "GRanges")
chr1_ranges = keepSeqlevels(chr1_ranges, chrs)

k3_chr1 = kmerFrequency(BSgenome.Hsapiens.UCSC.hg19, n, k, chr1_ranges)

k3_chr1

With the normalizeMotifs function, the frequency of motifs can be adjusted. Here, we will transform our results of the TCGA WES studies to have the same motif distribution as of a whole-genome analysis. The kmers dataset contains the estimated frequency of 3-mers across the human genome and exome.

head(sca_mm)

data(kmers)
norms = k3wg / k3we
head(norms)

sca_norm = normalizeMotifs(sca_mm, norms)

head(sca_norm)

5 Alternatives: Inferring Somatic Signatures with Different Approaches

For the identification of somatic signatures, other methods and implementations exist. The original framework [6] [11] proposed for this is based on the NMF and available for the Matlab programming language [7]. In extension, a probabilistic approach based on Poisson processes has been proposed [13] and implemented [10].

6 Frequently Asked Questions

6.1 Citing SomaticSignatures

If you use the SomaticSignatures package in your work, please cite it:

citation("SomaticSignatures")
   
     Julian Gehring, Bernd Fischer, Michael Lawrence, Wolfgang
     Huber (2014): SomaticSignatures: Inferring Mutational
     Signatures from Single Nucleotide Variants. bioRxiv
     preprint, http://dx.doi.org/10.1101/010686
   
   A BibTeX entry for LaTeX users is
   
     @Article{,
       title = {SomaticSignatures: Inferring Mutational Signatures from Single Nucleotide Variants.},
       author = {Julian Gehring and Bernd Fischer and Michael Lawrence and Wolfgang Huber},
       year = {2014},
       journal = {bioRxiv},
       doi = {10.1101/010686},
       url = {http://dx.doi.org/10.1101/010686},
     }

6.2 Getting help

We welcome emails with questions or suggestions about our software, and want to ensure that we eliminate issues if and when they appear. We have a few requests to optimize the process:

  • All emails and follow-up questions should take place over the Bioconductor mailing list, which serves as a repository of information.
  • The subject line should contain SomaticSignatures and a few words describing the problem. First search the Bioconductor mailing list, for past threads which might have answered your question.
  • If you have a question about the behavior of a function, read the sections of the manual page for this function by typing a question mark and the function name, e.g. ?mutationContext. Additionally, read through the vignette to understand the interplay between different functions of the package. We spend a lot of time documenting individual functions and the exact steps that the software is performing.
  • Include all of your R code related to the question you are asking.
  • Include complete warning or error messages, and conclude your message with the full output of sessionInfo().

6.3 Installing the package

Before you want to install the SomaticSignatures package, please ensure that you have the latest version of R and Bioconductor installed. For details on this, please have a look at the help packages for R and Bioconductor. Then you can open R and run the following commands which will install the latest release version of SomaticSignatures:

source("http://bioconductor.org/biocLite.R")
biocLite("SomaticSignatures")

6.4 Working with VRanges

A central object in the workflow of SomaticSignatures is the VRanges class which is part of the VariantAnnotation package. It builds upon the commonly used GRanges class of the GenomicRanges package. Essentially, each row represents a variant in terms of its genomic location as well as its reference and alternative allele.

library(VariantAnnotation)

There are multiple ways of converting its own variant calls into a VRanges object. One can for example import them from a VCF file with the readVcf function or employ the readMutect function for importing variant calls from the MuTect caller directly. Further, one can also construct it from any other format in the form of:

vr = VRanges(
    seqnames = "chr1",
    ranges = IRanges(start = 1000, width = 1),
    ref = "A",
    alt = "C")

vr
   VRanges object with 1 range and 0 metadata columns:
         seqnames       ranges strand         ref              alt
                    
     [1]     chr1 [1000, 1000]      +           A                C
             totalDepth       refDepth       altDepth   sampleNames
            
     [1]                                           
         softFilterMatrix
                 
     [1]                 
     -------
     seqinfo: 1 sequence from an unspecified genome; no seqlengths
     hardFilters: NULL

7 References

References

[1] Robert C. Gentleman, Vincent J. Carey, Douglas M. Bates, Ben Bolstad, Marcel Dettling, Sandrine Dudoit, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Gunther Sawitzki, Colin Smith, Gordon Smyth, Luke Tierney, Jean YH Yang, and Jianhua Zhang. Bioconductor: open software development for computational biology and bioinformatics. Genome Biology, 5(10):R80, September 2004. PMID: 15461798. [ DOI | http ]
[2] Wolfram Stacklies, Henning Redestig, Matthias Scholz, Dirk Walther, and Joachim Selbig. pcaMethodsa bioconductor package providing PCA methods for incomplete data. Bioinformatics, 23(9):1164-1167, May 2007. PMID: 17344241. [ DOI | http ]
[3] Jeffrey T Leek and John D Storey. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet, 3(9):e161, September 2007. [ DOI | http ]
[4] Renaud Gaujoux and Cathal Seoighe. A flexible r package for nonnegative matrix factorization. BMC Bioinformatics, 11(1):367, July 2010. PMID: 20598126. [ DOI | http ]
[5] Valerie Obenchain, Martin Morgan, and Michael Lawrence. VariantAnnotation: annotation of genetic variants, 2011. [ .html ]
[6] Serena Nik-Zainal, Ludmil B. Alexandrov, David C. Wedge, Peter Van Loo, Christopher D. Greenman, Keiran Raine, David Jones, Jonathan Hinton, John Marshall, Lucy A. Stebbings, Andrew Menzies, Sancha Martin, Kenric Leung, Lina Chen, Catherine Leroy, Manasa Ramakrishna, Richard Rance, King Wai Lau, Laura J. Mudie, Ignacio Varela, David J. McBride, Graham R. Bignell, Susanna L. Cooke, Adam Shlien, John Gamble, Ian Whitmore, Mark Maddison, Patrick S. Tarpey, Helen R. Davies, Elli Papaemmanuil, Philip J. Stephens, Stuart McLaren, Adam P. Butler, Jon W. Teague, Göran Jönsson, Judy E. Garber, Daniel Silver, Penelope Miron, Aquila Fatima, Sandrine Boyault, Anita Langerød, Andrew Tutt, John W.M. Martens, Samuel A.J.R. Aparicio, Ake Borg, Anne Vincent Salomon, Gilles Thomas, Anne-Lise Børresen-Dale, Andrea L. Richardson, Michael S. Neuberger, P. Andrew Futreal, Peter J. Campbell, and Michael R. Stratton. Mutational processes molding the genomes of 21 breast cancers. Cell, 149(5):979-993, May 2012. [ DOI | http ]
[7] Ludmil Alexandrov. WTSI mutational signature framework, October 2012. [ http ]
[8] Yunting Sun, Nancy R. Zhang, and Art B. Owen. Multiple hypothesis testing adjusted for latent variables, with an application to the AGEMAP gene expression data. The Annals of Applied Statistics, 6(4):1664-1688, December 2012. Zentralblatt MATH identifier: 06141543; Mathematical Reviews number (MathSciNet): MR3058679. [ DOI | http ]
[9] Paul Theodor Pyl. h5vc: Managing alignment tallies using a hdf5 backend, 2013. [ .html ]
[10] Andrej Fischer. EMu: expectation-maximisation inference of mutational signatures, 2013. [ http ]
[11] Ludmil B. Alexandrov, Serena Nik-Zainal, David C. Wedge, Peter J. Campbell, and Michael R. Stratton. Deciphering signatures of mutational processes operative in human cancer. Cell Reports, 3(1):246-259, January 2013. [ DOI | http ]
[12] Kristian Cibulskis, Michael S. Lawrence, Scott L. Carter, Andrey Sivachenko, David Jaffe, Carrie Sougnez, Stacey Gabriel, Matthew Meyerson, Eric S. Lander, and Gad Getz. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nature Biotechnology, advance online publication, February 2013. [ DOI | .html ]
[13] Andrej Fischer, Christopher Jr Illingworth, Peter J Campbell, and Ville Mustonen. EMu: probabilistic inference of mutational processes and their localization in the cancer genome. Genome biology, 14(4):R39, April 2013. PMID: 23628380. [ DOI ]
[14] Ludmil B. Alexandrov, Serena Nik-Zainal, David C. Wedge, Samuel A. J. R. Aparicio, Sam Behjati, Andrew V. Biankin, Graham R. Bignell, Niccolò Bolli, Ake Borg, Anne-Lise Børresen-Dale, Sandrine Boyault, Birgit Burkhardt, Adam P. Butler, Carlos Caldas, Helen R. Davies, Christine Desmedt, Roland Eils, Jórunn Erla Eyfjörd, John A. Foekens, Mel Greaves, Fumie Hosoda, Barbara Hutter, Tomislav Ilicic, Sandrine Imbeaud, Marcin Imielinsk, Natalie Jäger, David T. W. Jones, David Jones, Stian Knappskog, Marcel Kool, Sunil R. Lakhani, Carlos López-Otín, Sancha Martin, Nikhil C. Munshi, Hiromi Nakamura, Paul A. Northcott, Marina Pajic, Elli Papaemmanuil, Angelo Paradiso, John V. Pearson, Xose S. Puente, Keiran Raine, Manasa Ramakrishna, Andrea L. Richardson, Julia Richter, Philip Rosenstiel, Matthias Schlesner, Ton N. Schumacher, Paul N. Span, Jon W. Teague, Yasushi Totoki, Andrew N. J. Tutt, Rafael Valdés-Mas, Marit M. van Buuren, Laura van t Veer, Anne Vincent-Salomon, Nicola Waddell, Lucy R. Yates, Australian Pancreatic Cancer Genome Initiative, ICGC Breast Cancer Consortium, ICGC MMML-Seq Consortium, Icgc PedBrain, Jessica Zucman-Rossi, P. Andrew Futreal, Ultan McDermott, Peter Lichter, Matthew Meyerson, Sean M. Grimmond, Reiner Siebert, Elías Campo, Tatsuhiro Shibata, Stefan M. Pfister, Peter J. Campbell, and Michael R. Stratton. Signatures of mutational processes in human cancer. Nature, 500(7463):415-421, August 2013. [ DOI | .html ]
[15] Paul Theodor Pyl, Julian Gehring, Bernd Fischer, and Wolfgang Huber. h5vc: Scalable nucleotide tallies with HDF5. Bioinformatics, page btu026, January 2014. PMID: 24451629. [ DOI | http ]

8 Session Info

   R version 3.1.2 (2014-10-31)
   Platform: x86_64-unknown-linux-gnu (64-bit)
   
   locale:
    [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
    [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
    [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
    [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
    [9] LC_ADDRESS=C               LC_TELEPHONE=C            
   [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
   
   attached base packages:
   [1] stats4    parallel  stats     graphics  grDevices utils    
   [7] datasets  methods   base     
   
   other attached packages:
    [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0
    [2] BSgenome_1.34.0                  
    [3] rtracklayer_1.26.1               
    [4] SomaticCancerAlterations_1.1.1   
    [5] ggplot2_1.0.0                    
    [6] SomaticSignatures_2.2.3          
    [7] bigmemory_4.4.6                  
    [8] BH_1.54.0-4                      
    [9] bigmemory.sri_0.1.3              
   [10] Biobase_2.26.0                   
   [11] VariantAnnotation_1.12.2         
   [12] Rsamtools_1.18.1                 
   [13] Biostrings_2.34.0                
   [14] XVector_0.6.0                    
   [15] GenomicRanges_1.18.1             
   [16] GenomeInfoDb_1.2.2               
   [17] IRanges_2.0.0                    
   [18] S4Vectors_0.4.0                  
   [19] BiocGenerics_0.12.0              
   [20] knitr_1.7                        
   
   loaded via a namespace (and not attached):
    [1] AnnotationDbi_1.28.1    BBmisc_1.8             
    [3] BatchJobs_1.5           BiocParallel_1.0.0     
    [5] DBI_0.3.1               Formula_1.1-2          
    [7] GGally_0.4.8            GenomicAlignments_1.2.0
    [9] GenomicFeatures_1.18.2  Hmisc_3.14-5           
   [11] MASS_7.3-35             NMF_0.20.5             
   [13] OrganismDbi_1.8.0       RBGL_1.42.0            
   [15] RColorBrewer_1.0-5      RCurl_1.95-4.3         
   [17] RSQLite_1.0.0           Rcpp_0.11.3            
   [19] XML_3.98-1.1            acepack_1.3-3.3        
   [21] base64enc_0.1-2         biomaRt_2.22.0         
   [23] biovizBase_1.14.0       bitops_1.0-6           
   [25] brew_1.0-6              checkmate_1.5.0        
   [27] cluster_1.15.3          codetools_0.2-9        
   [29] colorspace_1.2-4        dichromat_2.0-0        
   [31] digest_0.6.4            doParallel_1.0.8       
   [33] evaluate_0.5.5          exomeCopy_1.12.0       
   [35] fail_1.2                foreach_1.4.2          
   [37] foreign_0.8-61          formatR_1.0            
   [39] ggbio_1.14.0            graph_1.44.0           
   [41] grid_3.1.2              gridBase_0.4-7         
   [43] gridExtra_0.9.1         gtable_0.1.2           
   [45] highr_0.4               iterators_1.0.7        
   [47] labeling_0.3            lattice_0.20-29        
   [49] latticeExtra_0.6-26     markdown_0.7.4         
   [51] mime_0.2                munsell_0.4.2          
   [53] nnet_7.3-8              pcaMethods_1.56.0      
   [55] pkgmaker_0.22           plyr_1.8.1             
   [57] proto_0.3-10            proxy_0.4-13           
   [59] registry_0.2            reshape_0.8.5          
   [61] reshape2_1.4            rngtools_1.2.4         
   [63] rpart_4.1-8             scales_0.2.4           
   [65] sendmailR_1.2-1         splines_3.1.2          
   [67] stringr_0.6.2           survival_2.37-7        
   [69] tools_3.1.2             xtable_1.7-4           
   [71] zlibbioc_1.12.0