Welcome to the RiboCrypt package. RiboCrypt is an R package for interactive visualization in genomics. It works with any NGS-based method, with particular emphasis on Ribo-seq data visualization.
This vignette walks you through its usage with examples.
RiboCrypt currently supports creating interactive browser views for NGS tracks.
If you’re not familiar with terms like “p-shifting” or “p-site offset”, it’s best to walk through the ORFikOverview vignette first, especially chapter 6, “RiboSeq footprints automatic shift detection and shifting”.
The simplest way to create a browser window is to use the programming syntax of the ORFik package.
library(RiboCrypt) # This package
library(ORFik) # The backend package for RiboCrypt
Now load the experiment and its coding sequences (cds). Here we use the ORFik experiment data structure; to familiarize yourself with the concept, see the ORFikExperiment vignette:
https://bioconductor.org/packages/release/bioc/vignettes/ORFik/inst/doc/ORFikExperiment.html
df <- RiboCrypt.template.experiment()
cds <- loadRegion(df, "cds")
cds # gene annotation
## GRangesList object of length 6:
## $ENSTTEST10001
## GRanges object with 1 range and 3 metadata columns:
## seqnames ranges strand | cds_id cds_name exon_rank
## <Rle> <IRanges> <Rle> | <integer> <character> <integer>
## [1] chr1 446-751 + | 1 <NA> 2
## -------
## seqinfo: 6 sequences from an unspecified genome
##
## $ENSTTEST10002
## GRanges object with 1 range and 3 metadata columns:
## seqnames ranges strand | cds_id cds_name exon_rank
## <Rle> <IRanges> <Rle> | <integer> <character> <integer>
## [1] chr2 446-751 - | 2 <NA> 2
## -------
## seqinfo: 6 sequences from an unspecified genome
##
## $ENSTTEST10003
## GRanges object with 1 range and 3 metadata columns:
## seqnames ranges strand | cds_id cds_name exon_rank
## <Rle> <IRanges> <Rle> | <integer> <character> <integer>
## [1] chr3 446-751 + | 3 <NA> 2
## -------
## seqinfo: 6 sequences from an unspecified genome
##
## ...
## <3 more elements>
df # let's look at libraries the experiment contains
## experiment: ORFik with 4 library types and 16 runs
## Tjeldnes et al.
## libtype rep condition
## 1: CAGE 1 Mutant
## 2: CAGE 2 Mutant
## 3: CAGE 1 WT
## 4: CAGE 2 WT
## 5: PAS 1 Mutant
## 6: PAS 2 Mutant
## 7: PAS 1 WT
## 8: PAS 2 WT
## 9: RFP 1 Mutant
## 10: RFP 2 Mutant
## 11: RFP 1 WT
## 12: RFP 2 WT
## 13: RNA 1 Mutant
## 14: RNA 2 Mutant
## 15: RNA 1 WT
## 16: RNA 2 WT
We can see that the experiment consists of CAGE (5’ ends), PAS (3’ ends), RNA-seq and Ribo-seq libraries in wild-type and mutant conditions.
An ORFik experiment can be subsetted either by index or by column value. This is useful for picking the libraries to display.
df[4:6,]
## experiment: ORFik with 2 library types and 3 runs
## Tjeldnes et al.
## libtype rep condition
## 1: CAGE 2 WT
## 2: PAS 1 Mutant
## 3: PAS 2 Mutant
df[which(df$libtype == "CAGE"),]
## experiment: ORFik with 1 library type and 4 runs
## Tjeldnes et al.
## libtype rep condition
## 1: CAGE 1 Mutant
## 2: CAGE 2 Mutant
## 3: CAGE 1 WT
## 4: CAGE 2 WT
df[which(df$condition == "Mutant"),]
## experiment: ORFik with 4 library types and 8 runs
## Tjeldnes et al.
## libtype rep
## 1: CAGE 1
## 2: CAGE 2
## 3: PAS 1
## 4: PAS 2
## 5: RFP 1
## 6: RFP 2
## 7: RNA 1
## 8: RNA 2
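The two column filters shown above can also be combined into a single logical index. The sketch below illustrates the idiom on a plain data.frame, `libs`, which is a hypothetical stand-in that mirrors the library table printed earlier; it is not an ORFik experiment object. Whether the identical expression behaves the same way on `df` is an assumption based on the single-column examples above.

```r
# Hypothetical stand-in for the experiment's library table (NOT an ORFik object);
# rows follow the same order as the 16 runs printed above
libs <- data.frame(
  libtype   = rep(c("CAGE", "PAS", "RFP", "RNA"), each = 4),
  rep       = rep(c(1, 2), times = 8),
  condition = rep(rep(c("Mutant", "WT"), each = 2), times = 4)
)

# Combine two column filters with `&`; which() turns the logical mask into row indices
wt_rfp <- which(libs$libtype == "RFP" & libs$condition == "WT")

# Subset by the resulting indices, as in the single-column examples
libs[wt_rfp, ]
```

In this toy table the selected rows are the two wild-type Ribo-seq replicates.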
A browser window is created with the multiOmicsPlot functions. They display the libraries subsetted from the experiment (df) from top to bottom. First, let’s look at a single library of each type, with the cds extended by 30 bases on each side. In the resulting plot you can zoom in and inspect coverage:
multiOmicsPlot_ORFikExp(extendLeaders(extendTrailers(cds[3], 30), 30), annotation = cds, df = df[c(1, 5, 9, 13), ],
frames_type = "columns", custom_motif = "CTG")
From top to bottom, we see Ribo-seq coverage displayed as columns, with the three frames color-coded to match the bottom panel. In that panel, the three rectangles represent the reading frames, and vertical lines denote sequence features: white for AUG codons and black for stop codons. Custom motifs can be displayed in purple with the custom_motif argument. The middle panel is a gene model schematic; note that the cds frame is also color-coded to match the reading frame.
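To make the bottom panel concrete, here is a minimal base-R sketch of how start codons, stop codons and a custom motif can be assigned to reading frames. The sequence `dna`, the helper `motif_starts` and the 0-based frame numbering relative to the first displayed base are all assumptions made for illustration; this is not RiboCrypt's internal code.

```r
# Hypothetical toy DNA sequence (not taken from the template experiment)
dna <- "CCATGGCTGAAATAGCTGTGA"

# Hypothetical helper: 1-based start positions of a fixed motif
# (non-overlapping matches, which is sufficient for these 3-mers)
motif_starts <- function(motif, x) {
  hits <- gregexpr(motif, x, fixed = TRUE)[[1]]
  as.integer(hits[hits > 0])
}

starts <- motif_starts("ATG", dna)   # AUG codons, written as ATG in DNA
stops  <- sort(unlist(lapply(c("TAA", "TAG", "TGA"), motif_starts, x = dna)))
custom <- motif_starts("CTG", dna)   # same motif as custom_motif = "CTG"

# Assumed convention: frame 0 begins at the first base of the displayed window
features <- data.frame(
  type     = rep(c("start", "stop", "custom"),
                 times = c(length(starts), length(stops), length(custom))),
  position = c(starts, stops, custom)
)
features$frame <- (features$position - 1) %% 3
features
```

Each row corresponds to one vertical line in the frame panel: `frame` selects the rectangle and `type` the color.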
Now, we can plot just the ribosome footprints (RFPs), with display changed to lines, instead of columns:
multiOmicsPlot_ORFikExp(extendLeaders(extendTrailers(cds[3], 30), 30), annotation = cds, df = df[which(df$libtype == "RFP")[1], ],
frames_type = "lines")
Line display is intuitive, but Ribo-seq coverage tends to be very serrated: the lines overlap and the whole picture becomes blurry. To remedy this, you can use the kmers argument, which applies a sliding window (sum or mean) over each frame. This decreases resolution but separates the three lines more clearly. It is also useful when the view is zoomed out significantly:
multiOmicsPlot_ORFikExp(extendLeaders(extendTrailers(cds[3], 30), 30), annotation = cds, df = df[which(df$libtype == "RFP")[1], ],
frames_type = "lines", kmers = 6)
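The idea of a per-frame sliding window can be sketched in base R. The snippet below is one plausible reading of the smoothing, not RiboCrypt's actual implementation: the `coverage` values are invented, `sliding_window` is a hypothetical helper, and it is an assumption that the window runs over the positions belonging to a single frame (every third nucleotide).

```r
# Invented per-nucleotide coverage with strong periodicity in the first frame
coverage <- c(9, 1, 0,  7, 2, 1,  12, 0, 1,  3, 1, 0,  8, 2, 2,  10, 0, 1)

# Frame of each position relative to the first one: 0, 1, 2, 0, 1, 2, ...
frame <- (seq_along(coverage) - 1) %% 3

# Hypothetical helper: apply FUN (e.g. mean or sum) to every run of k
# consecutive values; the result has length(x) - k + 1 elements
sliding_window <- function(x, k, FUN = mean) {
  stopifnot(length(x) >= k)
  apply(embed(x, k), 1, FUN)
}

# Smooth each frame separately with a window of 3 in-frame positions
per_frame <- split(coverage, frame)
smoothed  <- lapply(per_frame, sliding_window, k = 3, FUN = mean)
smoothed
```

After smoothing, the first frame stays well above the other two at every window, which is the clearer separation of lines described above.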
We can now explore the stacked area reading frame display. It’s especially useful for generating small figures: lines or columns may become indistinguishable when a plot is reduced to very small dimensions for publication, whereas a stacked area should remain clearly visible even when zoomed out significantly. Note the camera icon in the top-right panel. It allows you to download a static image. By default the format is svg, which supports vector graphics and high-quality figure generation.
multiOmicsPlot_ORFikExp(extendLeaders(extendTrailers(cds[3], 30), 30), annotation = cds, df = df[which(df$libtype == "RFP")[1], ],
frames_type = "stacks", kmers = 6)
RiboCrypt offers interactive NGS profile display with several additional visualization methods designed specifically for Ribo-seq. Other utilities include kmers smoothing and static vector graphics export for generating publication-grade figures.