SpatialDE by Svensson et al., 2018, is a method to identify spatially variable genes (SVGs) in spatially resolved transcriptomics data.
You can install spatialDE from Bioconductor with the following code:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("spatialDE")
Reproducing the example analysis from the original SpatialDE Python package.
library(spatialDE)
library(ggplot2)
Files originally retrieved from SpatialDE GitHub repository from the following links: https://github.com/Teichlab/SpatialDE/blob/master/Analysis/MouseOB/data/Rep11_MOB_0.csv https://github.com/Teichlab/SpatialDE/blob/master/Analysis/MouseOB/MOB_sample_info.csv
# Expression file used in python SpatialDE.
data("Rep11_MOB_0")
# Sample Info file used in python SpatialDE
data("MOB_sample_info")
The Rep11_MOB_0
object contains spatial expression data for
16218 genes on 262 spots, with genes as rows
and spots as columns.
Rep11_MOB_0[1:5, 1:5]
#> 16.92x9.015 16.945x11.075 16.97x10.118 16.939x12.132 16.949x13.055
#> Nrf1 1 0 0 1 0
#> Zbtb5 1 0 1 0 0
#> Ccnl1 1 3 1 1 0
#> Lrrfip1 2 2 0 0 3
#> Bbs1 1 2 0 4 0
dim(Rep11_MOB_0)
#> [1] 16218 262
The MOB_sample_info
object contains a data.frame
with coordinates for each
spot.
head(MOB_sample_info)
Rep11_MOB_0 <- Rep11_MOB_0[rowSums(Rep11_MOB_0) >= 3, ]
Rep11_MOB_0 <- Rep11_MOB_0[, row.names(MOB_sample_info)]
MOB_sample_info$total_counts <- colSums(Rep11_MOB_0)
head(MOB_sample_info)
MOB_sample_info
X <- MOB_sample_info[, c("x", "y")]
head(X)
stabilize
The SpatialDE method assumes normally distributed data, so we stabilize the variance of the negative binomial distributed counts data using Anscombe’s approximation.
The stabilize()
function takes as input a data.frame
of expression values with samples in columns and genes in rows. Thus, in this case, we have to transpose the data.
norm_expr <- stabilize(Rep11_MOB_0)
norm_expr[1:5, 1:5]
#> 16.92x9.015 16.945x11.075 16.97x10.118 16.939x12.132 16.949x13.055
#> Nrf1 1.227749 0.8810934 0.8810934 1.2277491 0.8810934
#> Zbtb5 1.227749 0.8810934 1.2277491 0.8810934 0.8810934
#> Ccnl1 1.227749 1.6889027 1.2277491 1.2277491 0.8810934
#> Lrrfip1 1.484676 1.4846765 0.8810934 0.8810934 1.6889027
#> Bbs1 1.227749 1.4846765 0.8810934 1.8584110 0.8810934
regress_out
Next, we account for differences in library size between the samples by regressing out the effect of the total counts for each gene using linear regression.
resid_expr <- regress_out(norm_expr, sample_info = MOB_sample_info)
resid_expr[1:5, 1:5]
#> 16.92x9.015 16.945x11.075 16.97x10.118 16.939x12.132 16.949x13.055
#> Nrf1 -0.75226761 -1.2352000 -1.0164479 -0.7903289 -1.0973214
#> Zbtb5 0.09242373 -0.3323719 0.1397144 -0.2760560 -0.2533134
#> Ccnl1 -2.77597164 -2.5903783 -2.6092013 -2.8529340 -3.1193883
#> Lrrfip1 -1.92331333 -2.1578718 -2.3849405 -2.5924072 -1.7163300
#> Bbs1 -1.12186064 -1.0266476 -1.3706460 -0.5363646 -1.4666155
run
To reduce running time, the SpatialDE test is run on a subset of 1000 genes. Running it on the complete data set takes about 10 minutes.
# For this example, run spatialDE on the first 1000 genes
sample_resid_expr <- head(resid_expr, 1000)
results <- spatialDE::run(sample_resid_expr, coordinates = X)
head(results[order(results$qval), ])
model_search
Finally, we can classify the DE genes to interpetable DE classes using the model_search
function.
We apply the model search on filtered DE results, using a threshold of 0.05 for the Q-values.
de_results <- results[results$qval < 0.05, ]
ms_results <- model_search(
sample_resid_expr,
coordinates = X,
de_results = de_results
)
# To show ms_results sorted on qvalue, uncomment the following line
# head(ms_results[order(ms_results$qval), ])
head(ms_results)
spatial_patterns
Furthermore, we can group spatially variable genes (SVGs) into spatial patterns using automatic expression histology (AEH).
sp <- spatial_patterns(
sample_resid_expr,
coordinates = X,
de_results = de_results,
n_patterns = 4L, length = 1.5
)
sp$pattern_results
Visualizing one of the most significant genes.
gene <- "Pcp4"
ggplot(data = MOB_sample_info, aes(x = x, y = y, color = norm_expr[gene, ])) +
geom_point(size = 7) +
ggtitle(gene) +
scale_color_viridis_c() +
labs(color = gene)
As an alternative to the previous figure, we can plot multiple genes using the normalized expression values.
ordered_de_results <- de_results[order(de_results$qval), ]
multiGenePlots(norm_expr,
coordinates = X,
ordered_de_results[1:6, ]$g,
point_size = 4,
viridis_option = "D",
dark_theme = FALSE
)
FSV_sig(results, ms_results)
#> Warning: ggrepel: 8 unlabeled data points (too many overlaps). Consider
#> increasing max.overlaps
The SpatialDE workflow can also be executed with a SpatialExperiment object as input.
library(SpatialExperiment)
# For SpatialExperiment object, we neeed to transpose the counts matrix in order
# to have genes on rows and spot on columns.
# For this example, run spatialDE on the first 1000 genes
partial_counts <- head(Rep11_MOB_0, 1000)
spe <- SpatialExperiment(
assays = list(counts = partial_counts),
spatialData = DataFrame(MOB_sample_info[, c("x", "y")]),
spatialCoordsNames = c("x", "y")
)
out <- spatialDE(spe, assay_type = "counts", verbose = FALSE)
head(out[order(out$qval), ])
We can plot spatial patterns of multiples genes using the spe
object.
spe_results <- out[out$qval < 0.05, ]
ordered_spe_results <- spe_results[order(spe_results$qval), ]
multiGenePlots(spe,
assay_type = "counts",
ordered_spe_results[1:6, ]$g,
point_size = 4,
viridis_option = "D",
dark_theme = FALSE
)
model_search
and spatial_patterns
msearch <- modelSearch(spe,
de_results = out, qval_thresh = 0.05,
verbose = FALSE
)
head(msearch)
spatterns <- spatialPatterns(spe,
de_results = de_results, qval_thresh = 0.05,
n_patterns = 4L, length = 1.5, verbose = FALSE
)
spatterns$pattern_results
sessionInfo
Session info
#> [1] "2021-10-26 18:55:42 EDT"
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.3 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
#> LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] SpatialExperiment_1.4.0 SingleCellExperiment_1.16.0
#> [3] SummarizedExperiment_1.24.0 Biobase_2.54.0
#> [5] GenomicRanges_1.46.0 GenomeInfoDb_1.30.0
#> [7] IRanges_2.28.0 S4Vectors_0.32.0
#> [9] BiocGenerics_0.40.0 MatrixGenerics_1.6.0
#> [11] matrixStats_0.61.0 ggplot2_3.3.5
#> [13] spatialDE_1.0.0 BiocStyle_2.22.0
#>
#> loaded via a namespace (and not attached):
#> [1] bitops_1.0-7 filelock_1.0.2
#> [3] rprojroot_2.0.2 backports_1.2.1
#> [5] tools_4.1.1 bslib_0.3.1
#> [7] utf8_1.2.2 R6_2.5.1
#> [9] HDF5Array_1.22.0 DBI_1.1.1
#> [11] colorspace_2.0-2 rhdf5filters_1.6.0
#> [13] withr_2.4.2 gridExtra_2.3
#> [15] tidyselect_1.1.1 compiler_4.1.1
#> [17] basilisk.utils_1.6.0 DelayedArray_0.20.0
#> [19] labeling_0.4.2 bookdown_0.24
#> [21] sass_0.4.0 checkmate_2.0.0
#> [23] scales_1.1.1 stringr_1.4.0
#> [25] digest_0.6.28 rmarkdown_2.11
#> [27] R.utils_2.11.0 basilisk_1.6.0
#> [29] XVector_0.34.0 pkgconfig_2.0.3
#> [31] htmltools_0.5.2 sparseMatrixStats_1.6.0
#> [33] highr_0.9 fastmap_1.1.0
#> [35] limma_3.50.0 rlang_0.4.12
#> [37] DelayedMatrixStats_1.16.0 farver_2.1.0
#> [39] jquerylib_0.1.4 generics_0.1.1
#> [41] jsonlite_1.7.2 BiocParallel_1.28.0
#> [43] dplyr_1.0.7 R.oo_1.24.0
#> [45] RCurl_1.98-1.5 magrittr_2.0.1
#> [47] GenomeInfoDbData_1.2.7 scuttle_1.4.0
#> [49] Matrix_1.3-4 Rcpp_1.0.7
#> [51] munsell_0.5.0 Rhdf5lib_1.16.0
#> [53] fansi_0.5.0 reticulate_1.22
#> [55] lifecycle_1.0.1 R.methodsS3_1.8.1
#> [57] stringi_1.7.5 yaml_2.2.1
#> [59] edgeR_3.36.0 zlibbioc_1.40.0
#> [61] rhdf5_2.38.0 grid_4.1.1
#> [63] ggrepel_0.9.1 parallel_4.1.1
#> [65] dqrng_0.3.0 crayon_1.4.1
#> [67] dir.expiry_1.2.0 lattice_0.20-45
#> [69] beachmat_2.10.0 locfit_1.5-9.4
#> [71] magick_2.7.3 knitr_1.36
#> [73] pillar_1.6.4 rjson_0.2.20
#> [75] glue_1.4.2 evaluate_0.14
#> [77] BiocManager_1.30.16 png_0.1-7
#> [79] vctrs_0.3.8 gtable_0.3.0
#> [81] purrr_0.3.4 assertthat_0.2.1
#> [83] xfun_0.27 DropletUtils_1.14.0
#> [85] viridisLite_0.4.0 tibble_3.1.5
#> [87] ellipsis_0.3.2 here_1.0.1