DropletTestFiles 1.6.0
The DropletTestFiles package contains files for testing droplet-based utilities, such as those in the DropletUtils package. These files are the raw output of pipelines such as 10X Genomics’ CellRanger software suite, and are usually not in an (immediately) analysis-ready state. After all, the idea is to provide material for testing the utilities that get the data into such a state!
This package does nothing except pull down and serve up files, so there is not much to discuss here. Two convenience functions help to obtain content from ExperimentHub. The first, listTestFiles(), lists all available resources managed by DropletTestFiles:
library(DropletTestFiles)
out <- listTestFiles()
out
## DataFrame with 52 rows and 18 columns
## title dataprovider species taxonomyid
## <character> <character> <character> <integer>
## EH3685 10X brain nuclei 1k .. 10X Genomics Mus musculus 10090
## EH3686 10X brain nuclei 1k .. 10X Genomics Mus musculus 10090
## EH3687 10X brain nuclei 1k .. 10X Genomics Mus musculus 10090
## EH3688 10X brain nuclei 1k .. 10X Genomics Mus musculus 10090
## EH3689 10X brain nuclei 1k .. 10X Genomics Mus musculus 10090
## ... ... ... ... ...
## EH3732 HiSeq 4000-sequenced.. Jonathan Griffiths Mus musculus 10090
## EH3769 10X PBMC 4k raw coun.. 10X Genomics Homo sapiens 9606
## EH3770 10X PBMC 4k filtered.. 10X Genomics Homo sapiens 9606
## EH3771 10X PBMC 4k raw HDF5.. 10X Genomics Homo sapiens 9606
## EH3772 10X PBMC 4k molecule.. 10X Genomics Homo sapiens 9606
## genome description coordinate_1_based
## <character> <character> <integer>
## EH3685 mm10 Molecule information.. 1
## EH3686 mm10 Filtered HDF5 matrix.. 1
## EH3687 mm10 Raw HDF5 matrix for .. 1
## EH3688 mm10 Filtered count matri.. 1
## EH3689 mm10 Raw count matrix for.. 1
## ... ... ... ...
## EH3732 mm10 Molecule information.. 1
## EH3769 hg38 Raw count matrix for.. 1
## EH3770 hg38 Filtered count matri.. 1
## EH3771 hg38 Raw HDF5 matrix for .. 1
## EH3772 hg38 Molecule information.. 1
## maintainer rdatadateadded preparerclass
## <character> <character> <character>
## EH3685 Aaron Lun <infinite... 2020-08-26 DropletTestFiles
## EH3686 Aaron Lun <infinite... 2020-08-26 DropletTestFiles
## EH3687 Aaron Lun <infinite... 2020-08-26 DropletTestFiles
## EH3688 Aaron Lun <infinite... 2020-08-26 DropletTestFiles
## EH3689 Aaron Lun <infinite... 2020-08-26 DropletTestFiles
## ... ... ... ...
## EH3732 Aaron Lun <infinite... 2020-08-26 DropletTestFiles
## EH3769 Aaron Lun <infinite... 2020-09-08 DropletTestFiles
## EH3770 Aaron Lun <infinite... 2020-09-08 DropletTestFiles
## EH3771 Aaron Lun <infinite... 2020-09-08 DropletTestFiles
## EH3772 Aaron Lun <infinite... 2020-09-08 DropletTestFiles
## tags rdataclass
## <AsIs> <character>
## EH3685 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3686 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3687 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3688 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3689 ExperimentHub,ExperimentData,ExpressionData,... character
## ... ... ...
## EH3732 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3769 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3770 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3771 ExperimentHub,ExperimentData,ExpressionData,... character
## EH3772 ExperimentHub,ExperimentData,ExpressionData,... character
## rdatapath sourceurl sourcetype
## <character> <character> <character>
## EH3685 DropletTestFiles/ten.. https://support.10xg.. HDF5
## EH3686 DropletTestFiles/ten.. https://support.10xg.. HDF5
## EH3687 DropletTestFiles/ten.. https://support.10xg.. HDF5
## EH3688 DropletTestFiles/ten.. https://support.10xg.. tar.gz
## EH3689 DropletTestFiles/ten.. https://support.10xg.. tar.gz
## ... ... ... ...
## EH3732 DropletTestFiles/bac.. https://jmlab-gitlab.. HDF5
## EH3769 DropletTestFiles/ten.. https://support.10xg.. tar.gz
## EH3770 DropletTestFiles/ten.. https://support.10xg.. tar.gz
## EH3771 DropletTestFiles/ten.. https://support.10xg.. HDF5
## EH3772 DropletTestFiles/ten.. https://support.10xg.. HDF5
## file.dataset file.version file.name
## <character> <character> <character>
## EH3685 tenx-2.0.1-nuclei_900 1.0.0 mol_info.h5
## EH3686 tenx-2.0.1-nuclei_900 1.0.0 filtered.h5
## EH3687 tenx-2.0.1-nuclei_900 1.0.0 raw.h5
## EH3688 tenx-2.0.1-nuclei_900 1.0.0 filtered.tar.gz
## EH3689 tenx-2.0.1-nuclei_900 1.0.0 raw.tar.gz
## ... ... ... ...
## EH3732 bach-mammary-swapping 1.0.0 hiseq_4000/mol_info_..
## EH3769 tenx-2.1.0-pbmc4k 1.0.0 raw.tar.gz
## EH3770 tenx-2.1.0-pbmc4k 1.0.0 filtered.tar.gz
## EH3771 tenx-2.1.0-pbmc4k 1.0.0 raw.h5
## EH3772 tenx-2.1.0-pbmc4k 1.0.0 mol_info.h5
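The listing can be narrowed to the files of interest with ordinary bracket subsetting on its columns. The sketch below is illustrative only: `listing` is a hypothetical base `data.frame` standing in for a few rows of the output above (values copied from the printed listing), so that the example runs without a network connection. The real object is a DataFrame, which supports the same style of subsetting.

```r
# Hypothetical stand-in for a few rows of the listTestFiles() output.
# Values are copied from the printed listing; the object name is illustrative.
listing <- data.frame(
    file.dataset = c("tenx-2.0.1-nuclei_900", "tenx-2.0.1-nuclei_900",
                     "tenx-2.1.0-pbmc4k", "tenx-2.1.0-pbmc4k"),
    file.version = "1.0.0",
    file.name = c("filtered.h5", "raw.h5", "raw.h5", "mol_info.h5"),
    row.names = c("EH3686", "EH3687", "EH3771", "EH3772"),
    stringsAsFactors = FALSE
)

# Keep only the files belonging to the PBMC 4k dataset.
pbmc <- listing[listing$file.dataset == "tenx-2.1.0-pbmc4k", ]

# Narrow further to the raw HDF5 matrix.
raw.h5 <- pbmc[pbmc$file.name == "raw.h5", ]
rownames(raw.h5)
```

With the real listing, the same expressions would be applied directly to `out`.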
The second, getTestFile(), obtains a resource. It returns a (read-only!) path to the cached file, to which further operations can be applied.
getTestFile(out$rdatapath[1], prefix=FALSE)
## EH3685
## "/home/biocbuild/.cache/R/ExperimentHub/3deb337e4445d7_3721"
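Because the returned path should be treated as read-only, any operation that changes the file is better applied to a copy. The following is a minimal sketch of that pattern, assuming a temporary file as a stand-in for the cached resource; `cached` and `scratch` are hypothetical names and nothing here touches the ExperimentHub cache.

```r
# Stand-in for the path returned by getTestFile(); in real use this would be
# the cached file, which should not be modified in place.
cached <- tempfile(fileext = ".h5")
writeBin(as.raw(1:10), cached)

# Copy the file to a scratch location before making any changes.
scratch <- file.path(tempdir(), "scratch_copy.h5")
copied <- file.copy(cached, scratch, overwrite = TRUE)

# Subsequent edits would be applied to 'scratch', leaving 'cached' untouched.
copied
```

Operations that only read the file, such as loading it into memory, can use the original path directly.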
Currently, all of the files come from 10X Genomics datasets. As such, we will see a lot of filtered/raw count matrices, molecule information files and HDF5 barcode matrices. We refer readers to the relevant section of the 10X Genomics website (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/overview) for more details.
Here, we obtain a path to a filtered HDF5 matrix and read it in with a DropletUtils function. This produces a SingleCellExperiment object for use in various Bioconductor pipelines.
library(DropletUtils)
path <- getTestFile("tenx-3.1.0-5k_pbmc_protein_v3/1.0.0/filtered.h5", prefix=TRUE)
sce <- read10xCounts(path, type="HDF5")
sce
## class: SingleCellExperiment
## dim: 33570 5247
## metadata(1): Samples
## assays(1): counts
## rownames(33570): ENSG00000243485 ENSG00000237613 ... IgG2a IgG2b
## rowData names(3): ID Symbol Type
## colnames: NULL
## colData names(2): Sample Barcode
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
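The path passed to getTestFile() in this example appears to follow a dataset/version/name layout, matching the file.dataset, file.version and file.name columns of the listing. The sketch below assembles such a path from its parts. It assumes that `prefix=TRUE` simply prepends `DropletTestFiles/`, as the rdatapath column suggests; this is an inference from the output shown, not a statement about the package internals.

```r
# Components taken from the example call above.
dataset <- "tenx-3.1.0-5k_pbmc_protein_v3"
version <- "1.0.0"
name <- "filtered.h5"

# Short form, as would be passed with prefix=TRUE.
short.path <- file.path(dataset, version, name)

# Assumed full form, as would be passed with prefix=FALSE
# (mirrors the 'DropletTestFiles/...' values in the rdatapath column).
full.path <- file.path("DropletTestFiles", short.path)

short.path
full.path
```

Either form could then be supplied to getTestFile() with the matching `prefix` setting.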
sessionInfo()
## R version 4.2.0 RC (2022-04-19 r82224)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.15-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.15-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] DropletUtils_1.16.0 SingleCellExperiment_1.18.0
## [3] SummarizedExperiment_1.26.0 Biobase_2.56.0
## [5] GenomicRanges_1.48.0 GenomeInfoDb_1.32.0
## [7] IRanges_2.30.0 S4Vectors_0.34.0
## [9] BiocGenerics_0.42.0 MatrixGenerics_1.8.0
## [11] matrixStats_0.62.0 DropletTestFiles_1.6.0
## [13] BiocStyle_2.24.0
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-7 bit64_4.0.5
## [3] filelock_1.0.2 httr_1.4.2
## [5] tools_4.2.0 bslib_0.3.1
## [7] utf8_1.2.2 R6_2.5.1
## [9] HDF5Array_1.24.0 DBI_1.1.2
## [11] rhdf5filters_1.8.0 withr_2.5.0
## [13] tidyselect_1.1.2 bit_4.0.4
## [15] curl_4.3.2 compiler_4.2.0
## [17] cli_3.3.0 DelayedArray_0.22.0
## [19] bookdown_0.26 sass_0.4.1
## [21] rappdirs_0.3.3 stringr_1.4.0
## [23] digest_0.6.29 rmarkdown_2.14
## [25] R.utils_2.11.0 XVector_0.36.0
## [27] pkgconfig_2.0.3 htmltools_0.5.2
## [29] sparseMatrixStats_1.8.0 limma_3.52.0
## [31] dbplyr_2.1.1 fastmap_1.1.0
## [33] rlang_1.0.2 RSQLite_2.2.12
## [35] shiny_1.7.1 DelayedMatrixStats_1.18.0
## [37] jquerylib_0.1.4 generics_0.1.2
## [39] jsonlite_1.8.0 BiocParallel_1.30.0
## [41] dplyr_1.0.8 R.oo_1.24.0
## [43] RCurl_1.98-1.6 magrittr_2.0.3
## [45] scuttle_1.6.0 GenomeInfoDbData_1.2.8
## [47] Matrix_1.4-1 Rcpp_1.0.8.3
## [49] Rhdf5lib_1.18.0 fansi_1.0.3
## [51] lifecycle_1.0.1 R.methodsS3_1.8.1
## [53] edgeR_3.38.0 stringi_1.7.6
## [55] yaml_2.3.5 zlibbioc_1.42.0
## [57] rhdf5_2.40.0 BiocFileCache_2.4.0
## [59] AnnotationHub_3.4.0 grid_4.2.0
## [61] blob_1.2.3 dqrng_0.3.0
## [63] parallel_4.2.0 promises_1.2.0.1
## [65] ExperimentHub_2.4.0 crayon_1.5.1
## [67] lattice_0.20-45 beachmat_2.12.0
## [69] Biostrings_2.64.0 KEGGREST_1.36.0
## [71] locfit_1.5-9.5 knitr_1.39
## [73] pillar_1.7.0 glue_1.6.2
## [75] BiocVersion_3.15.2 evaluate_0.15
## [77] BiocManager_1.30.17 png_0.1-7
## [79] vctrs_0.4.1 httpuv_1.6.5
## [81] purrr_0.3.4 assertthat_0.2.1
## [83] cachem_1.0.6 xfun_0.30
## [85] mime_0.12 xtable_1.8-4
## [87] later_1.3.0 tibble_3.1.6
## [89] AnnotationDbi_1.58.0 memoise_2.0.1
## [91] ellipsis_0.3.2 interactiveDisplayBase_1.34.0