The EMBL-EBI curation of the GWAS Catalog (which originated at NHGRI) labels GWAS hit records with terms from the EBI Experimental Factor Ontology (EFO). The Bioconductor gwascat package includes a graph representation of the ontology and records the EFO assignments of GWAS results in its basic representations of the catalog.
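Term names are regimented: each node of the graph is named by an ontology prefix followed by a numeric identifier, as the first few node names show.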
## A graphNEL graph with directed edges
## Number of Nodes = 16331
## Number of Edges = 22186
## [1] "EFO:0000001" "BFO:0000007" "BFO:0000016" "BFO:0000019" "BFO:0000020"
## [6] "BFO:0000023"
The nodeData of the graph includes a def field. We will process that to create a data.frame.
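# Per-node attribute lists from the ontology graph
nd = nodeData(efo.obo.g)
# Definitions and names; terms without a definition yield NULL
alldef = sapply(nd, function(x) unlist(x[["def"]]))
allnames = sapply(nd, function(x) unlist(x[["name"]]))
# Keep the first definition per term, substituting " " when none exists
alld2 = sapply(alldef, function(x) if(is.null(x)) return(" ") else x[1])
mydf = data.frame(id = names(allnames), concept=as.character(allnames), def=unlist(alld2))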
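To see how this processing treats terms that lack a definition, here is a minimal sketch on a hand-made list that imitates the shape of the nodeData output. The list `toy_nd`, its identifiers, and the helper `first_def` are hypothetical stand-ins for illustration, not real EFO records or gwascat functions.

```r
# Hypothetical stand-in for nodeData(efo.obo.g): each node carries a list of
# attributes, and 'def' may be NULL when the ontology gives no definition.
toy_nd <- list(
  "EFO:A" = list(name = list("term a"), def = list("\"definition of a\" []")),
  "EFO:B" = list(name = list("term b"), def = NULL)
)

# Take the first definition string per node, or a single space if none exists.
first_def <- function(x) {
  d <- unlist(x[["def"]])
  if (is.null(d)) " " else d[1]
}

toy_df <- data.frame(
  id      = names(toy_nd),
  concept = vapply(toy_nd, function(x) unlist(x[["name"]])[1], character(1)),
  def     = vapply(toy_nd, first_def, character(1)),
  stringsAsFactors = FALSE
)
toy_df
```

Using `vapply` with `character(1)` in the sketch makes the one-string-per-term assumption explicit, so a term with an unexpected shape fails early instead of silently producing a list column.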
We can create an interactive data table for all terms, but for performance we limit the table to terms whose definitions contain the string ‘autoimm’.
limdf = mydf[ grep("autoimm", mydf$def, ignore.case=TRUE), ]
library(DT)
suppressWarnings({
datatable(limdf, rownames=FALSE, options=list(pageLength=5))
})
## Warning in instance$preRenderHook(instance): It seems your data is too big
## for client-side DataTables. You may consider server-side processing: https://
## rstudio.github.io/DT/server.html
The use of the graph representation allows various approaches to traversal and selection. Here we examine metadata for a term of interest, transform to an undirected graph, and obtain the adjacency list for that term.
## $`EFO:0000540`
## $`EFO:0000540`$name
## $`EFO:0000540`$name[[1]]
## [1] "immune system disease"
##
##
## $`EFO:0000540`$def
## $`EFO:0000540`$def[[1]]
## [1] "\"A group of non-neoplastic and neoplastic disorders resulting from the deregulation and/or deficiency of immune system functions. It includes autoimmune disorders (e.g., lupus erythematosus, dermatomyositis, rheumatoid arthritis), congenital and acquired immunodeficiency syndromes including the acquired immune deficiency syndrome (AIDS), and neoplasms (e.g., lymphomas and malignancies secondary to transplantation.)\" []"
##
##
## $`EFO:0000540`$xref
## NULL
ue = ugraph(efo.obo.g)
neighISD = adj(ue, "EFO:0000540")[[1]]
sapply(nodeData(subGraph(neighISD, efo.obo.g)), "[[", "name")
## $`EFO:0000408`
## [1] "disease"
##
## $`EFO:0000398`
## [1] "dermatomyositis"
##
## $`EFO:0000404`
## [1] "diffuse scleroderma"
##
## $`EFO:0000676`
## [1] "psoriasis"
##
## $`EFO:0000706`
## [1] "spondyloarthropathy"
##
## $`EFO:0000717`
## [1] "systemic scleroderma"
##
## $`EFO:0000783`
## [1] "myositis"
##
## $`EFO:0002498`
## [1] "aggressive insulitis"
##
## $`EFO:0002502`
## [1] "benign insulitis"
##
## $`EFO:0003775`
## [1] "Job's syndrome"
##
## $`EFO:0003778`
## [1] "psoriatic arthritis"
##
## $`EFO:0003785`
## [1] "allergy"
##
## $`EFO:0004246`
## [1] "mucocutaneous lymph node syndrome"
##
## $`EFO:0004599`
## [1] "acute graft vs. host disease"
##
## $`EFO:0004711`
## [1] "elephantiasis"
##
## $`EFO:0005140`
## [1] "autoimmune disease"
##
## $`EFO:0005555`
## [1] "gamma chain deficiency"
##
## $`EFO:0005565`
## [1] "janus kinase-3 deficiency"
##
## $`EFO:0005809`
## [1] "type II hypersensitivity reaction disease"
##
## $`Orphanet:179`
## [1] "Birdshot chorioretinopathy"
##
## $`Orphanet:183770`
## [1] "Rare genetic immune disease"
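The reason for converting to an undirected graph before calling adj can be illustrated on a small hand-built graph. This is a sketch under the assumption that edges point from child to parent; the node names are hypothetical and only the graph package is needed.

```r
library(graph)  # Bioconductor package providing graphNEL, addEdge, ugraph, adj

# Hypothetical hierarchy; edges point from child term to parent term.
toy <- graphNEL(nodes = c("child1", "child2", "parent", "root"),
                edgemode = "directed")
toy <- addEdge("child1", "parent", toy)
toy <- addEdge("child2", "parent", toy)
toy <- addEdge("parent", "root", toy)

# In the directed graph, adj() reports only the targets of outgoing edges.
directed_neighbors <- adj(toy, "parent")[[1]]
directed_neighbors

# After ugraph(), direction is dropped, so parents and children both appear.
toy_u <- ugraph(toy)
undirected_neighbors <- sort(adj(toy_u, "parent")[[1]])
undirected_neighbors
```

On the toy graph the directed call sees only the parent of the chosen node, while the undirected call also returns its children, which matches the mix of broader and narrower terms in the neighbor list for "immune system disease".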
With RBGL we can compute the shortest path from a term to the root of the ontology.
## Loading required package: graph
##
## Attaching package: 'graph'
## The following object is masked from 'package:Biostrings':
##
## complement
p = sp.between( efo.obo.g, "EFO:0000685", "EFO:0000001")
sapply(nodeData(subGraph(p[[1]]$path_detail, efo.obo.g)), "[[", "name")
## $`EFO:0000685`
## [1] "rheumatoid arthritis"
##
## $`EFO:0005856`
## [1] "arthritis"
##
## $`EFO:0005140`
## [1] "autoimmune disease"
##
## $`EFO:0000540`
## [1] "immune system disease"
##
## $`EFO:0000408`
## [1] "disease"
##
## $`BFO:0000016`
## [1] "disposition"
##
## $`BFO:0000020`
## [1] "material property"
##
## $`EFO:0000001`
## [1] "experimental factor "
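The structure returned by sp.between is easier to inspect on a small example. The following sketch builds a hypothetical four-node directed graph (node names are invented) and extracts the same `path_detail` element used above; edge weights are left at their defaults.

```r
library(graph)
library(RBGL)   # provides sp.between

# Hypothetical graph: leaf -> mid -> top, plus an unrelated branch other -> top.
toy <- graphNEL(nodes = c("leaf", "mid", "top", "other"),
                edgemode = "directed")
toy <- addEdge("leaf", "mid", toy)
toy <- addEdge("mid", "top", toy)
toy <- addEdge("other", "top", toy)

# One list element is returned per start/finish pair.
p <- sp.between(toy, "leaf", "top")

# Ordered node names along the shortest path.
toy_path <- p[[1]]$path_detail
toy_path
```

Because the path follows edge direction, the start node is the specific term and the finish node is the root, as in the call on the EFO graph.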
The mcols element of the GRanges instances provided by gwascat includes mapped EFO terms and EFO URIs.
## [1] "DATE ADDED TO CATALOG" "PUBMEDID"
## [3] "FIRST AUTHOR" "DATE"
## [5] "JOURNAL" "LINK"
## [7] "STUDY" "DISEASE/TRAIT"
## [9] "INITIAL SAMPLE DESCRIPTION" "REPLICATION SAMPLE DESCRIPTION"
## [11] "REGION" "CHR_ID"
## [13] "CHR_POS" "REPORTED GENE(S)"
## [15] "MAPPED_GENE" "UPSTREAM_GENE_ID"
## [17] "DOWNSTREAM_GENE_ID" "SNP_GENE_IDS"
## [19] "UPSTREAM_GENE_DISTANCE" "DOWNSTREAM_GENE_DISTANCE"
## [21] "STRONGEST SNP-RISK ALLELE" "SNPS"
## [23] "MERGED" "SNP_ID_CURRENT"
## [25] "CONTEXT" "INTERGENIC"
## [27] "RISK ALLELE FREQUENCY" "P-VALUE"
## [29] "PVALUE_MLOG" "P-VALUE (TEXT)"
## [31] "OR or BETA" "95% CI (TEXT)"
## [33] "PLATFORM [SNPS PASSING QC]" "CNV"
## [35] "MAPPED_TRAIT" "MAPPED_TRAIT_URI"
## gwasloc instance with 6 records and 36 attributes per record.
## Extracted: 2016-01-18
## Genome: GRCh38
## Excerpt:
## GRanges object with 5 ranges and 3 metadata columns:
## seqnames ranges strand |
## <Rle> <IRanges> <Rle> |
## [1] 6 46677138 * |
## [2] 1 107800465 * |
## [3] 4 11038666 * |
## [4] 6 32638107 * |
## [5] 12 111446804 * |
## DISEASE/TRAIT SNPS P-VALUE
## <character> <character> <numeric>
## [1] Hashimoto thyroiditis versus Graves' disease rs2270450 1e-06
## [2] Hashimoto thyroiditis versus Graves' disease rs7537605 4e-08
## [3] Hashimoto thyroiditis versus Graves' disease rs2904297 7e-06
## [4] Autoimmune hepatitis type-1 rs2187668 2e-78
## [5] Autoimmune hepatitis type-1 rs3184504 8e-08
## -------
## seqinfo: 23 sequences from GRCh38 genome
## DataFrame with 6 rows and 2 columns
## MAPPED_TRAIT
## <character>
## 1 Graves disease, Hashimoto's thyroiditis, autoimmune thyroid disease
## 2 Graves disease, Hashimoto's thyroiditis, autoimmune thyroid disease
## 3 Graves disease, Hashimoto's thyroiditis, autoimmune thyroid disease
## 4 autoimmune hepatits type 1
## 5 autoimmune hepatits type 1
## 6 autoimmune hepatits type 1
## MAPPED_TRAIT_URI
## <character>
## 1 http://www.ebi.ac.uk/efo/EFO_0004237, http://www.ebi.ac.uk/efo/EFO_0003779, http://www.ebi.ac.uk/efo/EFO_0006812
## 2 http://www.ebi.ac.uk/efo/EFO_0004237, http://www.ebi.ac.uk/efo/EFO_0003779, http://www.ebi.ac.uk/efo/EFO_0006812
## 3 http://www.ebi.ac.uk/efo/EFO_0004237, http://www.ebi.ac.uk/efo/EFO_0003779, http://www.ebi.ac.uk/efo/EFO_0006812
## 4 http://www.ebi.ac.uk/efo/EFO_0005676
## 5 http://www.ebi.ac.uk/efo/EFO_0005676
## 6 http://www.ebi.ac.uk/efo/EFO_0005676
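The URIs in MAPPED_TRAIT_URI use an underscore where the graph node names use a colon, so a small conversion is needed before looking a mapped trait up in the ontology graph. The helper `uri_to_id` below is a hypothetical sketch in base R, not a gwascat function, and it assumes that the last path component of each URI corresponds to a node name; whether a given identifier is present in a particular version of the graph should be checked separately.

```r
# One MAPPED_TRAIT_URI value copied from the output above.
uri_field <- paste(
  "http://www.ebi.ac.uk/efo/EFO_0004237",
  "http://www.ebi.ac.uk/efo/EFO_0003779",
  "http://www.ebi.ac.uk/efo/EFO_0006812",
  sep = ", "
)

# Hypothetical helper: split the comma-separated field, keep the last path
# component of each URI, and replace the first underscore with a colon.
uri_to_id <- function(field) {
  uris <- trimws(strsplit(field, ",", fixed = TRUE)[[1]])
  sub("_", ":", basename(uris), fixed = TRUE)
}

ids <- uri_to_id(uri_field)
ids
```

The resulting strings have the same prefix-colon-number form as the node names of the ontology graph, so they could be tested for membership in the node set before being passed to calls such as nodeData or adj.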