get_ontologies() queries the EMBL-EBI Ontology Lookup Service (OLS)
for information on all available ontologies. get_ontology() retrieves
information on a single ontology. get_roots() and get_terms() return
all 'roots' and terms in an ontology. get_term() returns a tibble with
detailed information about a single term.
get_parents(), get_ancestors(), get_children(), and get_descendants()
retrieve parents, ancestors, children, and descendants of a single term.
Usage
get_ontologies()
get_ontology(ontology)
get_roots(ontology)
get_terms(ontology, all_ontologies = FALSE)
get_term(ontology, id, form = c("id", "obo_id", "short_form", "iri"))
get_parents(ontology, id, all_ontologies = FALSE)
get_ancestors(ontology, id, all_ontologies = FALSE)
get_children(ontology, id, all_ontologies = FALSE)
get_descendants(ontology, id, all_ontologies = FALSE)
Arguments
- ontology
character(1) id (from get_ontologies()) of the ontology of interest.
- all_ontologies
logical(1) when FALSE (default), only terms, parents, etc., defined in
ontology are returned. When TRUE, terms from all ontologies that are
associated with terms in ontology are returned.
- id
character(1) the term identifier, usually 'obo_id' (e.g., "CL:0002494"),
but for get_term() as specified by form.
- form
character(1) the form of the identifier, as described in Details.
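The id and form arguments of get_term() work together: form names the kind of identifier supplied as id. As a minimal sketch of how the forms typically relate, the base R helpers below convert an obo_id to a short_form and then to an OBO-style iri. The helper names are hypothetical and not part of the package, and the "http://purl.obolibrary.org/obo/" prefix is an assumption that holds for many, but not necessarily all, ontologies.

```r
# Hypothetical helpers, not part of the package: they illustrate the
# typical relationship between identifier forms.
obo_id_to_short_form <- function(obo_id) {
    # replace the ":" separator with "_"
    sub(":", "_", obo_id, fixed = TRUE)
}

short_form_to_obo_iri <- function(short_form) {
    # assumption: the ontology uses the common OBO PURL prefix
    paste0("http://purl.obolibrary.org/obo/", short_form)
}

obo_id <- "CL:0002494"
short_form <- obo_id_to_short_form(obo_id)
short_form
#> [1] "CL_0002494"
short_form_to_obo_iri(short_form)
#> [1] "http://purl.obolibrary.org/obo/CL_0002494"
```

Following the Usage signature, such a value would be passed together with the matching form, e.g., get_term("cl", "CL_0002494", form = "short_form").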
Value
The get_*() functions return tibbles summarizing information
retrieved from the OLS. The meaning of individual columns is as in
the service; columns are not renamed, but have been re-ordered to
prioritize useful information.
Details
The functions documented on this page provide a programmatic interface to the EMBL-EBI Ontology Lookup Service at https://www.ebi.ac.uk/ols4/. The API is described at https://www.ebi.ac.uk/ols4/help.
The functions use an on-disk cache of results retrieved from the internet to speed interactive analysis. Generally, initial queries are 'slow', but subsequent identical queries are very fast. Details of the cache, some edge cases where the cache can get in the way of current results, and strategies for cache management are summarized in the 'Cache Management' section of the vignette.
In an attempt to simplify navigation of results, values returned
from API calls are presented as a tibble with invariant columns
removed.
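To make 'invariant columns' concrete, the base R sketch below drops every column whose value is identical in all rows. The helper drop_invariant_columns() and the toy data are hypothetical; this is one way to express the idea, not the package's implementation, and the rule for results with fewer than two rows is an assumption made here.

```r
# Hypothetical helper, not part of the package: keep only columns that
# take more than one distinct value across rows.
drop_invariant_columns <- function(df) {
    # assumption: with fewer than two rows nothing can be judged invariant
    if (nrow(df) < 2L) {
        return(df)
    }
    varies <- vapply(
        df,
        function(column) length(unique(column)) > 1L,
        logical(1)
    )
    df[, varies, drop = FALSE]
}

# Toy data standing in for an API response; 'lang' is "en" in every row.
toy <- data.frame(
    obo_id = c("CL:0000115", "CL:0000066"),
    lang = c("en", "en"),
    has_children = c(TRUE, FALSE)
)

# 'lang' is dropped; 'obo_id' and 'has_children' remain
names(drop_invariant_columns(toy))
```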
Data returned by the OLS is often hierarchical, resulting in list
columns in the tibble. In some cases (e.g., get_ontologies()) the
list columns have been un-nested (using tidyr::unnest_wider()) to
provide users with relevant information. Downstream processing
steps may also find it beneficial to understand 'tidy' approaches
to working with hierarchical data in tibbles, as outlined in
chapter 23 of 'R For Data Science' (2e),
https://r4ds.hadley.nz/rectangling. This is illustrated in
the "Hierarchical Data" section of the vignette.
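As a small, self-contained illustration of the 'tidy' approach to list columns, the sketch below applies tidyr::unnest_longer() and tidyr::unnest_wider() to toy tibbles. The identifiers, synonyms, and titles are made up for the example and only resemble the shape of OLS results; the tibble and tidyr packages are assumed to be installed.

```r
library(tibble)
library(tidyr)

# Made-up data with a list column of character vectors, similar in shape
# to the 'synonyms' column returned for terms.
toy_terms <- tibble(
    obo_id = c("X:0000001", "X:0000002"),
    synonyms = list(c("first synonym", "second synonym"), "only synonym")
)

# unnest_longer() gives each list element its own row (3 rows here).
toy_terms |>
    unnest_longer(synonyms)

# Made-up data with a list column of named lists.
toy_ontologies <- tibble(
    id = c("a", "b"),
    config = list(
        list(title = "Ontology A", version = "1.0"),
        list(title = "Ontology B", version = "2.0")
    )
)

# unnest_wider() turns each name into its own column (title, version).
toy_ontologies |>
    unnest_wider(config)
```

Which verb is appropriate depends on the column: unnest_longer() suits unnamed, variable-length entries, while unnest_wider() suits entries that share a set of names.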
For get_term(), the identifier id can be one of three forms. The id
and obo_id forms are synonyms and follow the pattern ontology
abbreviation, ":", and term id, e.g., "CL:0002494". The short_form is
typically like obo_id but with ":" replaced by "_". An iri is the PURL
(persistent URL) of the term, typically
"http://purl.obolibrary.org/...".
When no relatives are found, get_parents() etc. return a tibble
with 0 rows (and sometimes 0 columns).
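Because an empty result may lack columns entirely, downstream code that reads a column such as obo_id may want to guard against that case. The sketch below is one defensive pattern; relative_ids() is a hypothetical helper, and plain data frames stand in for the tibbles that get_parents() and related functions would return.

```r
# Hypothetical helper, not part of the package: extract 'obo_id' from a
# result, returning an empty character vector when there are no rows or
# the column is absent.
relative_ids <- function(relatives) {
    if (nrow(relatives) == 0L || !"obo_id" %in% names(relatives)) {
        return(character(0))
    }
    relatives$obo_id
}

# Stand-ins for results; data.frame() mimics the 0 x 0 case.
empty <- data.frame()
found <- data.frame(obo_id = c("CL:0010008", "CL:0000115"))

relative_ids(empty)
#> character(0)
relative_ids(found)
#> [1] "CL:0010008" "CL:0000115"
```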
Examples
onto <- get_ontologies()
onto
#> # A tibble: 250 × 31
#> id title description version numberOfTerms numberOfProperties
#> <chr> <chr> <chr> <chr> <int> <int>
#> 1 ado Alzheimer's D… Alzheimer'… 2.0.1 1963 186
#> 2 agro Agronomy Onto… AgrO is an… NA 4162 293
#> 3 aism Ontology for … The ontolo… 2023-0… 7443 547
#> 4 amphx Amphioxus Dev… An ontolog… NA 403 10
#> 5 apo Ascomycete Ph… A structur… 2023-1… 619 27
#> 6 apollo_sv Apollo Struct… An OWL2 on… 2023-0… 1691 366
#> 7 aro Antibiotic Re… Antibiotic… NA 7105 25
#> 8 bco Biological Co… An ontolog… 2021-1… 253 472
#> 9 bfo Basic Formal … The upper … NA 35 22
#> 10 bspo Biological Sp… An ontolog… 2023-0… 169 236
#> # ℹ 240 more rows
#> # ℹ 25 more variables: numberOfIndividuals <int>, languages <list>,
#> # loaded <chr>, updated <chr>, versionIri <chr>, namespace <chr>,
#> # preferredPrefix <chr>, homepage <chr>, mailingList <chr>, tracker <chr>,
#> # logo <lgl>, creators <lgl>, annotations <lgl>, fileLocation <chr>,
#> # oboSlims <lgl>, labelProperty <chr>, definitionProperties <list>,
#> # synonymProperties <list>, hierarchicalProperties <list>, baseUris <list>, …
get_ontology("cl") |>
    glimpse()
#> Rows: 1
#> Columns: 16
#> $ languages <list> "en"
#> $ lang <chr> "en"
#> $ ontologyId <chr> "cl"
#> $ loaded <chr> "2023-11-30T14:00:19.698341744"
#> $ updated <chr> "2023-11-30T14:00:19.698341744"
#> $ status <chr> "LOADED"
#> $ message <chr> ""
#> $ version <chr> "2023-10-19"
#> $ fileHash <lgl> NA
#> $ loadAttempts <int> 0
#> $ numberOfTerms <int> 16147
#> $ numberOfProperties <int> 531
#> $ numberOfIndividuals <int> 18
#> $ config <list> "cl"
#> $ baseUris <list> <NULL>
#> $ `_links` <list> ["https://www.ebi.ac.uk/ols4/api/ontologies/cl?la…
get_roots("cl")
#> # A tibble: 130 × 13
#> obo_id label description iri synonyms annotation has_children short_form
#> <chr> <chr> <chr> <chr> <list> <list> <lgl> <chr>
#> 1 GO:001… regu… Any proces… http… <NULL> <named list> TRUE GO_0010817
#> 2 GO:005… regu… Any proces… http… <NULL> <named list> TRUE GO_0050803
#> 3 GO:005… regu… Any proces… http… <NULL> <named list> TRUE GO_0050878
#> 4 GO:009… cell… Any proces… http… <NULL> <named list> TRUE GO_0097237
#> 5 NCBITa… root NA http… <list> <named list> TRUE NCBITaxon…
#> 6 UBERON… proc… An occurre… http… <NULL> <named list> TRUE UBERON_00…
#> 7 UBERON… anat… Biological… http… <NULL> <named list> TRUE UBERON_00…
#> 8 BFO:00… cont… An entity … http… <NULL> <NULL> TRUE BFO_00000…
#> 9 BFO:00… occu… An entity … http… <NULL> <NULL> TRUE BFO_00000…
#> 10 CARO:0… CARO… NA http… <NULL> <NULL> TRUE CARO_0000…
#> # ℹ 120 more rows
#> # ℹ 5 more variables: in_subset <list>, obo_definition_citation <list>,
#> # obo_xref <list>, obo_synonym <list>, `_links` <list>
terms <- get_terms("cl")
#> Querying OLS ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 100% | ETA: 0s
terms
#> # A tibble: 2,734 × 13
#> label obo_id description iri synonyms annotation has_children short_form
#> <chr> <chr> <chr> <chr> <list> <list> <lgl> <chr>
#> 1 adren… CL:00… NA http… <NULL> <named list> FALSE CL_0000109
#> 2 pepti… CL:00… A neuron t… http… <NULL> <named list> FALSE CL_0000110
#> 3 colum… CL:00… NA http… <list> <NULL> FALSE CL_0000112
#> 4 monon… CL:00… A vertebra… http… <NULL> <named list> FALSE CL_0000113
#> 5 surfa… CL:00… NA http… <list> <named list> FALSE CL_0000114
#> 6 endot… CL:00… An endothe… http… <list> <named list> TRUE CL_0000115
#> 7 pione… CL:00… Pioneer ne… http… <NULL> <named list> FALSE CL_0000116
#> 8 CNS n… CL:00… NA http… <NULL> <NULL> TRUE CL_0000117
#> 9 baske… CL:00… Basket cel… http… <NULL> <named list> TRUE CL_0000118
#> 10 cereb… CL:00… Large intr… http… <list> <named list> FALSE CL_0000119
#> # ℹ 2,724 more rows
#> # ℹ 5 more variables: in_subset <list>, obo_definition_citation <list>,
#> # obo_xref <list>, obo_synonym <list>, `_links` <list>
CL0002494 <- get_term("cl", "CL:0002494")
CL0002494 |>
    glimpse()
#> Rows: 1
#> Columns: 22
#> $ obo_id <chr> "CL:0002494"
#> $ label <chr> "cardiocyte"
#> $ description <chr> "A cell located in the heart, including both m…
#> $ iri <chr> "http://purl.obolibrary.org/obo/CL_0002494"
#> $ lang <chr> "en"
#> $ synonyms <list> ["heart cell"]
#> $ annotation <list> [["https://orcid.org/0000-0003-1980-3228"], ["…
#> $ ontology_name <chr> "cl"
#> $ ontology_prefix <chr> "CL"
#> $ ontology_iri <chr> "http://purl.obolibrary.org/obo/cl.owl"
#> $ is_obsolete <lgl> FALSE
#> $ term_replaced_by <lgl> NA
#> $ is_defining_ontology <lgl> TRUE
#> $ has_children <lgl> TRUE
#> $ is_root <lgl> FALSE
#> $ short_form <chr> "CL_0002494"
#> $ in_subset <lgl> NA
#> $ obo_definition_citation <list> [["A cell located in the heart, including both…
#> $ obo_xref <list> [["BTO", "0001539", <NULL>, "http://purl.oboli…
#> $ obo_synonym <lgl> NA
#> $ is_preferred_root <lgl> FALSE
#> $ `_links` <list> [["https://www.ebi.ac.uk/ols4/api/ontologies/c…
get_parents("cl", "CL:0002350")
#> # A tibble: 1 × 14
#> obo_id label description iri synonyms annotation has_children is_root
#> <chr> <chr> <chr> <chr> <lgl> <list> <lgl> <lgl>
#> 1 CL:0010008 cardi… NA http… NA <named list> TRUE FALSE
#> # ℹ 6 more variables: short_form <chr>, in_subset <list>,
#> # obo_definition_citation <lgl>, obo_xref <lgl>, obo_synonym <lgl>,
#> # `_links` <list>
get_ancestors("cl", "CL:0002494")
#> # A tibble: 1 × 14
#> obo_id label description iri synonyms annotation has_children is_root
#> <chr> <chr> <chr> <chr> <list> <list> <lgl> <lgl>
#> 1 CL:0000000 cell A material … http… <NULL> <named list> TRUE FALSE
#> # ℹ 6 more variables: short_form <chr>, in_subset <list>,
#> # obo_definition_citation <list>, obo_xref <list>, obo_synonym <list>,
#> # `_links` <list>
get_ancestors("cl", "CL:0002350")
#> # A tibble: 8 × 14
#> obo_id label description iri synonyms annotation has_children is_root
#> <chr> <chr> <chr> <chr> <list> <list> <lgl> <lgl>
#> 1 CL:0010008 cardi… NA http… <NULL> <named list> TRUE FALSE
#> 2 CL:0000115 endot… An endothe… http… <list> <named list> TRUE FALSE
#> 3 CL:0000213 linin… A cell wit… http… <list> <NULL> TRUE FALSE
#> 4 CL:0000215 barri… A cell who… http… <NULL> <NULL> TRUE FALSE
#> 5 CL:0000000 cell A material… http… <NULL> <named list> TRUE FALSE
#> 6 CL:0002078 meso-… Epithelial… http… <list> <named list> TRUE FALSE
#> 7 CL:0000066 epith… A cell tha… http… <list> <named list> TRUE FALSE
#> 8 CL:0002494 cardi… A cell loc… http… <list> <named list> TRUE FALSE
#> # ℹ 6 more variables: short_form <chr>, in_subset <list>,
#> # obo_definition_citation <list>, obo_xref <list>, obo_synonym <list>,
#> # `_links` <list>
get_children("cl", "CL:0002494")
#> # A tibble: 12 × 14
#> obo_id label description iri synonyms annotation has_children is_root
#> <chr> <chr> <chr> <chr> <list> <list> <lgl> <lgl>
#> 1 CL:2000022 card… Any native… http… <NULL> <named list> TRUE FALSE
#> 2 CL:1000147 card… A cell tha… http… <list> <NULL> TRUE FALSE
#> 3 CL:0011019 meso… A mesothel… http… <NULL> <named list> FALSE FALSE
#> 4 CL:0010020 card… NA http… <NULL> <NULL> FALSE FALSE
#> 5 CL:0010008 card… NA http… <NULL> <named list> TRUE FALSE
#> 6 CL:0010007 His-… NA http… <NULL> <NULL> TRUE FALSE
#> 7 CL:0008022 endo… A mesenchy… http… <NULL> <NULL> FALSE FALSE
#> 8 CL:0002592 smoo… A smooth m… http… <NULL> <named list> FALSE FALSE
#> 9 CL:0002548 fibr… A fibrobla… http… <list> <named list> TRUE FALSE
#> 10 CL:0000746 card… Cardiac mu… http… <list> <named list> TRUE FALSE
#> 11 CL:0000513 card… A precurso… http… <list> <named list> FALSE FALSE
#> 12 CL:1000309 epic… A fat cell… http… <list> <named list> TRUE FALSE
#> # ℹ 6 more variables: short_form <chr>, in_subset <list>,
#> # obo_definition_citation <list>, obo_xref <list>, obo_synonym <list>,
#> # `_links` <list>
get_children("cl", "CL:0002350") # no children, 0 x 0 tibble
#> # A tibble: 0 × 0
get_descendants("cl", "CL:0002494")
#> # A tibble: 68 × 14
#> obo_id label description iri synonyms annotation has_children is_root
#> <chr> <chr> <chr> <chr> <list> <list> <lgl> <lgl>
#> 1 CL:1000309 epic… A fat cell… http… <list> <named list> TRUE FALSE
#> 2 CL:1000310 adip… A fat cell… http… <list> <named list> FALSE FALSE
#> 3 CL:1000311 adip… A fat cell… http… <list> <named list> FALSE FALSE
#> 4 CL:0000513 card… A precurso… http… <list> <named list> FALSE FALSE
#> 5 CL:0000746 card… Cardiac mu… http… <list> <named list> TRUE FALSE
#> 6 CL:0000193 card… A striated… http… <NULL> <NULL> FALSE FALSE
#> 7 CL:0002086 spec… A cardiac … http… <NULL> <named list> TRUE FALSE
#> 8 CL:0002068 Purk… Specialize… http… <list> <named list> TRUE FALSE
#> 9 CL:1000483 Purk… A Purkinje… http… <NULL> <named list> FALSE FALSE
#> 10 CL:1000484 Purk… A Purkinje… http… <NULL> <named list> FALSE FALSE
#> # ℹ 58 more rows
#> # ℹ 6 more variables: short_form <chr>, in_subset <list>,
#> # obo_definition_citation <list>, obo_xref <list>, obo_synonym <list>,
#> # `_links` <list>
get_descendants("cl", "CL:0002350") # no descendants, 0 x 0 tibble
#> # A tibble: 0 × 0