Helper Functions to Process 'Hallmarks of Cancer' MSigDB Gene Sets
training_hallmarks.Rd
training_hallmarks()
retrieves the 'Hallmarks of
Cancer' gene sets from MSigDB.
Value
training_hallmarks()
creates a tibble with columns 'gene'
(Ensembl identifiers), 'set', and 'description' (a link to a
description of the set on the MSigDB web site).
Details
As is often the case, the Hallmarks of Cancer gene sets use NCBI ('Entrez') gene identifiers, but our data uses Ensembl gene identifiers. This function handles the tedious task of translating the gene sets from Entrez to Ensembl identifiers, using the Bioconductor 'org.Hs.eg.db' data resource and annotation functions in the AnnotationDbi package. The mapping between identifiers is not 1:1, so the number of genes in each Hallmark set differs depending on whether Entrez or Ensembl identifiers are used.
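The effect of a mapping that is not 1:1 on set sizes can be seen in a minimal base R sketch. The set name, identifiers, and lookup table below are made up for illustration. In practice a lookup table of this shape could be obtained with `AnnotationDbi::select()` on `org.Hs.eg.db` (with `keytype = "ENTREZID"` and `columns = "ENSEMBL"`). This is not necessarily how `training_hallmarks()` is implemented.

```r
# Hypothetical gene set keyed by Entrez-style identifiers (made-up values).
gene_set <- data.frame(
    set = "HALLMARK_EXAMPLE",
    entrez = c("101", "102", "103"),
    stringsAsFactors = FALSE
)

# Hypothetical lookup table: "101" maps to three Ensembl-style ids,
# "102" maps to one, and "103" has no mapping at all.
id_map <- data.frame(
    entrez = c("101", "101", "101", "102"),
    ensembl = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
    stringsAsFactors = FALSE
)

# Inner join on the Entrez id: unmapped ids drop out, and ids with
# several matches contribute several rows.
translated <- merge(gene_set, id_map, by = "entrez")
translated <- unique(translated[, c("ensembl", "set")])

# Set size before and after translation.
n_entrez <- length(unique(gene_set$entrez))
n_ensembl <- length(unique(translated$ensembl))
```

Here the set holds three Entrez-style identifiers but four Ensembl-style identifiers after translation, because one identifier expands to three and another is lost.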
Examples
training_hallmarks()
#> visit 'https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp'
#> to register for use of the MSigDb 'hallmarks' dataset.
#> 'select()' returned 1:many mapping between keys and columns
#> # A tibble: 8,220 × 3
#>    gene            set                                      description
#>    <chr>           <chr>                                    <chr>
#>  1 ENSG00000000938 HALLMARK_ALLOGRAFT_REJECTION             http://www.gsea-msi…
#>  2 ENSG00000000971 HALLMARK_INTERFERON_GAMMA_RESPONSE       http://www.gsea-msi…
#>  3 ENSG00000000971 HALLMARK_COMPLEMENT                      http://www.gsea-msi…
#>  4 ENSG00000000971 HALLMARK_COAGULATION                     http://www.gsea-msi…
#>  5 ENSG00000000971 HALLMARK_KRAS_SIGNALING_UP               http://www.gsea-msi…
#>  6 ENSG00000001084 HALLMARK_MTORC1_SIGNALING                http://www.gsea-msi…
#>  7 ENSG00000001084 HALLMARK_XENOBIOTIC_METABOLISM           http://www.gsea-msi…
#>  8 ENSG00000001084 HALLMARK_GLYCOLYSIS                      http://www.gsea-msi…
#>  9 ENSG00000001084 HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY http://www.gsea-msi…
#> 10 ENSG00000001084 HALLMARK_HEME_METABOLISM                 http://www.gsea-msi…
#> # … with 8,210 more rows
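
Because the result has one row per gene and set pair, a gene can appear in several sets. The sketch below tabulates this using base R on a data frame rebuilt from the ten rows printed above. The name `hallmarks_head` is introduced here for illustration, and the 'description' column is left out because it is truncated in the printout.

```r
# The ten (gene, set) pairs shown in the printed output above.
hallmarks_head <- data.frame(
    gene = c(
        "ENSG00000000938",
        rep("ENSG00000000971", 4),
        rep("ENSG00000001084", 5)
    ),
    set = c(
        "HALLMARK_ALLOGRAFT_REJECTION",
        "HALLMARK_INTERFERON_GAMMA_RESPONSE",
        "HALLMARK_COMPLEMENT",
        "HALLMARK_COAGULATION",
        "HALLMARK_KRAS_SIGNALING_UP",
        "HALLMARK_MTORC1_SIGNALING",
        "HALLMARK_XENOBIOTIC_METABOLISM",
        "HALLMARK_GLYCOLYSIS",
        "HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY",
        "HALLMARK_HEME_METABOLISM"
    ),
    stringsAsFactors = FALSE
)

# Number of sets each gene belongs to (within these ten rows).
sets_per_gene <- table(hallmarks_head$gene)

# Number of genes in each set (within these ten rows).
genes_per_set <- table(hallmarks_head$set)
```

Applied to the full result, `table(training_hallmarks()$set)` should give the number of Ensembl genes in each Hallmark set, assuming the columns described under Value.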