
af_predictions() retrieves information about AlphaFold predictions associated with UniProt accession identifiers.

af_prediction_view() summarizes effects of possible amino acid changes in a single UniProt protein. The changes are displayed on the AlphaFold-predicted structure.

af_colorfunc_by_position() generates a Javascript function to be used in r3dmol::m_set_style() to color residues by position, e.g., when visualizing median predicted pathogenicity.

Usage

af_predictions(uniprot_ids)

af_prediction_view(tbl, bfc = BiocFileCache())

af_colorfunc_by_position(
    tbl,
    pos,
    value,
    pos_max = NULL,
    palette = colorspace::diverging_hcl(11),
    palette_min = NULL,
    palette_max = NULL
)

Arguments

uniprot_ids

character() UniProt accession identifiers (uniprot_id in AlphaMissense tables).

tbl

A tibble containing information on the UniProt protein and AlphaMissense predicted amino acid effects.

For af_prediction_view() the tibble must have columns uniprot_id, protein_variant, am_pathogenicity, and am_class, as in tibbles returned by am_data("hg38") or am_data("aa_substitutions"), for instance. The uniprot_id column must contain a single unique value.

For af_colorfunc_by_position() the tibble must have columns pos and value, as described below.

bfc

An object created with BiocFileCache::BiocFileCache(), representing the location used to cache PDB files retrieved by af_prediction_view(). The default is the BiocFileCache installation-wide location.

pos

the symbol or name of the column in tbl containing amino acid residue positions in the protein.

value

the symbol or name of the column in tbl containing values to be used for coloring amino acid residues in the protein.

pos_max

integer(1) the maximum residue position in the protein to be visualized. Default: the maximum value in pos.

palette

character() vector of colors to be used in visualization. The default (colorspace::diverging_hcl(11)) produces colors ranging from blue (low) to red (high).

palette_min

numeric(1) the value bounding the minimum palette color. The default is the minimum of value; a common value when plotting pathogenicity might be 0.

palette_max

numeric(1) the value bounding the maximum palette color. The default is the maximum of value; a common value when plotting pathogenicity might be 1.

Value

af_predictions() returns a tibble. Each row represents the AlphaFold prediction associated with the corresponding UniProt accession. Columns include:

  • entryId: AlphaFold identifier.

  • gene: gene symbol corresponding to UniProt protein.

  • uniprotAccession, uniprotId, uniprotDescription: UniProt characterization. AlphaMissense's uniprot_id is AlphaFold's uniprotAccession.

  • taxId, organismScientificName: Organism information.

  • uniprotStart, uniprotEnd, uniprotSequence: protein sequence information.

  • modelCreatedDate, latestVersion, allVersions, isReviewed, isReferenceProteome: AlphaFold provenance information.

  • cifUrl, bcifUrl, pdbUrl: URLs to AlphaFold 3-dimensional molecular representations.

  • paeImageUrl, paeDocUrl: 'Predicted Aligned Error' heat map and underlying data. These can be used to assess the confidence in relative orientation of residues in different domains, as described in part in the AlphaFold FAQ https://alphafold.ebi.ac.uk/faq

af_prediction_view() displays an interactive view of the protein in an RStudio panel or browser tab.

af_colorfunc_by_position() returns a character(1) vector representation of the Javascript function, with color vector injected.

Details

af_predictions() queries the prediction endpoint of the AlphaFold API described at https://alphafold.ebi.ac.uk/api-docs.

af_prediction_view() uses tbl to calculate median pathogenicity at each amino acid position, using am_aa_pathogenicity(). Predicted protein structure is retrieved from the unique uniprot_id using af_predictions() and the pdbUrl returned by that function. Protein structure is visualized using the r3dmol package, https://cran.R-project.org/package=r3dmol. Amino acids are colored using aa_pathogenicity_median and af_colorfunc_by_position(), with the default palette defined on the interval [0, 1].
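
The per-position median described above can be illustrated with a small base R sketch. This is not the implementation of am_aa_pathogenicity(); it assumes that protein_variant strings have the form reference residue, position, alternate residue (for example "M1A"), and the toy data frame and its values are invented purely for illustration.

```r
## Toy data, invented for illustration only (not real AlphaMissense values).
variants <- data.frame(
    protein_variant = c("M1A", "M1C", "L2A", "L2C", "L2D"),
    am_pathogenicity = c(0.2, 0.4, 0.1, 0.5, 0.9)
)

## Assumption: the residue position is the run of digits in protein_variant.
variants$aa_pos <- as.integer(gsub("[^0-9]", "", variants$protein_variant))

## Median pathogenicity at each position, one row per position.
med <- aggregate(am_pathogenicity ~ aa_pos, data = variants, FUN = median)
med
```

A table like med, with one position column and one value column, has the shape that af_colorfunc_by_position() expects for its pos and value arguments.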

af_colorfunc_by_position() uses a template mechanism to inject a vector of position-specific colors into a Javascript function used by r3dmol::m_set_style() / r3dmol::m_style_cartoon() to color residues by position. Positions for which no color is specified are colored 'gray'. The template can be seen with AlphaMissenseR:::js_template("colorfunc").
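
One way to picture the mapping from values to position-specific colors is the base R sketch below. It is only an approximation of the idea, not the package's code: color_by_position() is a hypothetical helper, the blue-white-red palette from grDevices::colorRampPalette() stands in for colorspace::diverging_hcl(11), and the rounding rule used to pick a palette entry is an assumption.

```r
## Hypothetical helper, not part of AlphaMissenseR: one color per residue
## position, with 'gray' wherever no value is available.
color_by_position <- function(pos, value, pos_max = max(pos),
                              palette, palette_min = min(value),
                              palette_max = max(value)) {
    ## Rescale values to [0, 1], clamping anything outside the bounds.
    ## Assumes palette_max > palette_min.
    scaled <- (value - palette_min) / (palette_max - palette_min)
    scaled <- pmin(pmax(scaled, 0), 1)
    ## Assumed rule: nearest palette entry, indices 1..length(palette).
    idx <- 1L + floor(scaled * (length(palette) - 1L) + 0.5)
    colors <- rep("gray", pos_max)
    keep <- pos <= pos_max          # ignore positions beyond pos_max
    colors[pos[keep]] <- palette[idx[keep]]
    colors
}

## Stand-in palette running from blue (low) to red (high).
pal <- grDevices::colorRampPalette(c("blue", "white", "red"))(11)

## Positions 2..11 carry values; positions 1 and 12 have none.
colors <- color_by_position(2:11, 10:1 / 10, pos_max = 12, palette = pal)
colors
```

Setting palette_min = 0 and palette_max = 1 in such a helper would pin the palette ends to the full pathogenicity range instead of the observed minimum and maximum.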

Examples


## af_predictions

uniprot_ids <-
    am_data("aa_substitutions") |>
    dplyr::filter(uniprot_id %like% "P3555%") |>
    dplyr::distinct(uniprot_id) |>
    dplyr::pull(uniprot_id)
af_predictions(uniprot_ids)
#> * [09:25:11][info] 2 of 4 uniprot accessions not found
#>   'P35556', 'P35555'
#> # A tibble: 2 × 25
#>   entryId  gene  sequenceChecksum sequenceVersionDate uniprotAccession uniprotId
#>   <chr>    <chr> <chr>            <chr>               <chr>            <chr>    
#> 1 AF-P355… GCK   094D4A2F78096724 1994-06-01          P35557           HXK4_HUM…
#> 2 AF-P355… PCK1  78D309E0845CC181 2006-03-07          P35558           PCKGC_HU…
#> # ℹ 19 more variables: uniprotDescription <chr>, taxId <int>,
#> #   organismScientificName <chr>, uniprotStart <int>, uniprotEnd <int>,
#> #   uniprotSequence <chr>, modelCreatedDate <chr>, latestVersion <int>,
#> #   allVersions <list>, isReviewed <lgl>, isReferenceProteome <lgl>,
#> #   cifUrl <chr>, bcifUrl <chr>, pdbUrl <chr>, paeImageUrl <chr>,
#> #   paeDocUrl <chr>, amAnnotationsUrl <chr>, amAnnotationsHg19Url <chr>,
#> #   amAnnotationsHg38Url <chr>


## af_prediction_view()

P35557 <-
    am_data("hg38") |>
    dplyr::filter(uniprot_id == "P35557")
af_prediction_view(P35557)

## no AlphaFold prediction for this protein
P35555 <-
    am_data("aa_substitutions") |>
    dplyr::filter(uniprot_id == "P35555")
tryCatch({
    af_prediction_view(P35555)
}, error = identity)
#> <simpleError in af_prediction_view(P35555): 'af_prediction_view()' could not find UniProt accession 'P35555'>


## af_colorfunc_by_position()

df <- tibble(
    pos = 1 + 1:10, # no color information for position 1
    value = 10:1 / 10
)
colorfunc <- af_colorfunc_by_position(
    df,
    "pos", "value",
    pos_max = 12    # no color information for position 12
)
cat(colorfunc)
#> function(atom) {
#>     const residue_colors = [ 'gray', '#8E063B', '#AB5468', '#C18692', '#D2B0B6', '#DDD0D2', '#D2D3DC', '#B3B7CF', '#8C94BF', '#5D6CAE', '#023FA5', 'gray' ];
#>     return residue_colors[atom.resi];
#> }


## template used for Javascript function

cat(
    AlphaMissenseR:::js_template("colorfunc", colors = "..."),
    "\n"
)
#> function(atom) {
#>     const residue_colors = [ ... ];
#>     return residue_colors[atom.resi];
#> }