Getting started with swisspalmR

The swisspalmR package makes data from SwissPalm, a database on protein S-palmitoylation, more accessible to R users. It does so by using httr2 to mimic the HTTP requests that a browser usually sends to SwissPalm. Check out the vignette on implementation details for more information.

The SwissPalm website

SwissPalm data is easily accessed through web browsers, with a simple interface. Users may perform several actions:

  • Use the central text bar and ‘Search’ button to get palmitoylation data for a few protein/gene names, separated by commas
  • Use the ‘Batch search’ button to upload a file with many gene identifiers at once
  • Use the ‘Palmitoyl-proteome comparison’ button to check a user’s protein/gene list against published palmitoyl-proteomes for a variety of species.

The header bar contains further options, which you can explore on the SwissPalm website.

The SwissPalm database is an excellent source of information for researchers interested in S-palmitoylation, but until now it has not been easily accessible from R. The swisspalmR package resolves this by making the same HTTP requests that the SwissPalm website uses to submit queries and retrieve S-palmitoylation data in web browsers.

Using swisspalmR

To use swisspalmR, install the package from GitHub.

# install.packages("devtools")
devtools::install_github("simpar1471/swisspalmR")

At present, swisspalmR has one function, swissPalm(), which retrieves the protein-level information on S-palmitoylation that is usually accessible from https://www.swisspalm.org/proteins. This function returns a 25-column data frame with protein information (if available), which includes:

  • Which organism the queried gene/protein ID is from
  • Whether the protein/gene associated with a queried ID is reviewed in UniProt
  • Whether the protein is palmitoylated, and if so:
    • At which residues palmitoylation occurs
    • The number of cysteine residues
    • Which PATs/APTs palmitoylate/depalmitoylate the queried protein

This and other information returned by swissPalm() is useful for determining the pattern of palmitoylation in your own samples.

To get data on S-palmitoylation without restricting the search by species or dataset, provide a character vector of protein identifiers to swissPalm(). These identifiers should be in one of the formats listed below:

Valid formats
  • UniProt AC
  • UniProt secondary AC
  • UniProt ID
  • UniProt gene name
  • Ensembl protein
  • Ensembl gene
  • Refseq protein ID
  • IPI ID
  • UniGene ID
  • PomBase ID
  • MGI ID
  • RGD ID
  • TAIR protein ID
  • EuPathDb ID

Let’s say you have a UniProt accession, an Ensembl protein ID, and a gene name. As SwissPalm recognises all of these, swissPalm() will give you information on each. Where a single gene name (e.g. "CALML5") maps onto multiple proteins, each of these proteins gets its own row.

inputs <- c("Q4WCM2", "ENSP00000453745", "CALML5")
# Include only first five columns to restrict printed output
swissPalm(inputs)[, 1:5]
#>   Query_identifier UniProt_AC   UniProt_ID UniProt_status
#> 1  ENSP00000453745     P08912   ACM5_HUMAN       Reviewed
#> 2           CALML5     Q9NZT1  CALL5_HUMAN       Reviewed
#> 3           CALML5     E2RHU8 E2RHU8_CANLF     Unreviewed
#> 4           CALML5     A4IFQ6 A4IFQ6_BOVIN     Unreviewed
#> 5           Q4WCM2     Q4WCM2 Q4WCM2_ASPFU     Unreviewed
#> 6           CALML5     F7HMI0 F7HMI0_MACMU     Unreviewed
#> 7           CALML5     F6ZF46 F6ZF46_HORSE     Unreviewed
#> 8           CALML5     G1T3Q4 G1T3Q4_RABIT     Unreviewed
#> 9           CALML5     F1RYW2   F1RYW2_PIG     Unreviewed
#>                                                        Organism
#> 1                                                  Homo sapiens
#> 2                                                  Homo sapiens
#> 3                                              Canis familiaris
#> 4                                                    Bos taurus
#> 5 Neosartorya fumigata (strain CEA10 / CBS 144.89 / FGSC A1163)
#> 6                                                Macaca mulatta
#> 7                                                Equus caballus
#> 8                                         Oryctolagus cuniculus
#> 9                                                    Sus scrofa
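
To see at a glance which queries mapped onto several proteins, you can tabulate the Query_identifier column. The sketch below rebuilds two columns of the output above by hand so that it runs without contacting SwissPalm; in practice you would use the data frame returned by swissPalm() directly. The object names res, hits_per_query and multi_hit are purely illustrative.

```r
# Two columns of the output shown above, typed in by hand (an assumption made
# so that this sketch works offline; normally `res <- swissPalm(inputs)`).
res <- data.frame(
  Query_identifier = c("ENSP00000453745", "CALML5", "CALML5", "CALML5",
                       "Q4WCM2", "CALML5", "CALML5", "CALML5", "CALML5"),
  UniProt_AC = c("P08912", "Q9NZT1", "E2RHU8", "A4IFQ6", "Q4WCM2",
                 "F7HMI0", "F6ZF46", "G1T3Q4", "F1RYW2")
)

# Count how many rows (proteins) each query identifier produced
hits_per_query <- table(res$Query_identifier)

# Keep only the identifiers that mapped onto more than one protein
multi_hit <- names(hits_per_query)[hits_per_query > 1]
multi_hit
```

Here only the gene name maps onto more than one protein; the accession and the Ensembl protein ID each yield a single row.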

Any query identifiers not found in SwissPalm will have NA in most columns, but the Found_in_SwissPalm column will tell you why they were not found in the SwissPalm search. Note that this column is not among the first five columns printed below:

inputs <- c("Q4WCM2", "BAD_ID")
swissPalm(inputs)[, 1:5]
#>   Query_identifier UniProt_AC   UniProt_ID UniProt_status
#> 1           Q4WCM2     Q4WCM2 Q4WCM2_ASPFU     Unreviewed
#> 2           BAD_ID       <NA>         <NA>           <NA>
#>                                                        Organism
#> 1 Neosartorya fumigata (strain CEA10 / CBS 144.89 / FGSC A1163)
#> 2                                                          <NA>
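
A quick way to separate matched from unmatched queries is to test for NA in one of the UniProt columns. This is a minimal sketch that uses NA in UniProt_AC as a proxy for "not found", since the values stored in Found_in_SwissPalm are not shown here. The data frame is rebuilt by hand from the output above so that the code runs offline, and the names res, found and not_found are illustrative.

```r
# Hand-typed copy of the output shown above (an assumption made for an
# offline example; normally `res <- swissPalm(inputs)`).
res <- data.frame(
  Query_identifier = c("Q4WCM2", "BAD_ID"),
  UniProt_AC       = c("Q4WCM2", NA),
  UniProt_ID       = c("Q4WCM2_ASPFU", NA),
  UniProt_status   = c("Unreviewed", NA)
)

# Rows with a UniProt accession are treated as matched queries
found <- res[!is.na(res$UniProt_AC), ]

# Identifiers whose row has no UniProt accession are treated as unmatched
not_found <- res$Query_identifier[is.na(res$UniProt_AC)]
not_found
```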

Limiting your searches

Sometimes, you may want to limit the search space to only the species against which you are checking, or to only check for data from specific sources. The species and dataset parameters in swissPalm() let you do this.

You can check the package objects swisspalmR::species or swisspalmR::datasets to see what the available species/dataset parameter values are:

# There are 92 species available as of Sep 12th 2023
swisspalmR::species[1:5]
#> All species in SwissPalm             Homo sapiens             Mus musculus 
#>                       ""                      "1"                      "2" 
#>        Rattus norvegicus Saccharomyces cerevisiae 
#>                      "3"                      "6"
# There are 7 datasets available as of Sep 12th 2023
swisspalmR::datasets
#>                                                                                  Dataset 1: All proteins 
#>                                                                                                    "all" 
#>                                                        Dataset 2: Proteins predicted to be palmitoylated 
#>                                                                                                   "pred" 
#>      Dataset 3: Palmitoylation validated or found in at least 1 palmitoyl-proteome (SwissPalm annotated) 
#>                                                                                                   "palm" 
#>                                                             Dataset 4: Palmitoylation validated proteins 
#>                                                                                                   "targ" 
#> Dataset 5: Palmitoylation validated proteins or found in palmitoyl-proteomes using 2 independent methods 
#>                                                                                                   "meth" 
#>                                      Dataset 6: Found in palmitoyl-proteomes using 2 independent methods 
#>                                                                                                  "meth2" 
#>                                                                     Dataset 7: Dataset 6 grouped by gene 
#>                                                                                      "validated_dataset"
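
Because both objects are printed as named character vectors, you can search their names rather than scanning the full list by eye. The sketch below copies the five species entries printed above into a stand-in vector, species_subset (a hypothetical name), so that it runs without the package installed; with swisspalmR loaded, the same calls should work on swisspalmR::species.

```r
# Stand-in for swisspalmR::species, limited to the five entries printed above
species_subset <- c(
  "All species in SwissPalm" = "",
  "Homo sapiens"             = "1",
  "Mus musculus"             = "2",
  "Rattus norvegicus"        = "3",
  "Saccharomyces cerevisiae" = "6"
)

# Exact lookup by full species name
species_subset["Homo sapiens"]

# Partial, case-insensitive lookup when you only remember part of the name
matches <- species_subset[grepl("mus", names(species_subset), ignore.case = TRUE)]
matches
```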

For example, a gene name can be shared by many species. I could filter the results of a SwissPalm search down to a species of interest, such as horses, by telling swissPalm() I want only that data:

gene_names <- c("ADAMTS1", "AGL", "ANGPTL4", "CALML5", "CEP131", "CD70", "CD97")
swissPalm(gene_names, species = swisspalmR::species["Equus caballus"])[, 1:5]
#>   Query_identifier UniProt_AC   UniProt_ID UniProt_status       Organism
#> 1              AGL     A8BQB4    GDE_HORSE       Reviewed Equus caballus
#> 2             CD70     F6XHL7 F6XHL7_HORSE     Unreviewed Equus caballus
#> 3           CALML5     F6ZF46 F6ZF46_HORSE     Unreviewed Equus caballus
#> 4          ADAMTS1       <NA>         <NA>           <NA>           <NA>
#> 5          ANGPTL4       <NA>         <NA>           <NA>           <NA>
#> 6             CD97       <NA>         <NA>           <NA>           <NA>
#> 7           CEP131       <NA>         <NA>           <NA>           <NA>

I can further restrict the search with the dataset parameter, so that SwissPalm only returns information for proteins that were predicted to be S-palmitoylated:

gene_names <- c("ADAMTS1", "AGL", "ANGPTL4", "CALML5", "CEP131", "CD70", "CD97")
swissPalm(
  gene_names,
  species = swisspalmR::species["Equus caballus"],
  dataset = swisspalmR::datasets["Dataset 2: Proteins predicted to be palmitoylated"]
)[, 1:5]
#>   Query_identifier UniProt_AC   UniProt_ID UniProt_status       Organism
#> 1              AGL     A8BQB4    GDE_HORSE       Reviewed Equus caballus
#> 2             CD70     F6XHL7 F6XHL7_HORSE     Unreviewed Equus caballus
#> 3          ADAMTS1       <NA>         <NA>           <NA>           <NA>
#> 4          ANGPTL4       <NA>         <NA>           <NA>           <NA>
#> 5           CALML5       <NA>         <NA>           <NA>           <NA>
#> 6             CD97       <NA>         <NA>           <NA>           <NA>
#> 7           CEP131       <NA>         <NA>           <NA>           <NA>
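
To check what a dataset filter removed, you can compare the matched identifiers from the two searches. The sketch below types in two columns from each of the horse outputs above so that it runs offline; hits() is a hypothetical helper, not part of swisspalmR, and it treats any row with a UniProt accession as a match.

```r
# Species filter only (hand-typed from the first horse example)
species_only <- data.frame(
  Query_identifier = c("AGL", "CD70", "CALML5", "ADAMTS1",
                       "ANGPTL4", "CD97", "CEP131"),
  UniProt_AC = c("A8BQB4", "F6XHL7", "F6ZF46", NA, NA, NA, NA)
)

# Species filter plus Dataset 2 (hand-typed from the second horse example)
predicted_only <- data.frame(
  Query_identifier = c("AGL", "CD70", "ADAMTS1", "ANGPTL4",
                       "CALML5", "CD97", "CEP131"),
  UniProt_AC = c("A8BQB4", "F6XHL7", NA, NA, NA, NA, NA)
)

# Hypothetical helper: identifiers that returned a UniProt accession
hits <- function(df) df$Query_identifier[!is.na(df$UniProt_AC)]

# Identifiers matched under the species filter but lost under the dataset filter
dropped <- setdiff(hits(species_only), hits(predicted_only))
dropped
```

In these outputs, CALML5 is the only query that has a horse protein in SwissPalm but no match in the predicted-palmitoylation dataset.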