Scripting PDB-REDO
The PDB-REDO service offers a REST API to submit jobs and retrieve their results.
In order to access this service you will have to create an API token first. You can do that at the
https://pdb-redo.eu/token page.
You can write your own client code to access this API, but you can also use the basic Python script
located in the GitHub repository for this server. This Python script can be used to submit jobs,
retrieve their status and fetch the zipped results. Using it looks like this:
$ python ./pdb-redo.py submit --xyzin=1cbs.cif --hklin=1cbs.mtz --token-id=1 --token-secret=123456789a
Job submitted with id 17
$ python ./pdb-redo.py status --job-id=17 --token-id=1 --token-secret=123456789a
Job status is running
...
$ python ./pdb-redo.py status --job-id=17 --token-id=1 --token-secret=123456789a
Job status is ended
$ python ./pdb-redo.py fetch --job-id=17 --token-id=1 --token-secret=123456789a
$ ls
pdb-redo-result-17.zip ...
To use this Python script you have to download the following two files: PDBRedoAPIAuth.py and pdb-redo.py.
There are also Perl versions of these scripts, although they are somewhat more limited.
The files you need are lib/PDBRedo/Api.pm and pdb-redo.pl. Note that Api.pm needs to be
placed in a directory called lib/PDBRedo, relative to the directory containing the
pdb-redo.pl file.
There is also a JavaScript implementation available: PDBRedoRequest.js and pdb-redo.js.
PDB-REDO API
The API offers a simple CRUD interface to Create, Retrieve, Update and Delete jobs, which are
called runs in this API. All HTTP requests need a set of extra headers, which are described
in the section Security.
The API end points are as follows:
GET https://pdb-redo.eu/api/run
-
This call will return a JSON array of JobInfo objects for the
jobs that are currently known.
POST https://pdb-redo.eu/api/run
-
This call will create a new job. The body of the request can contain these parameters:
- mtz-file
- A file containing the reflection data.
- pdb-file
- A file containing the coordinates of the atoms. This file should preferably be in mmCIF
format. The legacy PDB format is also supported, although support for it is limited.
- restraints-file
- A file containing additional restraint information. This parameter is optional.
- sequence-file
- A file containing the sequence of the protein in the coordinates file, in FASTA format. This
parameter is optional.
- parameters
- A JSON object containing additional parameters. This parameter is optional. See JobParams below.
GET https://pdb-redo.eu/api/run/{id}
-
This call will return the JobInfo object for the job with the specified
ID.
GET https://pdb-redo.eu/api/run/{id}/output
-
This call will return an array of strings, specifying the output files for the job with the
specified ID.
GET https://pdb-redo.eu/api/run/{id}/output/{file}
-
This call will return the contents of the specified file for the job with the specified ID.
GET https://pdb-redo.eu/api/run/{id}/output/zipped
-
This call will return all output files for the job with the specified ID in a ZIP archive.
DELETE https://pdb-redo.eu/api/run/{id}
-
This call will delete all files associated with the job with the specified ID from the server.
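As a quick reference, the end points listed above can be captured in a small lookup helper. The following Python sketch is an illustration only: the function name run_endpoint and the action labels are invented for this example and are not part of the API or of the example scripts. The helper only builds the HTTP method and URL. A real request still needs the headers described in the section Security.

```python
from urllib.parse import quote

BASE_URL = "https://pdb-redo.eu/api/run"

# Hypothetical action labels mapped to (HTTP method, path template).
# The methods and paths are the ones listed in this section.
_ENDPOINTS = {
    "list":   ("GET", ""),
    "create": ("POST", ""),
    "info":   ("GET", "/{id}"),
    "output": ("GET", "/{id}/output"),
    "file":   ("GET", "/{id}/output/{file}"),
    "zipped": ("GET", "/{id}/output/zipped"),
    "delete": ("DELETE", "/{id}"),
}


def run_endpoint(action, job_id=None, file=None):
    """Return (method, url) for one of the run end points."""
    method, template = _ENDPOINTS[action]
    if "{id}" in template and job_id is None:
        raise ValueError(f"action {action!r} needs a job_id")
    if "{file}" in template and file is None:
        raise ValueError(f"action {action!r} needs a file name")
    # Percent-encode the file name so it is safe inside a URL path.
    safe_file = quote(file) if file is not None else None
    return method, BASE_URL + template.format(id=job_id, file=safe_file)


if __name__ == "__main__":
    print(run_endpoint("list"))
    print(run_endpoint("zipped", job_id=17))
```

Keeping the table in one place makes it easy to check a client against the list of end points.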
Data types
The API service uses the following data types in JSON format.
JobParams
-
A JSON object containing one member object called parameters. This
parameters object can contain the following fields. Each has a boolean value unless
otherwise specified.
- noloops
- Do not try to add missing loops.
- nofixdmc
- Do not add missing backbone atoms.
- nopepflip
- No peptide flips are performed.
- noscbuild
- Side chains will not be rebuilt.
- nocentrifuge
- Waters with poor density will not be deleted.
- nosugarbuild
- No (re)building of carbohydrates.
- norebuild
- All rebuilding steps are skipped.
- tighter
- Try tighter restraints than usual. The value should be a positive integer that indicates
the increase in tightness.
- looser
- Try looser restraints than usual. The value should be a positive integer that indicates the
increase in looseness.
- nometalrest
- Do not generate special metal restraints.
- nohomology
- Do not use homology-based restraints.
- homology
- Force homology-based restraints irrespective of data resolution.
- nonucrest
- Do not use nucleic acid restraints.
- hbondrest
- Use hydrogen bond restraints.
- noncs
- No NCS restraints are applied.
- nojelly
- Switch off jelly body refinement.
- paired
- Force paired refinement.
- crossval
- Performs (very lengthy) k-fold cross validation on the final results.
- isotropic
- Force isotropic B-factors.
- notwin
- No detwinning is performed.
- notls
- No TLS refinement is performed.
- notlsupdate
- Use TLS, but do not update the tensors in the final refinement.
- noocc
- Do not refine occupancies.
- nohyd
- Do not add hydrogens (in riding positions) during refinement.
- legacy
- For legacy PDB entries. The initial R-factor is not checked and the number of refinement cycles is
increased (a lot).
- newmodel
- Always take an updated model from the re-refinement for the rebuilding steps. Only use this
option when all else fails.
- maxres
- Cut the resolution at the given value. The value should be a real number.
- noanomalous
- Ignore all anomalous data if Fmean or Imean are available.
- fewrefs
- Deals with very small data sets by switching off R-free set sanity checks.
- intens
- Force the use of the intensities from the reflection file.
- notruncate
- Do not use truncate to convert intensities to amplitudes.
- nosigma
- Do not use sigF or sigI for scaling.
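To illustrate the shape of a JobParams object, the following Python sketch builds the JSON text from keyword arguments and checks each value against the types given in the list above. The function name make_job_params is invented for this example. The server-side validation may differ from what is shown here.

```python
import json

# Field names taken from the list above, grouped by the type of their value.
BOOLEAN_FIELDS = {
    "noloops", "nofixdmc", "nopepflip", "noscbuild", "nocentrifuge",
    "nosugarbuild", "norebuild", "nometalrest", "nohomology", "homology",
    "nonucrest", "hbondrest", "noncs", "nojelly", "paired", "crossval",
    "isotropic", "notwin", "notls", "notlsupdate", "noocc", "nohyd",
    "legacy", "newmodel", "noanomalous", "fewrefs", "intens",
    "notruncate", "nosigma",
}
INTEGER_FIELDS = {"tighter", "looser"}   # positive integers
REAL_FIELDS = {"maxres"}                 # real number


def make_job_params(**fields):
    """Return the JSON text of a JobParams object (hypothetical helper)."""
    for name, value in fields.items():
        is_bool = isinstance(value, bool)
        if name in BOOLEAN_FIELDS:
            ok = is_bool
        elif name in INTEGER_FIELDS:
            ok = not is_bool and isinstance(value, int) and value > 0
        elif name in REAL_FIELDS:
            ok = not is_bool and isinstance(value, (int, float))
        else:
            raise ValueError(f"unknown parameter: {name}")
        if not ok:
            raise ValueError(f"bad value for {name}: {value!r}")
    # The fields live inside a single member object called "parameters".
    return json.dumps({"parameters": fields})


if __name__ == "__main__":
    print(make_job_params(nohyd=True, tighter=2, maxres=1.8))
```

The resulting string is what would be sent as the parameters part of the POST request that creates a job.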
JobInfo
-
This object contains all publicly available information about a job.
- id
- The ID of the job. This is an integer.
- status
- A string describing the status of the job. The value can be one of these:
undefined, registered, starting, queued, running, stopping, stopped, ended, deleting.
Of these, ended indicates the job has finished successfully.
- date
- The submission date and time of the job.
- started-date
- The date and time at which the job started.
- score
- A JSON object describing the results of the PDB-REDO run.
- input
- An array containing the file names of the input files.
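A client typically polls the status field of a JobInfo object until the job has ended. The Python sketch below shows one way to do that. The names wait_for_job and fetch_job_info are invented for this example: fetch_job_info stands for any callable that returns the JobInfo of a job as a dictionary. Treating stopped and deleting as states from which the job will no longer reach ended is an assumption made here, not something the API description states.

```python
import time

# Status values as listed in the JobInfo description.
KNOWN_STATUSES = {
    "undefined", "registered", "starting", "queued", "running",
    "stopping", "stopped", "ended", "deleting",
}
# Assumption: a job in one of these states will not reach "ended" anymore.
ASSUMED_DEAD_ENDS = {"stopped", "deleting"}


def wait_for_job(fetch_job_info, interval=30.0, max_polls=1000, sleep=time.sleep):
    """Poll until the job status is 'ended' and return the final JobInfo.

    fetch_job_info is a hypothetical callable returning a JobInfo dict.
    """
    for _ in range(max_polls):
        info = fetch_job_info()
        status = info["status"]
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if status == "ended":          # finished successfully
            return info
        if status in ASSUMED_DEAD_ENDS:
            raise RuntimeError(f"job {info.get('id')} is {status}")
        sleep(interval)
    raise TimeoutError("job did not end within the allowed number of polls")


if __name__ == "__main__":
    # Simulated sequence of JobInfo objects instead of real API calls.
    states = iter([{"id": 17, "status": s} for s in ("queued", "running", "ended")])
    print(wait_for_job(lambda: next(states), interval=0))
```

Passing the sleep function as a parameter keeps the helper easy to test without waiting.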
Security
A request to the PDB-REDO API needs to have a set of headers in order to get access. The
design of these headers is similar to the way requests are signed for Amazon Web Services.
To start with, we create a so-called canonical request. This request consists of the following
strings, separated by newline characters:
- HTTP method
- URI Path
- Query string
- Host with port
- Content hash
The content hash is a base64-encoded SHA256 hash of the payload of the HTTP message. From this
canonical request we create a canonical request hash. We then create the string-to-sign. This
string consists of the following substrings, separated by newlines:
- The string 'PDB-REDO-api'
- A timestamp, ISO formatted
- A credentials string
- The canonical request hash as specified above
Here the credentials string is composed of the token-id, a date string in the format YYYYMMDD
and the string 'pdb-redo-api', separated by slash characters ('/').
A key is now created using the HMAC<SHA256> protocol where the message is the YYYYMMDD
date string and the key is the concatenation of the string 'PDB-REDO' with the token-secret.
The 'string-to-sign' is then signed with this key, again with the HMAC<SHA256> protocol and then
base64 encoded.
Now we can assemble the Authorization header:
PDB-REDO-api Credential=${credential},SignedHeaders=host;x-pdb-redo-content-sha256,Signature=${signature}
Additionally, a header named X-PDB-REDO-Date is added, containing the timestamp string as its
value.
See the example scripts or the AWS documentation for more information.
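The signing steps above can be sketched in Python using only the standard library. This is an illustration under several assumptions, not the reference implementation (see PDBRedoAPIAuth.py for that). The function name sign_request is invented here. The exact timestamp layout, the encoding of the canonical request hash (taken to be a base64-encoded SHA256 hash, like the content hash) and the question of whether a default port must be appended to the host are not specified in the text above, so the choices made in the code are guesses.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import urlsplit


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def _b64_sha256(data: bytes) -> str:
    return _b64(hashlib.sha256(data).digest())


def sign_request(method, url, body, token_id, token_secret, now=None):
    """Return the two headers described above (hypothetical helper).

    body is the payload as bytes; now, if given, should be a UTC datetime.
    """
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")  # assumed ISO layout
    date = now.strftime("%Y%m%d")

    # Canonical request: method, path, query, host, content hash.
    parts = urlsplit(url)
    host = parts.netloc  # assumption: used exactly as it appears in the URL
    canonical_request = "\n".join(
        [method, parts.path, parts.query, host, _b64_sha256(body)])
    # Assumption: hashed and encoded the same way as the content hash.
    canonical_request_hash = _b64_sha256(canonical_request.encode("utf-8"))

    credential = f"{token_id}/{date}/pdb-redo-api"
    string_to_sign = "\n".join(
        ["PDB-REDO-api", timestamp, credential, canonical_request_hash])

    # Key: HMAC-SHA256 with key 'PDB-REDO' + token-secret and the date as message.
    key = hmac.new(("PDB-REDO" + token_secret).encode("utf-8"),
                   date.encode("utf-8"), hashlib.sha256).digest()
    signature = _b64(hmac.new(key, string_to_sign.encode("utf-8"),
                              hashlib.sha256).digest())

    return {
        "Authorization": (
            f"PDB-REDO-api Credential={credential},"
            "SignedHeaders=host;x-pdb-redo-content-sha256,"
            f"Signature={signature}"),
        "X-PDB-REDO-Date": timestamp,
    }


if __name__ == "__main__":
    # Token values taken from the command line example earlier in this document.
    for name, value in sign_request(
            "GET", "https://pdb-redo.eu/api/run", b"", "1", "123456789a").items():
        print(f"{name}: {value}")
```

The returned dictionary can be merged into the headers of a request made with any HTTP client. Because the unspecified details are guesses, compare the output with that of the example scripts before relying on it.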