Spotting a needle in a haystack

Following the successful implementation of Diamond’s fast_dp data processing pipeline, it became apparent that the bottleneck in structure analysis had shifted from data processing to the difference map calculation. While data processing may be performed ab initio, the difference map calculation requires the selection of an appropriate reference structure and preparation of the data. These interactive steps are followed by restrained refinement to generate an electron density map, which may then be inspected to determine the presence or absence of ligands. This process is straightforward for a few data sets but becomes challenging and rate-limiting in a high-throughput environment. Rapid feedback about ligand binding is valuable for guiding subsequent data collection, especially when many samples of a particular protein-ligand complex are available.

In response to this situation, a new pipeline, DIMPLE, has been developed to:

Select an isomorphous model based on a given reflection file
Prepare the data for refinement
Perform rigid body refinement, molecular replacement (if needed) and restrained refinement to generate an electron density map
Identify blobs of positive electron density and generate static images to simplify the inspection of the results

Figure 1: The workflow of DIMPLE.

DIMPLE’s workflow (Figure 1) is split into two phases: the selection of the reference model and the difference map calculation. The selection phase compares the models provided by the user with the output of fast_dp. The matching is based on the naming of the data, the pointgroup symmetry derived from the CRYST1 record in the coordinate file, and the unit cell constants. If the pointgroups from the model and from fast_dp are inconsistent, the corresponding coordinate file is ignored. If several coordinate files remain possible, the unit cell constants are compared and the closest matching file is selected. For orthorhombic lattices, permutations of a, b and c are allowed.
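
The selection logic described above can be sketched as follows. This is a minimal illustration, not DIMPLE's actual code: the Candidate class, the cell_mismatch metric (largest relative deviation in axis lengths or angles) and the orthorhombic flag are all assumptions made for the example, since the text does not state how "closest matching" is measured.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass
class Candidate:
    """Hypothetical record for one user-supplied model."""
    name: str
    pointgroup: str   # derived from the CRYST1 record
    cell: tuple       # (a, b, c, alpha, beta, gamma)


def cell_mismatch(data_cell, model_cell, orthorhombic=False):
    """Assumed metric: largest relative deviation between the two cells."""
    lengths, angles = data_cell[:3], data_cell[3:]
    m_lengths, m_angles = model_cell[:3], model_cell[3:]
    # For orthorhombic lattices, permutations of a, b and c are allowed.
    orderings = permutations(m_lengths) if orthorhombic else [tuple(m_lengths)]
    length_term = min(
        max(abs(d - m) / d for d, m in zip(lengths, ordering))
        for ordering in orderings
    )
    angle_term = max(abs(d - m) / 90.0 for d, m in zip(angles, m_angles))
    return max(length_term, angle_term)


def select_model(data_pointgroup, data_cell, candidates, orthorhombic=False):
    """Ignore models with an inconsistent pointgroup, then pick the closest cell."""
    compatible = [c for c in candidates if c.pointgroup == data_pointgroup]
    if not compatible:
        return None
    return min(
        compatible,
        key=lambda c: cell_mismatch(data_cell, c.cell, orthorhombic),
    )
```

With the permutation allowed, a model whose axes are listed in a different order can still be chosen as the closest match; without it, the same model would be rejected as a poor fit.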

Figure 2: Two views of a tartrate ion identified automatically through DIMPLE analysis of data from a thaumatin sample.

Once a coordinate file has been selected, the difference map may be calculated. Here DIMPLE first compares the origin choice from fast_dp with the coordinates, re-indexing as necessary, before copying the spacegroup from the CRYST1 record to the processed data file. Subsequently, structure factor amplitudes are calculated, after which rigid body refinement is performed to test the compatibility of the coordinates with the data. In some circumstances, the resulting Rcryst will indicate that the match is poor (i.e. > 40%), in which case a limited molecular replacement is used. In the following step, restrained refinement is applied to the result of either rigid body refinement or molecular replacement to generate refined coordinates and a density map for manual inspection. Finally, this map is searched for unmodelled “blobs” of density, which are then rendered as static images (Figures 2a and 2b).
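
The branching in this phase can be summarised in a short sketch. The step names below are descriptive labels invented for the example, not the names of real programs or DIMPLE functions; only the 40% Rcryst threshold comes from the text.

```python
# Rcryst above this value after rigid body refinement is treated as a poor match.
RCRYST_THRESHOLD = 0.40


def plan_steps(rigid_body_rcryst, threshold=RCRYST_THRESHOLD):
    """Return the ordered list of (hypothetically named) steps for one data set."""
    steps = [
        "reindex_if_needed",
        "copy_spacegroup_from_cryst1",
        "calculate_structure_factor_amplitudes",
        "rigid_body_refinement",
    ]
    # A poor match triggers a limited molecular replacement before refinement.
    if rigid_body_rcryst > threshold:
        steps.append("molecular_replacement")
    steps += ["restrained_refinement", "blob_search", "render_static_images"]
    return steps
```

For example, plan_steps(0.32) omits the molecular replacement step, whereas plan_steps(0.47) inserts it immediately before restrained refinement.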

For details on the DIMPLE pipeline, please contact [email protected] or [email protected]. For help with practicalities during data collection, please speak first to your local contact, who should be able to assist you.

Practicalities

How to name your model(s) and data:
To be considered as a candidate, the part of the reference structure’s file name before the first full stop must appear in either the file name template or the directory used for the data collection. For example, abc.pdb will match /dls/…/in1234-5/abc/xyz_1_0001.cbf or /dls/…/in1234-5/xyz/abc_1_0001.cbf.
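
As a rough illustration of this naming rule, the sketch below tests whether a model file would be considered for a given image path. The function names are hypothetical, and a plain substring test is assumed; the real pipeline may apply a stricter comparison.

```python
from pathlib import PurePosixPath


def model_key(model_filename):
    """Part of the model's file name before the first full stop."""
    return PurePosixPath(model_filename).name.split(".", 1)[0]


def is_candidate(model_filename, image_path):
    """True if the key appears in the image file name or in its directory."""
    key = model_key(model_filename)
    path = PurePosixPath(image_path)
    # Assumption: "appear in" is read as a simple substring match.
    return key in path.name or key in str(path.parent)
```

Under this reading, is_candidate("abc.pdb", "/dls/…/in1234-5/xyz/abc_1_0001.cbf") is true because abc occurs in the file name template, while a model named def.pdb would not be considered for the same image.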

Where to place the model(s):
All available models, ideally with ligands and waters removed, have to be placed in /dls/…/in1234-5/processing/pdb

Multiple space groups option:
In situations where multiple spacegroups are possible, the suggestion is to have native files corresponding to each of the likely spacegroups and to name these files name.spacegroup.pdb (e.g. abc.P212121.pdb).
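
A file name following this convention can be split into its parts as sketched below. The helper is hypothetical and assumes exactly the name.spacegroup.pdb pattern given above; it says nothing about how DIMPLE itself reads the names.

```python
def split_model_name(filename):
    """Return (name, spacegroup) for name.spacegroup.pdb, or (name, None)."""
    parts = filename.split(".")
    if len(parts) == 3 and parts[-1] == "pdb":
        return parts[0], parts[1]
    return parts[0], None
```

Here split_model_name("abc.P212121.pdb") gives the name abc and the spacegroup P212121, while a plain abc.pdb carries no spacegroup label.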

Where to find the results:
The results are available in your processed directory in either /dls/…/in1234-5/processed/xyz/fast_dp/dimple or /dls/…/in1234-5/processed/results/
