22 May 2017

Extracting more information from crystallographic data

The current poll on how much structural information is needed for fragment optimization is still open - if you haven't done so already, please vote on the right-hand side. Last week we discussed new developments in NMR. This week we turn to crystallography.

Fragment screening by crystallography is a little like finding needles in haystacks. Typically, dozens or hundreds of crystals are individually soaked with one or more fragments. Diffraction data gathered from each crystal are used to generate electron density maps, which are iteratively refined by tweaking the conformation of side chains and adding water molecules. In theory, any unexplained electron density that remains after refinement should correspond to bound fragments.

In practice, manually inspecting so many data sets can be both tedious and subjective. Although a narrow focus on the active site reduces the amount of work, doing so risks missing the many fragments that bind at interesting secondary sites. Also, because fragments have low affinities, they may only bind to a fraction of the protein molecules in the crystal; this "partial occupancy" lowers the signal-to-noise ratio. And fragments sometimes bind in more than one conformation, thereby smearing out the electron density and further reducing the signal.

Of course, even though crystallographic fragment screening can give very high hit rates, most crystals will not have bound fragments. In a new paper in Nat. Comm., Frank von Delft at the Structural Genomics Consortium and collaborators at several institutions describe how these "empty" structures can be turned from lemons into lemonade.

The method, called Pan-Dataset Density Analysis (PanDDA), is essentially a form of background correction. Dozens of datasets from empty crystals are averaged and computationally subtracted from a dataset of interest. This averaging gives much cleaner maps, allowing fragments to be more rapidly and easily detected. It’s almost as if you could subtract all the hay from a haystack to reveal any needles.
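To make the hay-subtraction idea concrete, here is a toy sketch in Python with NumPy. It is not the PanDDA implementation: the function name `background_subtracted_map`, the grid size, the noise level, and the synthetic "maps" are all assumptions made up for illustration, and real maps would first need to be aligned and scaled onto a common grid.

```python
import numpy as np


def background_subtracted_map(empty_maps, target_map):
    """Subtract the voxel-wise mean of the 'empty' maps from the target map.

    Hypothetical helper for illustration only; it assumes all maps are
    already aligned on the same grid and on the same scale.
    """
    mean_map = np.mean(np.asarray(empty_maps, dtype=float), axis=0)
    return np.asarray(target_map, dtype=float) - mean_map


# Synthetic data: a shared "protein" density (the hay) plus small noise.
rng = np.random.default_rng(0)
shape = (8, 8, 8)
protein = rng.normal(size=shape)
empty_maps = [protein + 0.1 * rng.normal(size=shape) for _ in range(30)]

# The dataset of interest has the same hay plus one weak "needle".
target = protein + 0.1 * rng.normal(size=shape)
target[4, 4, 4] += 1.0

difference = background_subtracted_map(empty_maps, target)

# Location of the strongest feature left after subtraction.
peak = tuple(int(i) for i in np.unravel_index(np.argmax(difference), shape))
```

In this toy the needle is no larger than typical features of the protein density itself, so it is hard to pick out of the raw map, but it dominates the difference map once the averaged background has been removed.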

The researchers present four case studies of crystallographic fragment screens, each with more than 100 datasets, and the results are stunning: in one case manual inspection revealed just 2 fragment hits, both at a single site, while PanDDA revealed 24 fragments at 5 different sites!

One limitation of PanDDA is that it does require dozens of empty datasets – ideally more than 30. In a new paper in Acta Crystallogr. D Struct. Biol., Dorothee Liebschner at the Lawrence Berkeley National Laboratory and collaborators at other institutions describe an alternative approach suitable for lower throughput applications.

One common tool in crystallography is the OMIT map. Atoms in question (such as from a ligand) are omitted from the model, and the calculated electron density is then compared with the observed electron density; if observed density remains where the atoms were omitted, this suggests that the atoms really belong. Of course, there is no truly empty space in a crystal – solvent fills any space not occupied by protein or ligands. Typically this is accounted for by treating “bulk solvent” (i.e., water molecules not making specific interactions) as being present at a constant level of background electron density. The problem is that when calculating an OMIT map, this bulk solvent fills the space left by the omitted atoms and could obscure weak but real electron density.

To address this challenge, the researchers develop “polder OMIT maps,” named after land that is kept dry despite being below the surrounding water level. Essentially, the bulk solvent is not allowed into polder OMIT maps when they are generated, thus enhancing any actual density and allowing low-occupancy ligands to be observed. Several lovely figures in the paper illustrate that the process works well.
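The effect of keeping bulk solvent out of the omitted region can be illustrated with a one-dimensional cartoon in Python with NumPy. This is only a sketch of the idea as described here, not the authors' procedure: real maps are calculated from structure factors rather than by subtracting densities directly, and the helper `omit_difference`, the density values, and the site positions are all hypothetical.

```python
import numpy as np


def omit_difference(observed, protein, solvent_mask, solvent_level):
    """Observed density minus a model of protein plus flat bulk solvent.

    Hypothetical real-space cartoon of an OMIT difference map.
    """
    calculated = protein + solvent_level * solvent_mask
    return observed - calculated


n = 20
protein = np.zeros(n)
protein[:8] = 1.0                      # region occupied by protein
ligand_site = np.zeros(n, dtype=bool)
ligand_site[10:13] = True              # where the weak ligand sits
solvent_level = 0.3                    # flat bulk-solvent density (assumed)

# "Observed" density: protein, a weak ligand, and solvent everywhere else.
true_solvent = (protein == 0) & ~ligand_site
observed = protein + 0.35 * ligand_site + solvent_level * true_solvent

# Conventional OMIT: with the ligand removed from the model, the solvent
# mask floods the ligand site.
conventional_mask = (protein == 0).astype(float)
# Polder-style OMIT: bulk solvent is kept out of the omitted region.
polder_mask = true_solvent.astype(float)

conventional = omit_difference(observed, protein, conventional_mask, solvent_level)
polder = omit_difference(observed, protein, polder_mask, solvent_level)
```

With these made-up numbers, the conventional difference at the ligand site is only 0.05 because the modelled solvent cancels most of the weak ligand density, while the polder-style difference keeps the full 0.35.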

It is nice to see that, despite its long history, crystallography continues to make practical and creative advances.
