Contributed by Jan Jensen
Xiao He and co-workers have spent the last few years developing a fragmentation methodology for computing NMR chemical shifts, and this methodology has now been automated and implemented in David Case's SHIFTS program.
In this method, an H-capped model system is created for each residue, while the rest of the protein is treated as point charges. Bulk solvent effects are represented by surface charges computed with a Poisson-Boltzmann calculation based on a point-charge representation of the protein, and explicit water molecules can also be added using the PLACEVENT program. Much of this has been automated in the new AFNMR module in the SHIFTS program:
We have attempted to automate the process as much as possible, so that default calculations require only a PDB file as input. The preliminary processing creates fragment input files for the Gaussian, ORCA, Q-Chem or deMon3k programs; analysis programs parse the quantum chemistry output files to create tables of computed shifts and to make comparisons with experimental data if it is available. Optional parameters control the level of calculation and basis set, and the type of explicit or implicit solvent model that is used.
Furthermore, Xiao He tells me that if both SHIFTS and MEAD are installed, AFNMR will generate the surface charges automatically (alternatively, they can be generated by Delphi or DIVCON in a non-automated fashion). While the PLACEVENT program is available in AMBER, it has not yet been fully integrated with AFNMR.
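To make the fragmentation idea described above concrete, here is a minimal Python sketch of the partitioning step only. It is not AFNMR code: the `Atom` class, `split_fragment`, and `all_fragments` are hypothetical names, the toy coordinates and charges are made up, and the sketch leaves out the hydrogen capping and the solvent surface charges entirely. It simply shows one quantum region per residue with every other atom reduced to a point charge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical atom record; not an AFNMR data structure."""
    name: str
    residue: int
    charge: float  # partial charge, used only when the atom is outside the fragment
    xyz: tuple


def split_fragment(atoms, core_residue):
    """Return (qm_atoms, point_charges) for one residue-centred fragment.

    Atoms of the core residue are kept explicitly; all other atoms are
    reduced to (xyz, charge) pairs. H-capping is deliberately omitted.
    """
    qm_atoms = [a for a in atoms if a.residue == core_residue]
    point_charges = [(a.xyz, a.charge) for a in atoms if a.residue != core_residue]
    return qm_atoms, point_charges


def all_fragments(atoms):
    """Build one fragment per residue, keyed by residue number."""
    residues = sorted({a.residue for a in atoms})
    return {r: split_fragment(atoms, r) for r in residues}


# Made-up toy input: three atoms spread over two residues.
toy = [
    Atom("N", 1, -0.4, (0.0, 0.0, 0.0)),
    Atom("CA", 1, 0.1, (1.5, 0.0, 0.0)),
    Atom("N", 2, -0.4, (3.8, 0.0, 0.0)),
]
fragments = all_fragments(toy)
```

Because each fragment depends only on the input structure, the resulting quantum chemistry jobs are independent of one another, which is what makes the per-residue timing quoted below meaningful.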
The authors report that each residue takes 1-3 hrs on a single 8-core node, so calculations on a 100-residue protein are feasible with relatively modest computational resources. With access to supercomputers, one could imagine computing chemical shifts for comparatively large proteins fairly routinely.
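As a back-of-the-envelope check on that claim, the short Python sketch below turns the reported 1-3 hrs per residue into total node-hours and a wall-clock estimate. The function names are hypothetical, and the estimate assumes that the per-residue jobs are independent and spread perfectly evenly over the available nodes, which is an idealization.

```python
def node_hours(n_residues, hours_per_residue=(1.0, 3.0)):
    """Return (low, high) total node-hours for one fragment job per residue."""
    low, high = hours_per_residue
    return n_residues * low, n_residues * high


def wall_clock_days(n_residues, n_nodes, hours_per_residue=(1.0, 3.0)):
    """Return (low, high) wall-clock days, assuming ideal load balancing."""
    low, high = node_hours(n_residues, hours_per_residue)
    return low / n_nodes / 24, high / n_nodes / 24


# A 100-residue protein costs 100-300 node-hours in total.
total_low, total_high = node_hours(100)

# On a hypothetical 10-node allocation that is roughly 0.4-1.25 days.
days_low, days_high = wall_clock_days(100, 10)
```

On a single node the same job would take roughly 4 to 12.5 days, which is why the fragment-level parallelism matters in practice.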
The RMSDs relative to conventional full-DFT calculations on very small proteins are 0.2, 0.8, and 1.2 ppm for H, C, and N atoms, respectively. However, the agreement with experimental values is considerably worse, and empirical predictors such as SHIFTX2 give lower RMSD values. This is a well-known "problem" and may, in part, reflect small errors in the protein structures and the neglect of conformational averaging, both of which have been "parameterized away" in the empirical methods. In fact, SHIFTX2 results in particular are rather insensitive to the quality of the structure (see Figure 4 in this paper), which is a plus if the prime objective is to get chemical shifts that agree well with experiment, but may be a minus if the information is to be used for protein structure validation and refinement. This obviously needs further study, and the AFNMR method, which promises to make QM-prediction of chemical shifts routine for proteins, is an important step in that direction.
I thank Kresten Lindorff-Larsen for first alerting me to this paper.
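For readers who want to reproduce this kind of comparison, the following Python sketch computes a per-element RMSD between two sets of chemical shifts. It is an illustration only: `rmsd_by_element` is a hypothetical helper, the shift values are made-up toy numbers rather than data from the paper, and the element is guessed from the first letter of the atom name, which is an assumption that holds for simple names such as HA, CA, and N.

```python
import math
from collections import defaultdict


def rmsd_by_element(computed, reference):
    """Per-element RMSD (ppm) over atoms present in both shift sets.

    Both arguments map (residue_number, atom_name) to a shift in ppm.
    The element is taken as the first letter of the atom name (assumption).
    """
    squared_errors = defaultdict(list)
    for key, shift in computed.items():
        if key not in reference:
            continue  # skip atoms without a reference value
        element = key[1][0]
        squared_errors[element].append((shift - reference[key]) ** 2)
    return {
        element: math.sqrt(sum(values) / len(values))
        for element, values in squared_errors.items()
    }


# Made-up toy shifts, purely to show the call.
computed = {(1, "HA"): 4.5, (2, "HA"): 4.1, (1, "CA"): 56.0}
reference = {(1, "HA"): 4.3, (2, "HA"): 4.3, (1, "CA"): 55.0, (3, "N"): 120.0}
rmsd = rmsd_by_element(computed, reference)
```

The same function works whether the reference set is a full-DFT calculation or experimental data, so it can be used for both comparisons discussed above.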