Protein-potentials from surface and buried triangles
(Snoeyink; Leaver-Fay, Tropsha)

Most methods for protein structure prediction produce a large set of decoy structures, which are then scored in an attempt to find the one nearest the native structure. An ideal scoring function would rank the native structure above all decoys. The fitness of a scoring function is often measured by its Z-score: the number of standard deviations by which the native structure's score lies above the average score of the entire decoy set. One scoring method that considers packing interactions is the Delaunay-based four-body statistical potential. In 3D, the Delaunay tessellation defines vertices, edges, triangles, and tetrahedra by a geometric neighbor criterion. Tropsha's lab at UNC represents each residue in a protein by a single point, usually the C_alpha atom or the side-chain centroid, computes the Delaunay tessellation of these points, and then characterizes each tetrahedron by its amino acid content and its primary-structure topology. They examine the statistical properties of tetrahedra in a large set of known structures and score each kind of tetrahedron by the deviation of its observed frequency from the frequency expected at random, effectively answering the question of which amino acids have a strong preference to be neighbors. Tropsha's lab seeks to extend this potential to capture more of the information in the training set by dividing tetrahedra into finer categories, but doing so would require a training set larger than the number of available structures with high resolution and low sequence identity. We investigated an alternative formulation of the Delaunay statistical potential based on triangles. Because three residues admit fewer amino acid combinations than four, the triangle formulation makes it possible to add a further geometric description of buriedness using the existing training set. This three-body potential, in tandem with the existing four-body potential, shows a 10% Z-score improvement in decoy discrimination.
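The two quantities described above, the Z-score and a frequency-based potential, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the text says only that each kind of simplex is scored by the deviation of its observed frequency from randomness, so the log-odds form, the multinomial expected frequency, the use of the population standard deviation, and the names `z_score` and `log_odds_potential` are all assumptions made here. The input is a list of residue-type tuples (triangles or tetrahedra) already extracted from a tessellation.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def z_score(native_score, decoy_scores):
    """Standard deviations by which the native score lies above the decoy mean.

    Assumption: higher scores are better and the population standard
    deviation of the decoy scores is used.
    """
    return (native_score - mean(decoy_scores)) / pstdev(decoy_scores)


def log_odds_potential(simplices):
    """Map each composition class to log10(observed / expected frequency).

    `simplices` is a list of tuples of residue types, e.g. ("A", "G", "L")
    for a triangle. Vertex order is ignored. The expected frequency assumes
    residues are drawn independently from their overall frequencies
    (an assumed null model).
    """
    observed = Counter(tuple(sorted(s)) for s in simplices)
    total = sum(observed.values())
    residue_counts = Counter(r for s in simplices for r in s)
    n_residues = sum(residue_counts.values())

    potential = {}
    for composition, count in observed.items():
        # Number of distinct orderings of this multiset of residue types.
        orderings = math.factorial(len(composition))
        for multiplicity in Counter(composition).values():
            orderings //= math.factorial(multiplicity)
        expected = orderings * math.prod(
            residue_counts[r] / n_residues for r in composition
        )
        potential[composition] = math.log10((count / total) / expected)
    return potential


if __name__ == "__main__":
    # Toy data: triangles labeled by one-letter residue codes (hypothetical).
    triangles = [("A", "A", "A"), ("G", "G", "G")]
    print(log_odds_potential(triangles))
    print(z_score(4.0, [0.0, 2.0]))
```

Under these assumptions, a structure's score would be the sum of the potential values over the simplices in its tessellation, and a composition class with a positive value occurs more often in the training set than the null model predicts.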