Almost-Delaunay Simplices in CGAL : Detailed Code Description / Developer's Manual Deepak Bandyopadhyay %%% Here is the description of the core Almost-Delaunay code for CGAL, along with %%% some MATLAB utilities that interface with it, that are described more fully in %%% file README.ADMATLAB.txt. %%% Parameters and calling ADCGAL: ADCGAL cutoff = maximum limit on perturbation (in the same units as the coordinates, i.e. Angstrom for proteins). This is the parameter epsilon in my definition of AD(epsilon). To try it out, a typical default value is 1.0 Angstrom, though values from 0.01 to 2.0 Angstrom are useful for capturing different ranges of perturbation in proteins. prune = maximum edge length allowed in a simplex. This gets rid of long edges on the convex hull and within interior pockets and cavities. A typical default value in proteins is 10 (Angstrom), though values from 6.0 to 15.0 Angstrom may be useful. "SHORT" edges, triangles and tetrahedra are those with all edges below the prune in length. file prefix = 1xyz if your 3D coordinates are stored as rows of three coordinates in a file "1xyz.out". This parameter can include a full path if the .out file lies elsewhere. If this parameter is missing, input comes from the standard input (hence you can use AD as an input pipe, though minor changes are required to use it as an output pipe). Output is written to files as follows: 1xyz.del: short Delaunay edges, written as rows of SORTED point index pairs starting at 0; example row "0 4" 1xyz.AD: short AD edges with threshold>0, written as rows of sorted edge index pairs and a threshold. Example row: 5 6 0.7773 The files below are generated only by the version of ADCGAL that calculates simplices, and not by the edge version (the symbol ADSIMP_CGAL has to be defined during compilation. 1xyz.del3: Delaunay triangles with short edges, rows of 3 sorted indices 1xyz.AD3: AD triangles, rows of 3 sorted indices and a threshold 1xyz.del4: Delaunay tetrahedra, 4 indices 1xyz.AD4: AD tetrahedra, 4 indices and a threshold %%% Code internals, description of Class hierarchy: The almost-Delaunay codebase can be roughly divided into four parts: * Core modules: - classes ADmainCGAL, ADedgeCGAL, ADsimpCGAL * Support modules: - classes ADPointNeighbors, ADconvex, MyDelaunay - global functions: ADfunc, MySTLExtensions - global funcTORs: ADutil * Library code: - CGAL-3.0 from www.cgal.org (I include patch files to replace static filters, intersection tests and more: see Installation manual on how to copy them). - Boost (graph, pairs, smart pointer in Linux, etc.): from www.boost.org * Open source third party classes integrated in the project: - auto_ref (fast smart pointer by Itay Maman, available at www.codeproject.com/cpp/auto_ref.asp) - sorted_vector (by Martin Holzherr, available at www.codeproject.com/vcpp/stl/sorted_vector.asp) Apart from this there is the useful MATLAB interface to this code, see README.ADMATLAB.txt. -------------------------------------------------------------------------------------------- Here is the current set of classes/files in the ADCGAL implementaion, along with the meaning and role of each module: ADmainCGAL.cpp : instantiates and runs the other classes (there could be one of these running per point set if we are processing many in parallel. Right now I would just run many instances of ADCGAL together) ADedgeCGAL.cpp : ADedge objects are created in a loop (once for each potential short AD edge), and they perform all computations for an AD edge, and store the data that needs to be used by ADsimpCGAL in an EdgeInfo object before being destroyed. Thus if we are doing parallel computation, several ADedges could be computed in parallel, but the output must be merged and only the valid edges (with threshold where cells have vertices 0..3, and neighbor cells 0..3 opposite to corresponding vertices. Edges are represented as Triple(Cell,index,index), and Faces as pair(Cell,index). Turns out that pruning the triangulation by deleting cells with long edges (which would lead to a pruned-Delaunay class that biologists working with CGAL could use) led to all kinds of weird crashes since algorithms for DT edge and face traversal were not written to handle random missing tetrahedra (or worse, disconnected components). So I changed the code to introduce a Boolean (proxy class for boolean) info in each cell, and I report each edge or facet traversed only if one of the cells it touches is still live. It won't work to test the cell included in the edge/facet representation, since the scheme used to avoid edge/facet repetitions during traversal, which compares cell pointer values, may end up reporting an edge or facet in a dead cell. There has to be a more efficient way to do this, but it was the most obvious extension of what I'd already written that could work, and it's only done once at the beginning. ADPointNeighbors.{cpp,h}. This class returns sets of points near one or two vertices, pairs of points near 2 vertices and each other, and in general manages proximity information. It uses adjacency list graphs (from Boost Graph Library) of short and non-Delaunay short edges that I create initially, and remove edges from these graphs if their thresolds are found to be over the cutoff. It could be modified to use kd-tree's spherical range query eventually, though I thought the graph representation would be simpler and allow persistent storage and removal of edges on demand. ADutil.h: Mostly when I try to think how something that I did in MATLAB should be done (fast) in C++, I come up with templated code, usually a function object (sometimes an algorithm) that I then place here, since templated code needs to be in the header. There's stuff for indexing a sequence with a range of indices to produce a subsequence, finding the minimum over a range of indices, modeling a constraint equation and what its test on points and segments, lifting, and verifying that a displaced point was within the computed hull (which actually taught me that cases occur where "center of bounding box" was not in the convex hull, so I replaced it by "centroid" which always is). ADconvex.{cpp,h}: This is a class, that computes the convex hull in two ways, depending on the value of the constant ADCGAL_DELAUNAY_HULL in ADcommon.h (1 for using Delaunay insertion as described in the CGAL paper, and 0 for using QuickHull as implemented by CGAL's Convex_hull_3. We could go back to this class once a good static-filtered version of Convex_hull_3 becomes available. MySTLExtensions.h: input/output operators and iterators for pair, if we know how to handle T (useful to read/write various edge lists). Various other small extensions. SortedTuples.h: implementations of sorted_pair, sorted_triple and sorted_quad that get sorted on creation. ADCGAL.h: Header file that defines all the main classes (ADmain, ADsimp, and ADedge) and the data types of packets they exchange with one another, in one place. ADcommon.h: Include file with short names for kernel and data types defined. ADdefs.h: Include file with many of the parameter settings to change the behavior of AD. Eventually, all of these should be out of this file and in the metadata (i.e. read in from a config file at runtime). Here are the switches in this file: ADCGAL_VERBOSE: Defines if overall the program should spew out trace messages as it runs (#defined 1), or not (0). This global definition can be overridden by #define statements in individual .cpp files, though that is a gruesome hack. BOOST_NO_INTRINSIC_WCHAR_T had to be defined to make the Boost Graph Library compile on Windows/MSVC7.0. Perhaps it is not needed now, but it does not hurt. AD_COMPARE_EPSILON, AD_ZERO and AD_INFINITY are appropriate small constants for fuzzy floating point comparisons and checks for zero and infinity to succeed. ADCGAL_DELAUNAY_HULL if 1 leads to the Delaunay insertion convex hull implementation that is stable but slower (needs Static_filters), and if 0 leads to the Convex_Hull_3 implementation that is faster (can work with Cartesian) but crashes a lot because of numerical issues. External Libraries used: CGAL (tested with v3.0 downloaded Oct 2003), Boost Graph Library (v.2.2 downloaded Nov 2003) External modules used: sorted_vector.hpp by Martin Holzherr, available from www.codeproject.com/vcpp/stl/sorted_vector.asp Used to implement sets in a fast way, as well as keep other integer lists and tuple lists sorted. There are two variants, I use the one that does not allow repeats. auto_ref.{cpp,hpp} by Itay Maman, available from www.codeproject.com/cpp/auto_ref.asp Used to keep smart pointers to large objects (eg. ADedge,EdgeInfo,ADsimp) within an STL container, rather than the object itself. Smart pointers would makes sure they get duplicated/destructed when needed by the internal container mechanisms, whereas raw pointers would lead to a memory leak. See "Effective STL" by Scott Meyers for more information on this.