The coreset framework was proposed by Agarwal, Har-Peled and Varadarajan in an effort to setup a unified approach for developing fast approximation algorithms for shape fitting. Roughly speaking, a coreset is a small subset that approximates the input in terms of a certain extent measure within a prescribed error. Many computational problems related to extent measures (e.g., computing minimum bounding box, smallnest enclosing ball or cylinder, fitting a sphere through a point set) can be solved very efficiently by first computing a coreset and then solving the problem on that coreset.

Computing coresets with respect to most extent measures can be reduced to the problem of computing the kernel of a point set, where an ε-kernel of a set S of points is a subset C of S so that for any slab (a region enclosed between two parallel hyperplanes) that covers C, an ε-expansion of the slab also covers S. We developed and implemented a simple algorithm for computing the kernels in low dimensions. We also developed simple algorithms for computing robust kernels, i.e., kernels that can handle a small number of outliers.

The general technique for using coresets to solve shape fitting problems is to compute the coreset first and then solve the problem on the coreset. We proposed and implemented a simpler incremental algorithm for shape fitting. Our algorithm is guaranteed to terminate in a small number of iterations that is independent on the input size in the worst case. In practice, it runs much faster than the standard coreset-based counterparts.

We also applied the coreset technique in the context of moving points where one wishes to maintain a geometric or topological structure on a set of moving points, e.g., minimum bounding box, smallest enclosing disk, diametral pair, Delaunay triangulation. Since the combinatorial description of the exact structure may change many times (e.g., diametral pair may change quadratic number of times even if the points moved with fixed velocity), we use coresets to maintain a close-approximation of the structure that is more robust and that undergoes much fewer changes.

available software

Computing ε-kernels of a point set in low dimensions

Incremental algorithm for shape fitting (box, cylinder, annulus)

Kinetic data structures using coresets (minimum bounding box, convex hull)

Robust coresets for handling outliers [coming soon]

download

Coreset software will soon be available for download from this site. Please check back or contact Hai Yu <fishhai@cs.duke.edu>.

PLATFORMS: SunOS, Linux.

notes

This software is built upon the ANN library by D. Mount and S. Arya. See README_ANN or http://www.cs.umd.edu/~mount/ANN for more information on ANN.

"Coreset", "cylinder", "kds", and "generator" were developed by Hai Yu; and "annulus", "box" were developed by Raghunath Poreddy. Other directories were inherited from the ANN library. The source codes are in preliminary form, and more polished and well-commented versions will be released soon. Stay tuned!

warnings

(1) "Coreset" works in any dimension, though it becomes impatiently slow in dimensions larger than 6. "cylinder" and "box" only work in 3D. "annulus" and "kinetic convex hull" only work in 2D.

(2) These implementations are in very preliminary form (primarily for generating experimental results for the paper :). Developer Hai Yu will clean them up and polish some part of the implementations in the future, and hopefully version 2.0 will be out someday. Meanwhile, if you encounter serious problems or need urgent help with the code, please do not hesitate to contact Hai by email <fishhai@cs.duke.edu>.

(3) In directory "box", gdiam_testmid.C and gdiam.C cannot be compiled using gcc with versions greater than 2.96.

(4) Sometimes the "-O3" optimization option causes strange phenomenons. For example, if you compare two almost equal double type reals a and b with "a < b", "a == b", and "a > b", you may all get "true". This issue becomes serious in kinetic data structures, where you need to schedule bunches of events in a priority queue. I don't know why this would happen or how to handle this, except for recommending not using this option for compilation if you encounter such strange things.

publications

(1) P.K. Agarwal, S. Har-Peled, and K.R. Varadarajan. Approximating Extent Measures of Points. J. ACM, vol. 51, pages 606-635, 2004.

(2) P.K. Agarwal, S. Har-Peled, and K. Varadarajan. Geometric approximation via coresets. In Current Trends in Combinatorial and Computational Geometry (E. Welzl, ed.), Cambridge University Press, 2005, in press.

(3) H. Yu, P.K. Agarwal, R. Poreddy, and K.R. Varadarajan. Practical Methods for Shape Fitting and Kinetic Data Structures using Coresets. In Proc. 20th Annu. ACM Sympos. Comput. Geom., pages 263-272, 2004.

newsletter article

Coreset software was featured in BioGeometry News, July 2005 issue (pdf)

developer

Hai Yu <fishhai@cs.duke.edu>