ANN: A Library for
Approximate Nearest Neighbor Searching
David M. Mount and Sunil Arya
Version 1.1.2
Release Date: Jan 27, 2010

What is ANN?
ANN is a library written in C , which supports data structures and algorithms for both exact and approximate nearest neighbor searching in arbitrarily high dimensions.

In the nearest neighbor problem a set of data points in d-dimensional space is given. These points are preprocessed into a data structure, so that given any query point q, the nearest or generally k nearest points of P to q can be reported efficiently. The distance between two points can be defined in many ways. ANN assumes that distances are measured using any class of distance functions called Minkowski metrics. These include the well known Euclidean distance, Manhattan distance, and max distance.

Based on our own experience, ANN performs quite efficiently for point sets ranging in size from thousands to hundreds of thousands, and in dimensions as high as 20. (For applications in significantly higher dimensions, the results are rather spotty, but you might try it anyway.)

The library implements a number of different data structures, based on kd-trees and box-decomposition trees, and employs a couple of different search strategies.

The library also comes with test programs for measuring the quality of performance of ANN on any particular data sets, as well as programs for visualizing the structure of the geometric data structures.

Why approximate nearest neighbors?
Computing exact nearest neighbors in dimensions much higher than 8 seems to be a very difficult task. Few methods seem to be significantly better than a brute-force computation of all distances. However, it has been shown that by computing nearest neighbors approximately, it is possible to achieve significantly faster running times (on the order of 10's to 100's) often with a relatively small actual errors. ANN allows the user to specify a maximum approximation error bound, thus allowing the user to control the tradeoff between accuracy and running time.

Conditions of Use
As of Version 1.0, ANN is distributed under the terms of the GNU Lesser Public License. Please also check out the Copyright Notice and License Terms.

The University of Maryland and the authors make no representations about the suitability or fitness of this software for any purpose. It is provided "as is" without express or implied warranty.

System Requirements
Compiling ANN requires an ANSI C compiler. It has been successfully compiled and run on a number of platforms, including Sun workstations running SunOS 5.x (Solaris) and Linux 2.x platforms using the g compiler, and under Microsoft Windows using VisualStudio 2005 (Version 8.0) and Visual C 2005.

How can I get ANN?
The latest version of ANN can be downloaded here. The changes in version 1.1.2 are very minor, and primarily involve fixing compilation errors with modern C compilers. See ReadMe.txt for more information.

Older versions of ANN are also available, but not supported.

Questions/Comments?
If you have questions or comments, please email them to Dave Mount: [email protected].
Acknowledgments
We would like to thank the numerous user's of ANN who have made contributions in the form of suggestions, makefile entries, and finding bugs (and waited so patiently for this release). We would also like to acknowledge the support of the NSF under grants CCR-9712379 and CCR-0098151.


Back to Dave Mount's home page.

Last updated on Jan 28, 2010.