| 1 |
clustr - construct polygons from tagged points |
|---|
| 2 |
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- |
|---|
| 3 |
written by Schuyler Erle <schuyler@nocat.net> |
|---|
| 4 |
(c) 2007-2008 Yahoo! Inc. |
|---|
| 5 |
|
|---|
| 6 |
Overview |
|---|
| 7 |
-------- |
|---|
| 8 |
Clustr takes a text file containing longitude/latitude points, tagged with |
|---|
| 9 |
a bit of text, and attempts to generate minimal polygons that "cover" those |
|---|
| 10 |
points, using an "alpha" parameter to determine the notion of "coverage". |
|---|
| 11 |
The polygons are written to an ESRI Shapefile, suitable for use in GIS |
|---|
| 12 |
software. |
|---|
| 13 |
|
|---|
| 14 |
Building clustr |
|---|
| 15 |
--------------- |
|---|
| 16 |
You will need to have the GDAL library (http://www.gdal.org/) installed, |
|---|
| 17 |
with OGR support enabled, for writing the Shapefiles. You will also need |
|---|
| 18 |
the CGAL library (http://www.cgal.org/) installed. |
|---|
| 19 |
|
|---|
| 20 |
Depending on your system configuration, you may need to edit the Makefile |
|---|
| 21 |
to point to the location of the GDAL and CGAL libraries and C/C++ headers. |
|---|
| 22 |
|
|---|
| 23 |
To build clustr, simply run 'make'. The 'clustr' executable is |
|---|
| 24 |
self-contained (aside from shared libraries) and does not rely on any |
|---|
| 25 |
supporting files. The code has been tested with GCC 3.4 and 4.2. |
|---|
| 26 |
|
|---|
| 27 |
Using clustr |
|---|
| 28 |
------------ |
|---|
| 29 |
|
|---|
| 30 |
$ clustr [-a <n>] [-p] [-v] <input> <output> |
|---|
| 31 |
|
|---|
| 32 |
-h, -? this help message |
|---|
| 33 |
-v be verbose (default: off) |
|---|
| 34 |
-a <n> set alpha value (default: use optimal value) |
|---|
| 35 |
-p output points to shapefile, instead of polygons |
|---|
| 36 |
|
|---|
| 37 |
If <input> is missing or given as "-", stdin is used. |
|---|
| 38 |
|
|---|
| 39 |
* The input file should be formatted as: <tag> <lon> <lat>\n |
|---|
| 40 |
|
|---|
| 41 |
* The input file *must* be sorted. Piping it through sort(1) will |
|---|
| 42 |
suffice. |
|---|
| 43 |
|
|---|
| 44 |
* Tags must not contain spaces. |
|---|
| 45 |
|
|---|
| 46 |
If <output> is missing, output is written to "clustr.shp". The output is |
|---|
| 47 |
always an ESRI Shapefile. |
|---|
| 48 |
|
|---|
| 49 |
The -p option simply converts the input file to a Shapefile containing the |
|---|
| 50 |
input points. This can be useful for comparing the polygon output of clustr |
|---|
| 51 |
to its input dataset in a GIS browser. |
|---|
| 52 |
|
|---|
| 53 |
The -a option sets the value of "alpha", which in turn determines the |
|---|
| 54 |
convexity of the output polygons. Larger values make more convex polygons, |
|---|
| 55 |
smaller values make less convex polygons. Smaller values will also filter |
|---|
| 56 |
out outlying points, which may be what you want. |
|---|
| 57 |
|
|---|
| 58 |
Omitting -a or setting it to zero will cause clustr to use the CGAL |
|---|
| 59 |
library's idea of "optimal alpha", which, if there are significant outliers |
|---|
| 60 |
in the input file, may not actually be what you want. In general, we |
|---|
| 61 |
recommend starting with a value of 0.001. |
|---|
| 62 |
|
|---|
| 63 |
Bugs |
|---|
| 64 |
---- |
|---|
| 65 |
Clustr does not use an exceptionally sophisticated algorithm for extracting |
|---|
| 66 |
polygons from the generated alpha shape, with the result that on occasion |
|---|
| 67 |
improper or self-intersecting polygons get generated. This can confuse GIS |
|---|
| 68 |
programs that expect better behaved data sets, and ought to be fixed someday. |
|---|
| 69 |
|
|---|
| 70 |
Also, the total count of points input is written to every shape record, which |
|---|
| 71 |
will be wrong if there are multiple components for a given input tag. This |
|---|
| 72 |
could be fixed by testing inclusion of each point after the polygons are |
|---|
| 73 |
constructed, which would lengthen runtime. Cost/benefit? Not sure. |
|---|
| 74 |
|
|---|
| 75 |
License |
|---|
| 76 |
------- |
|---|
| 77 |
|
|---|
| 78 |
Copyright (c) 2007-2008 Yahoo! Inc. |
|---|
| 79 |
|
|---|
| 80 |
All rights reserved. This code is free software; you can redistribute it |
|---|
| 81 |
and/or modify it under the terms of the GNU General Public License (GPL), |
|---|
| 82 |
version 2 only. This code is distributed WITHOUT ANY WARRANTY, whether |
|---|
| 83 |
express or implied. See the GNU GPL for more details |
|---|
| 84 |
(http://www.gnu.org/licenses/gpl.html) |
|---|
| 85 |
|
|---|
| 86 |
AS A SPECIAL EXCEPTION, YOU HAVE PERMISSION TO LINK THIS PROGRAM WITH THE |
|---|
| 87 |
CGAL LIBRARY AND DISTRIBUTE EXECUTABLES, AS LONG AS YOU FOLLOW THE REQUIREMENTS |
|---|
| 88 |
OF THE GNU GPL VERSION 2 IN REGARD TO ALL OF THE SOFTWARE IN THE EXECUTABLE |
|---|
| 89 |
ASIDE FROM CGAL. |
|---|
| 90 |
|
|---|
| 91 |
- Fin - |
|---|