root/trunk/clustr/README

Revision 617, 3.5 kB (checked in by aaron, 8 months ago)

update to 0.21: patch from mattb to make multi-tags files work

Line 
1 clustr - construct polygons from tagged points
2 -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
3 written by Schuyler Erle <schuyler@nocat.net>
4 (c) 2007-2009 Yahoo! Inc.
5
6 Overview
7 --------
8 Clustr takes a text file containing longitude/latitude points, tagged with
9 a bit of text, and attempts to generate minimal polygons that "cover" those
10 points, using an "alpha" parameter to determine the notion of "coverage".
11 The polygons are written to an ESRI Shapefile, suitable for use in GIS
12 software.
13
14 Building clustr
15 ---------------
16 You will need to have the GDAL library (http://www.gdal.org/) installed,
17 with OGR support enabled, for writing the Shapefiles. You will also need
18 the CGAL library (http://www.cgal.org/) installed.
19
20 Depending on your system configuration, you may need to edit the Makefile
21 to point to the location of the GDAL and CGAL libraries and C/C++ headers.
22
23 To build clustr, simply run 'make'. The 'clustr' executable is
24 self-contained (aside from shared libraries) and does not rely on any
25 supporting files. The code has been tested with GCC 3.4 and 4.2.
26
27 Using clustr
28 ------------
29
30 $ clustr [-a <n>] [-p] [-v] <input> <output>
31
32    -h, -?      this help message
33    -v          be verbose (default: off)
34    -a <n>      set alpha value (default: use optimal value)
35    -p          output points to shapefile, instead of polygons
36
37 If <input> is missing or given as "-", stdin is used.
38
39   * The input file should be formatted as: <tag> <lon> <lat>\n
40
41   * The input file *must* be sorted. Piping it through sort(1) will
42     suffice.
43
44   * Tags must not contain spaces.
45
46 If <output> is missing, output is written to "clustr.shp". The output is
47 always an ESRI Shapefile.
48
49 The -p option simply converts the input file to a Shapefile containing the
50 input points. This can be useful for comparing the polygon output of clustr
51 to its input dataset in a GIS browser.
52
53 The -a option sets the value of "alpha", which in turn determines the
54 convexity of the output polygons. Larger values make more convex polygons,
55 smaller values make less convex polygons. Smaller values will also filter
56 out outlying points, which may be what you want.
57
58 Omitting -a or setting it to zero will cause clustr to use the CGAL
59 library's idea of "optimal alpha", which, if there are significant outliers
60 in the input file, may not actually be what you want. In general, we
61 recommend starting with a value of 0.001.
62
63 Bugs
64 ----
65 Clustr does not use an exceptionally sophisticated algorithm for extracting
66 polygons from the generated alpha shape, with the result that on occasion
67 improper or self-intersecting polygons get generated. This can confuse GIS
68 programs that expect better behaved data sets, and ought to be fixed someday.
69
70 Also, the total count of points input is written to every shape record, which
71 will be wrong if there are multiple components for a given input tag. This
72 could be fixed by testing inclusion of each point after the polygons are
73 constructed, which would lengthen runtime. Cost/benefit? Not sure.
74
75 License
76 -------
77
78 Copyright (c) 2007-2009 Yahoo! Inc.
79
80 All rights reserved.  This code is free software; you can redistribute it
81 and/or modify it under the terms of the GNU General Public License (GPL),
82 version 2 only.  This code is distributed WITHOUT ANY WARRANTY, whether
83 express or implied. See the GNU GPL for more details
84 (http://www.gnu.org/licenses/gpl.html)
85
86 AS A SPECIAL EXCEPTION, YOU HAVE PERMISSION TO LINK THIS PROGRAM WITH THE
87 CGAL LIBRARY AND DISTRIBUTE EXECUTABLES, AS LONG AS YOU FOLLOW THE REQUIREMENTS
88 OF THE GNU GPL VERSION 2 IN REGARD TO ALL OF THE SOFTWARE IN THE EXECUTABLE
89 ASIDE FROM CGAL.
90
91 - Fin -
Note: See TracBrowser for help on using the browser.