Introduction

A polygonal chain is a series of points connected by line segments. They are commonly used to represent cartographic features (e.g. coastlines, rivers, roads) in the context of GIS (Geospatial Information System), vectorization of digital images in applications like computer vision, or in any application that uses vector graphics.

It is often infeasible to maintain the original complexity of a polygonal chain, either because of size and/or computational constraints. Algorithms to simplify polygonal chains are therefore needed to reduce the complexity of the original chain by reducing the number of points while minimizing the variance from the initial structure.

The Ramer-Douglas-Peuker algorithm is commonly used for this task.

Also known as the iterative end-point fit algorithm, it works by recursively processing the line segment between endpoints, dividing the segment until all original points are with the error distance from the approximating segments. The expected running time of the algorithm is Θ(n log n), with a worst case of O(n2).

While generally fast, the algorithm doesn't minimize the number of the segments in the approximation.

Here, we examine an algorithm introduced by Iri and Imai that addresses the problem of minimizing line segments. While the original method has a time complexity of O(n3), we will also present an extension to the algorithm given by Eu and Toussaint in their paper, On Approximating Polygonal Curves in Two and Three Dimensions that improves the time complexity to O(n2), perhaps making it a practical alternative to the Ramer-Douglas-Peuker algorithm for certain applications.

We will present salient concepts associated with the algorithm, as well as an interactive demonstration.

An aside about min-# vs. min-ε

There are two general formulations of this problem.
  • min-#: a threshold for variance (or error), ε, is specified, while the number of line segments is minimized
  • min-ε: the number of line segments is specified, while error, ε, is minimized

The Iri-Imai Algorithm for min-#

The Iri-Imai Algorithm for minimizing the number of line segments in an approximation, given error ε, works by first constructing a DAG (Directed Acyclic Graph), G, that contains all possible edges between vertices in the input polygonal chain. (n2 edges).

The weight of each edge is set to the number of vertices from the original chain that would be skipped by following the edge.

For each edge, the distance between the edge and each vertex it skips is compared with ε. If all of the skipped vertices are within distance ε of the edge, the edge is kept as a potential approximating edge. Otherwise, the edge is discarded.

On this reduced set of edges, find the maximum weight path between the start and end points. The path is the simplified version of the input chain. This algorithm has a time complexity of O(n3).

The illustration below steps through the process, but simplifies the path finding step by using a breadth-first search to find the minimum length path.

This visualization illustrates the main steps involved in the Iri-Imai algorithm, at a high level. Use the controls below to step through.

The "Parallel Strip" criterion

Eu and Toussaint improve the time complexity of Iri and Imai's algorithm for the min-# problem by using the "Parallel Strip" criterion for ε. By using this relaxed definition for measuring error, the directed graph described in Iri and Imai's algorithm can be constructed in an asymptotically faster way.

The idea of the parallel strip is to buffer the line containing a potential approximating edge by ε. If all of the points that would be approximated by the edge lie within the parallel strip, then the edge is kept. Toussaint's first paper utilizing the parallel strip used this concept in an algorithm that solves the min-# problem in O(n2 log n) time.

In the current paper, the authors show that the same criterion is met by buffering the vertices (creating error discs) that would be skipped by an approximating edge. Using this new formulation, they find a new way to construct G, yielding an overall time complexity of O(n2) for the algorithm.

Interacting with the visualization below, it becomes clear that the error disc and parallel strip criteria are equivalent in terms of determining distance between a vertex and a potential approximating edge. Note that as a disc touches the approximating segment, the parallel strip touches the vertex associated with that disc.

Vary ε to see the interaction between a disc and the parallel strip.

Using Discs and Double Wedges to Construct G

The new formulation of the parallel strip criterion, as 'discs' of radius ε centered on the vertices of the polygonal chain, allows a way to construct G that visits each vertex, rather than each of n2 edges.

Each 'double wedge' is defined by a vertex, Vi, and the error disc of one of the following vertices, Vi + j. Each is constructed by finding the 2 lines that go through Vi that are tangent to the error disc. The double wedge is the region inscribed by those 2 lines.

The algorithm to create G, given a set of n vertices:

For every vertex, Vi to Vn - 1:
  1. Initialize a variable that will maintain the intersection of all generated wedges (IW). Starting value is the 'open cone' (the plane).
  2. Iterate over the rest of the vertices in the polygonal chain. (j=i + 1 to n)
    1. If Vj is inside the intersection of wedges (IW), then add the edge between Vi and Vj to G. Add an edge weight equal to the number of 'skipped' vertices (j - i - 1). Notice that this will always include the original edges of the polygonal chain. If the vertex is not inside, break the inner loop and return to step 1 for Vi + 1.
    2. Update the value of IW to the intersection of the current IW with the wedge defined by Vi and the error disc of Vj. If this intersection results in a null region, break the inner loop and return to step 1 for Vi + 1
This visualization illustrates how error discs and double cones can be used to construct G.

The Simplified Polygonal Chain

Once G has been constructed as above, Toussaint and Eu use forward dynamic programming to find the maximum weight path in O(n2) time.

Here, as above, we instead use a breadth-first search to find the minimum length path.

To open an interactive demo of the algorithm, where you can try out your own points and error values, click the button below:

The Closed Curve case

If the polygonal chain is closed (V1 connects to Vn), then you must apply the process n times, using each vertex as a starting point, in order to get a global min-# value. Time complexity O(n3).

Resources

Link to the repository with the code for this demo.
There are some websites that describe other algorithms for polygonal chain simplification.

Ramer-Douglas-Peucker interactive visualization:

Mike Bostock's Visvalingham-Whyatt interative visualization:

And a nice extension that provides non-crossing output:

There a great many papers that discuss the topic. Here are some that I found interesting:

Hiroshi IMAI, Masao IRI, Polygonal Approximations of a Curve — Formulations and Algorithms, Editor(s): Godfried T. TOUSSAINT, Machine Intelligence and Pattern Recognition, North-Holland, Volume 6, 1988, Pages 71-86, ISSN 0923-0459, ISBN 9780444704672,

https://doi.org/10.1016/B978-0-444-70467-2.50011-4

D. Eu, G.T. Toussaint, On Approximating Polygonal Curves in Two and Three Dimensions, CVGIP: Graphical Models and Image Processing, Volume 56, Issue 3, 1994, Pages 231-246, ISSN 1049-9652.

https://doi.org/10.1006/cgip.1994.1021

This is the paper that introduces the algorithm explored on this page.

Chan W.S., Chin F. (1992) Approximation of polygonal curves with minimum number of line segments. In: Ibaraki T., Inagaki Y., Iwama K., Nishizeki T., Yamashita M. (eds) Algorithms and Computation. ISAAC 1992. Lecture Notes in Computer Science, vol 650. Springer, Berlin, Heidelberg

https://doi.org/10.1007/3-540-56279-6_90

This paper describes another way to improve the time complexity of the Iri-Imai approach to O(n2)