# Project #2: Convex Hull

## Overview

In this project, you will implement a divide and conquer algorithm for finding the convex hull of a set of points and you will analyze the algorithm both theoretically and empirically.

## Objectives

• Implement a divide and conquer algorithm
• Solve a non-trivial computational geometry problem
• Mature as problem-solvers
• Understand the relationship between theoretical and empirical analysis

## Background

The convex hull of a set $Q$ of points is the smallest convex polygon $P$ for which each point in $Q$ is either on the boundary of $P$ or in its interior.

To be rigorous, a polygon is a piecewise-linear, closed curve in the plane. That is, it is a curve, ending on itself that is formed by a sequence of straight-line segments, called the sides of the polygon. A point joining two consecutive sides is called a vertex of the polygon. If the polygon is simple, as we shall generally assume, it does not cross itself. The set of points in the plane enclosed by a simple polygon forms the interior of the polygon, the set of points on the polygon itself forms its boundary, and the set of points surrounding the polygon forms its exterior. A simple polygon is convex if, given any two points on its boundary or in its interior, all points on the line segment drawn between them are contained in the polygon’s boundary or interior.

We denote the convex hull of $Q$ by $CH(Q)$. Intuitively, we can think of each point in $Q$ as being a nail sticking out from a board. The convex hull is then the shape formed by a tight rubber band that surrounds all the nails. The following figure shows a set of points $Q = \{p_0, p_1, \ldots, p_{12}\}$ with its convex hull $CH(Q)$:

## Applications

More generally beyond two dimensions, the convex hull for a set of points $Q$ in a real space $V$ is the minimal convex set containing $Q$.

Algorithms for some other computational geometry problems start by computing a convex hull. Consider, for example, the two-dimensional farthest-pair problem: we are given a set of n points in the plane and wish to find the two points whose distance from each other is maximum. This pair is also referred to as the diameter of the set of points. You can prove that these two points must be vertices of the convex hull.

The problem of finding convex hulls also finds its practical applications in pattern recognition, image processing, statistics, and GIS.

## Divide and Conquer

The high-level requirement for this project is to implement a divide and conquer algorithm for finding the convex hull of a set of points. If that is sufficient and you want to take it from here without any further guidance, you may skip the rest of this section and pick up at “Geometry for Common Tangents” (or even thereafter). This instructions document does not constitute a specification for a solution and, therefore, does not spell out a full-fledged solution. You will need to do some problem solving of your own. If you would like some help, read on.

### Possible Approach #1

In one possible divide-and-conquer method for finding the convex hull,

1. The set of $n$ points is divided into two subsets, $L$ containing the leftmost points and $R$ containing the rightmost points.
• To simplify matters, all points with the same $x$ coordinate should belong to the same subset, even if this makes the division not exactly into halves.
• Note: the point generator included with the project will sometimes generate points with duplicate $x$ coordinates (and duplicate $y$ coordinates).
2. The convex hulls of the subsets $L$ and $R$ are computed recursively
• Representation: How should you represent a convex hull? It is recommended that you simply represent a hull as an ordered list of points.
3. The algorithm requires solutions for the base-cases (i.e., the leaves of your recursion).
• How small should the base-cases be? The base-cases can be a small number of points (possibly one, two, or three points). You can easily define the hulls for the base cases to be #ordered lists that proceed in clockwise order.
4. When you return from two recursive calls with a left hull and a right hull, your job is to merge the hulls.
• To merge the left hull $CH(L)$ and the right hull $CH(R)$ it is necessary to find the two edges known as the upper and lower common tangents (shown in red below). A common tangent #of two simple convex polygons is a line segment in the exterior of both polygons intersecting each polygon at a single vertex. If continued infinitely in either direction, the #common tangent would not intersect the interior of either polygon.
• The upper common tangent can be found by scanning upward along the left hull and along the right hull. Some guidance with regard to finding the common tangents is given below; #although you will need to work out some additional details.
• Can you merge the two hulls in such a way that you construct a new merged hull that is also in clockwise order? What direction should the upper common tangent be in order to #maintain clockwise order? What about the lower common tangent?
• You can trace around the two sub-hulls and the common tangents to produce the new merged hull. The right edges belonging to the left subset and the left edges belonging to the #right subset will simply fall away.
• The bottom-line: you don’t need to sort the points in your hull to be in clockwise order. Just maintain a clockwise order as you merge smaller hulls into larger hulls.
• Note that you decide whether you want clockwise *OR* counter-clockwise order. You decide once and stick with your decision. It’s like establishing a country in the age of the #automobile and deciding whether your citizens will drive on the left-hand side of the road or the right.
5. In the example shown above, the final hull appears as follows:

### Possible Approach #2

An entirely other divide and conquer approach is called QuickHull. This approach is certainly also a possible way to attack the problem that would be valid for this project.

## Geometry for Common Tangents

To shed some light on the requisite geometry for testing whether a line segment joining two hulls is a common tangent, consider the following sketch of an approach to identifying an upper common tangent. This approach can also be applied in a symmetric fashion to identify a lower common tangent.

### A Test for determining on which side of a line a point can be found

Question: how do we know that $pq$ is not an upper common tangent? How to answer:

• Assume $p$ is a point with coordinates $(p.x, p.y)$
• Assume $q$ is a point with coordinates $(q.x, q.y)$
• Then slope of line connecting $p$ and $q$ is $m=rise/run=(p.y-q.y)/(p.x-q.x)$
• Then point-slope form of that line is $y-p.y = m \cdot (x - p.x)$; use the real values of $(p.x, p.y)$ and the slope $m$ found above
• Then solve for the variable $y$ to convert the line to slope-intercept form: $y = m \cdot x + b$; consequently, $b = -m \cdot p.x + p.y$
• Now define $test(x,y) = m\cdot x + b - y$
• You can apply this test to any point, such as $r = (r.x, r.y)$
• $test(r) = m\cdot r.x + b - r.y$
• if $r$ is above the line $pq$, then this test will be negative. Why?
• Likewise, if $r$ is below the line $pq$, then this test will be positive
• There should be no points above $pq$, but $r$ is above the line, as seen in the figure and according to this test
• Thus, $pq$ is not an upper common tangent

You can apply this test to any point in either the left convex hull $L$ or the right convex hull $R$ as needed.

### A Better Test

Floating-point precision can be an issue. When computing the slope (“rise” over “run”), the denominator (the “run”) can be very small for a near-vertical line, which may lead to errors in your algorithm. To avoid this situation, we need to refine our test function by avoiding division altogether:

• Substitute the definitions of $m$ and $b$ as follows: $test(r) = \frac{p.y-q.y}{p.x-q.x} \cdot r.x - \frac{p.y-q.y}{p.x-q.x} \cdot p.x + (p.y - r.y)$
• Factor out the slope ratio: $test(r) = \frac{p.y-q.y}{p.x-q.x} \cdot (r.x-p.x) + (p.y - r.y)$
• Multiply both sides by the denominator: $test(r) \cdot (p.x-q.x) = (p.y-q.y) \cdot (r.x-p.x) + (p.y-r.y) \cdot (p.x-q.x)$
• Define a new test modified by the sign of $(p.x-q.x)$ as follows: $test2(r) = \left[(p.y-q.y) \cdot (r.x-p.x) + (p.y-r.y) \cdot (p.x-q.x)\right] \cdot sign$
• where $sign$ could be computed using the C-style ternary conditional expression: $sign = \{(p.x-q.x) \ge 0$ ? $1 : -1\}$

### Searching for the Common Tangent

In the search for an upper common tangent, consider making your first hypothesis near the top moving up, as necessary, by walking up a point at a time, alternating on either side. In the illustrated example, the point labeled $r$ is next on the right hand side, if your hull is an ordered list, oriented in clockwise order.

Also, design your tangent finding procedure in such a way that it avoids considering possible tangent lines that are nearly vertical. In order to do so, make sure that your initial hypothesis about a common tangent starts at a place that would likely be roughly horizontal: e.g., top point on the left connected to top point on the right. A bad starting place would be: bottom point on the left connected to top point on the right (or vice versa).

When you are checking a point to detect whether it is above or below the line, be sure not to check the points used to calculate the line: such a point is on the line, not above or below the line.

### Cautions

The .NET pictureBoxView orients the x-axis in the usual way (from left to right), but the y-axis is oriented from top to bottom. In other words, the origin for the space is in the upper left corner. This will reverse how you think about slope. In particular, differences between y-coordinates are counter-intuitive: if $y_1 – y_2$ is negative, then $y_1$ is higher on the canvas. If it is positive, then $y_1$ is lower on the canvas. Make sure you are oriented properly.

You may need to use doubles instead of floats to represent points in order to give greater precision when points are nearby one another.

## Provided Framework

We provide a framework in C# to get you started and allow you to focus on the primary tasks of the project. In the framework you will find:

• A Graphical User Interface that generates a specified number of random points.
• A hook: the event handler buttonSolve_Click() for the “solve” button calls the FormMain.solveAction() method, which calls the method ConvexHullSolver.Solve() that you need to implement.

Be sure that you are using the latest .NET framework: Bring up the properties of your project (can use the key-chord Alt+F7). For the property “Target framework:”, specify .NET Framework 4 (or higher).

## Required: Whiteboard Experience (TM)

You are developing as a problem solver. You are also learning to solve a problem before you start writing code. In this project, we require you to step up to a whiteboard after reading these instructions before writing any code. Use the whiteboard to walk through the solution with a friend. Your friend could be another CS major, a CS 312 classmate, or another technical person with an eye for detail.

Make sure you sketch out and understand how to represent a convex hull. What will you do to divide, conquer, and merge? Simulate your merge algorithm with your marker. Think out loud and with sketches for simple examples.

In short, make sure you understand what you are going to do before you write a stitch of code. As you will note below, including a photo of your “whiteboard experience” is a required element of your project report.

Have a “whiteboard experience” and document it with a photograph of you next to the whiteboard sketches, etc.

Prepare a whiteboard experience report as a .pdf file. It should consist of one page with the following elements:

• [5 points] a caption (2-3 sentences) explaining why you are ready to proceed to write code
• the names of the people in your audience

Submit the whiteboard experience report online by the whiteboard experience deadline for this project.

## To Do

1. Have a “whiteboard experience” as directed above, and submit it online by the deadline for the whiteboard experience.
2. Write the full, unambiguous pseudo-code for your divide-and-conquer algorithm for finding the convex hull of a set of points $Q$.
• Pseudo-code is sufficient when it captures every block of code and each step is no more than a constant number of elementary operations.
• Be sure to label all of the procedures / functions belonging to your algorithm.
• Also, analyze and annotate each block of code with a Big-O (or Big-Theta) bound on its worst-case running-time.
• Document your algorithm with legible comments in important places in the pseudo-code that make it obvious to the reader what your solution is doing. This documentation also provides evidence of your comprehension. With the aid of adequate documentation, the correctness of your approach should be easy for the reader (e.g., TA) to verify. If the TA cannot easily determine the code's correctness, then fewer points will be awarded.
3. Analyze the whole algorithm for its worst-case running-time. State a Big-O (or Big-Theta if possible) asymptotic bound. Because your implementation is divide-and-conquer, you can employ the master theorem. If you'd rather use the method of solving recurrence relations, that will also satisfy the requirement.
4. Implement your divide and conquer algorithm in C# in the following method: ConvexHullSolver.Solve(Graphics g, List pointList)
• Use the divide and conquer algorithm from step #1 to find the convex hull of the points in pointList.
• You may use the Graphics method Graphics.DrawLine() to draw the line segments of the convex hull on the UI once you have identified them. You may also use Graphics.DrawPolygon()
• Call pictureBoxView.Refresh() at the end of the FormMain.solveAction() method after the timer stops; doing so facilitates time measurements that reflect only the algorithm without drawing times.
• Optional: if you wish to animate the whole process, you can refresh more often. Doing so will require the following changes:
• Add an argument of type System.Windows.Forms.PictureBox to ConvexHullSolver.Solve() to allow you to pass the FormMain.pictureBoxView object as an extra argument from FormMain.solveAction().
• Note that System.Windows.Forms.PictureBox inherits from System.Windows.Forms.Control, where the Refresh() method is defined.
• Then you can add calls to pictureBoxView.Refresh() as needed inside ConvexHullSolver.Solve() to update the display. Drawing will adversely affect your running time, so condition these refresh calls on some boolean flag (possibly triggered from a new control in the GUI).
• Document your algorithm with legible comments in important places in the code that make it obvious to the reader what your solution is doing. This documentation also provides evidence of your comprehension. With the aid of adequate documentation, the correctness of your approach should be easy for the reader (e.g., TA) to verify. If the TA cannot easily determine the code's correctness, then fewer points will be awarded.
5. Conduct an empirical analysis of your algorithm by running several experiments as follows:
• For each value $n$ in {10, 100, 1000, 10,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000}
• Generate 10 sets of $n$ points $(x, y)$ in the plane. You may use either provided point generator: the 2-D Gaussian (Normal) distribution or the uniform distribution (choose one and stick to it). For every point, $x$ and $y$ are real numbers (doubles). You may use the provided project scaffold code to assist you.
• For each point set,
• find the convex hull
• record the elapsed time
• It is highly recommended that you at least partially automate your experiments by adding a “10X” button, the handler for which runs the solver 10 times, stores the 10 results, and displays a report.
• For each size $n$, compute the mean time $t$ required.
• Plot $n$ (independent variable) versus $t$ (dependent variable).
6. Find the relation of your plot to your theoretical analysis. In other words, if your theoretical analysis says that for a set of n points Q, the convex hull $CH(Q) \in O(g(n))$, does $g(n)$ actually fit your empirical data? If so, what is the constant of proportionality $k$ so that $CH(Q) = k \cdot g(n)$? If not, then which order of growth $g(n)$ best fits your empirical data, and what is the constant of proportionality? You can fit a function representing an order of growth analytically using software or by trial and error in a spreadsheet, for example.
7. If your theoretical and empirical analyses differ, discuss the reason(s) for the difference.

## Report

Write a type-written report containing the following distinct sections as a single PDF document, and submit it by following the submission directions in the course syllabus.

1. Place your name, net ID, and section number at the top of the first page.
2. Also include an estimate of the amount of time you spent on this project at the top of the first page.
3. [15 points] Include correct, concrete, readable pseudo-code, along with the worst-case analysis of each procedure / function.
• Document your algorithm with legible comments in important places in the pseudo-code that make it obvious to the reader what your solution is doing. This documentation also provides evidence of your comprehension. With the aid of adequate documentation, the correctness of your approach should be easy for the reader (e.g., TA) to verify. If the TA cannot easily determine the code's correctness, then fewer points will be awarded.
5. Experimental results:
• [10 points] Include a table containing your raw experimental results (results from all 10 individual runs) and the mean times for each problem size.
• [5 points] Include your plot.
6. Empirical analysis: Include all work and explain your assumptions.
• [5 points] Include your discussion of the pattern in your plot.
• [5 points] Which order of growth fits best?
• [5 points] Give an estimate of the constant of proportionality.
7. [10 points] Discuss any differences between your theoretical and empirical analyses. If there are none, discuss why that is so.
8. [5 points] Include a correct screenshot of an example with 100 points. You can capture the image of the window (using the ctrl-alt- shift-PrtSc facility for capturing an image of the window in focus).
9. [10 points] Include a correct screenshot of an example with 1000 points.
10. [15 points] Include your source code (just the parts you wrote, none of the code provided to you)
• Document your algorithm with legible comments in important places in the code that make it obvious to the reader what your solution is doing. This documentation also provides evidence of your comprehension. With the aid of adequate documentation, the correctness of your approach should be easy for the reader (e.g., TA) to verify. If the TA cannot easily determine the code's correctness, then fewer points will be awarded.

## Further Exploration

Here are suggestions for possible algorithmic improvements. They are listed without prejudice against things that are or are not on this list:

Implement another algorithm for computing the convex hull, CH(Q). Conduct a similar theoretical and empirical analysis to determine how your new algorithm compares with the divide-and-conquer algorithm.

Possibilities include:

• the incremental method (see p. 948 of Cormen et al.)
• the prune-and-search method (also see p. 948 of Cormen et al.)
• Graham’s scan (p. 949 of Cormen et al.)
• Jarvis’s march (using a technique known as “package wrapping” or “gift wrapping”) (see p. 955 of Cormen et al.).

You may find that beyond your algorithmic improvement, animating your algorithm will reveal interesting properties of the nature and efficiency of your algorithm.

## Acknowledgments

• Adapted from Cormen, Leiserson, Rivest, & Stein. Introduction to Algorithms. Second Edition. MIT Press. 2001. pp. 939, 947-957.
• Additional input from the Wikipedia article titled “Convex hull”.
• Some figures from Tim Lambert of the University of New South Wales (UNSW). 