Solving Jigsaw puzzles with Python and OpenCV

Published in

Towards Data Science

7 min readSep 29, 2018

At the beginning of 2018 I was gifted an awesome Star Wars 5000 pieces jigsaw puzzle (you can find it on Amazon here). Completing the puzzle took me about 2 months of patience and perseverance, but now I can look at my masterpiece with satisfaction and joy.

However, I still remember when I had to complete the central part of the puzzle, which is composed by a massive Darth Vader and Luke Skywalker (spoiler alert: Darth Vader’s son!!). I basically found myself sitting in front of a thousand pieces in all possible shades of black and dark blue, and finding matching pieces became a real pain.

This is when I decided to give Computer Vision a chance and try to write a program that would be able to find matching pieces by looking at their shapes.

In this first part I’ll explain how I was able to extract the four sides from each piece, in order to match the shapes in future. Here I’ll show one output image to make clear what I’m trying to achieve here:

1. Create a dataset

The first thing to do was taking pictures (with my phone) of more than 200 pieces in optimal light conditions. Notice that I took pictures of the back of the pieces since I won’t need the puzzle content bust just its shape. I also fixed the camera and the puzzle positions so that cropping the pieces out the entire image became a trivial task:

img = cv2.imread(join('images', filename))
img = img[1750:2500, 1000:2000]

Example of picture of puzzle piece and cropping

2. Piece Segmentation

Since both light conditions and piece color don’t change inside the dataset, segmentation is achieved using simple binary thresholding.
Before applying the binarization, a median filter is applied to the grayscale image in order to remove white noise on the puzzle piece. The binarized image is then smoothed using a mean filter:

gray   = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray   = cv2.medianBlur(gray, ksize=5)
thresh = cv2.threshold(gray, 130, 255, cv2.THRESH_BINARY)[1]
thresh = cv2.blur(thresh, ksize=(3, 3))

Once obtained the thresholded image, I clear potential false positives areas (background pixels marked as puzzle piece pixels) by using the cv2.connectedComponents OpenCV function and taking the connected component with the maximum area (i.e. the puzzle piece itself).
Finally I further crop the piece into a square image that allows for piece rotations without losing part of it. Here’s an example of the output image from this phase:

Puzzle piece segmentation and post-processing

3. Find the 4 piece’s corners

In order to separate each side of a puzzle piece, we need to correctly find the 4 main corners of the puzzle piece. This is the most critical part since all the following steps will work perfectly if the corners are correct.

During the first step I found an approximate location of the corners, while in section 4 I’ll explain how to refine the corner detections in order to obtain the exact locations.

3a. [Not successful] Canny, Hough Transform and KMeans

In my first attempt I used the Hough Transform to find the main lines of the puzzle piece and tried to cluster them into 4 different subsets (one for each side). Here I’ll explain briefly the algorithm steps:

Perform edge detection on the binarized image by applying the cv2.Canny function.
Apply the Hough Transform for line detection: cv2.HoughLines . For each line returned by this function, we get its coefficients in parametric form.
Remove unwanted lines by estimating the piece orientation: since we expect the four lines passing through the four piece corners to form a rectangle, I grouped together lines with the same orientation in order to prune lines that do not belong to this group.
Clustering the lines with KMeans: we can group all lines into 4 different clusters, one for each side of the piece, based on their coefficients.
Compute the mean line for each cluster.
Compute the intersections between the four mean lines to finally obtain the 4 piece corners.

Blue lines are all the hough lines after the pruning phase. Green lines are the mean lines for each cluster of lines.

Although straightforward, this algorithm didn’t prove to be robust: this happened especially when a few or no lines where found on one side of the puzzle piece, making the clustering result become unpredictable. Also, the lines do not always approximate enough the true side lines, making their intersection too far from the real corner.

3b. Harris corner detector and maximum rectangle estimation

In this second attempt, which proved to be accurate and robust, I applied the Harris corner detection in order to find the best corner candidates and refined this estimation with an algorithm that is able to always detect the right corners.

I first applied the cv2.Harris function, which returns a new floating-point image, where at each pixel a ‘cornerness’ value is computed (the higher, the more robust
I then found the local maxima of the Harris image above a certain threshold. After this passage I have discrete points which represent candidate corners of the puzzle piece. The good thing is that for each puzzle piece in my dataset the algorithm returns a candidate corner where a real corner is present. See an example here:

Candidate corners obtained by finding local maxima of the Harris corner detection function

3. Given all the candidate corners, find the set of four corners which maximizes a function that takes into account:
- the ‘rectangularness’ of the shape formed by the four points: the more the four angles are close to 90°, the better
- the area of the shape: the bigger the shape, the better (that’s because we expect that the furthest points that we can find are the real corners themselves)

This algorithm proved to be successful for all my dataset images. Yay!

4. Refine the corner detections

Before entering the next phases, I rotated the puzzle piece to make it horizontal (or vertical, depending on your point of view…) and computed its edges by using the Canny edge detector.

I then refined the detected corners’ positions by selecting a window centered on the detected corner and finding the furthest point from a 45° or 135° oriented line passing through the center of the puzzle piece.
As you can see, the results are quite accurate:

Refinement of the 4 puzzle corners. The small circles are centered on the refined corner.

5. Separate the four sides

Now that we have some good puzzle corners, we need to separate the perimeter of the puzzle piece into its four sides. Each side is a connected curve which starts from one corner and ends into another.

The most simple idea works pretty well: compute the four lines passing through the four corners and classify each perimeter point based on the closest line to the point.
However, there are cases where a side has a big protrusion and part of it gets classified as belonging to the wrong side, since it is closest to the wrong line!
To solve this, I modified the simple approach into a more robust one:

Apply the ‘closest line’ idea until a maximum threshold: if a point is far from all four lines, it doesn’t get classified
Apply a hysteresis algorithm to all unclassified points: for each unclassified point, check if a perimeter point in its neighborhood has been classified; if that is true, set the same ‘side class’ to the point and continue. The algorithm ends when all points have been classified.

The final result is amazing! Here’s one example:

In the final classification step each perimeter pixel gets assigned a different ‘side class’. In this image the four classes are shown with different colors.

The last information that we need is the orientation of each side: we need to know if the side is going in or going out! (It would be bad if the algorithm would tell us to connect two sides that are both going out…)
This step is quite simple, since we just need to check for each side if the average point, which is obtained by averaging all point coordinates belonging to the same side, and the barycentre of the puzzle piece lie on the same half-plane identified by the side line connecting its two corners; if this condition is true, then the side is going in, otherwise it’s going out.

Conclusions

Although I decided to finish my Star Wars puzzle using the brute force approach, I really enjoyed applying Computer Vision algorithms to puzzle pieces. Even if the task seems simple at first glance, a lot of algorithms need to be used (binarization, mean filtering, edge detection, corner detection, connected components, … and a lot of geometry!).

Please share your opinions and possible improvements of this work, and maybe together we will be able to build an entirely autonomous puzzle-solver robot :)