Ruining Sudoku — A Data Science project

Part 1: Introduction and Project Design

Published in Towards Data Science, Oct 12, 2020 (7 min read)

This is _exactly_ how you will look and feel at the end of this series. No promises on the goatee though.
Photo by Minh Pham on Unsplash

Back when I was young and stupid (I have retained only one of those two features) I imagined that working with AI consisted of spending a lot of time designing and implementing software that showed some kind of intelligence, whatever that might mean. The Matrix came out in 1999, I was 15 and easily impressed by those visuals: it is fair to say that the movie had quite an impact on my future life’s direction.

And on my wardrobe, but that’s a story for another time.

I have been working as a data scientist for around 7 years at this point, and especially during the last two and a half, after leaving academia for a job in industry, my view on the subject has changed significantly. If I had to describe what I do on a daily basis, I would say that I essentially spend my time trying to solve problems which are somehow related to data and that often require some kind of AI magic chucked into the mix.

However, I realized that from the outside it is probably not obvious how complex it is to take a request, an idea, and turn it into an actual product. I think it’s easy to focus on the juicier parts, such as fancy neural networks and exotic datasets, and forget about all the other aspects of a data science project that are crucial to its success: I’m referring to things like deployment, data preprocessing and, most importantly, project design, that is, laying down a roadmap of what is needed to go from zero to a working final product.

So I thought it would be interesting to write about the process as a whole: I see quite a lot of tutorials on the individual parts (especially on the aforementioned deep learning related stuff) but not many about how to put the pieces together, and even fewer that spend a reasonable amount of time explaining the reasons behind some choices or exploring why some roads should not be taken. Allow me to fill that gap.

Intro

Do or do not [a complete data science project from scratch], there is no try (actually I encourage trying).
Photo by Jimmy Nguyen on Unsplash

I’m a huge fan of David Silver’s work on reinforcement learning and I often joke about the fact that the goal of DeepMind is to ruin games for the rest of us by creating AIs which are essentially way better at a given game than any human could ever be.

So I thought it would be interesting to go somewhat along that same way, although from a very different angle.

Imagine you have a very racist old aunt who blames immigrants for climate change (which she denies: logic is above her), spreads all kinds of misinformation through her Facebook account and is a hardcore Westboro Baptist Church affiliate. Birds stop singing when she’s around. If you don’t have to imagine that, just focus on the pleasant thought that time will eventually fix that problem for you.

Aunt Karen has only one source of joy in her life: every Tuesday, the weekly issue of “Sudoku eXXtreme 4000” is delivered to her mailbox and she cherishes every moment she spends on those puzzles.

Our goal is to destroy that happiness.

To do so, we will design and implement a system that will allow you to take a picture of a magazine page containing one Sudoku grid and automatically solve it in seconds.

Summary

This article is part of a series:

Part 1: Introduction and Project Design
Part 2: Data preprocessing
Part 3: Digits recognition and Sudoku solver
Part 4: Deployment and retro

You should have a plan

Much like playing with Lego, figuring out what pieces you’ll need is the first step.
Photo by Xavi Cabrera on Unsplash

When I begin working on a new project, I spend a considerable amount of time trying to make sure that a couple of questions have clear answers (or at least as clear as possible):

What am I doing? What is the end goal of this project?
How do I accomplish that?

It’s an iterative process that starts with the high level description of the task (in our case it’s something like “given a picture of a Sudoku, return another picture of the completed puzzle”) and breaks it down into smaller and smaller tasks, until each one is clear, atomic, and can be reduced to a specific problem for which a solution exists in the form of an algorithm.

It does not have to be perfect: no matter how good you are at planning, there will most definitely be something that you overlooked, and you will have to adjust your trajectory mid-flight. There’s no point fixating on details at this stage. As long as the original plan provides a good enough starting point, that’s ok.

Let’s draw some shapes.

Project breakdown

Pretty sure he said that at some point.
Source: photo by mirjoran on flickr, licensed under CC BY 2.0

Technical note: for drawing diagrams I use miro, which quite frankly is such a nice invention that it makes penicillin look like a 4th grader’s soda volcano. Sorry Alexander, this is simply the truth.

Your starting point, as I said before, is the very definition of the problem, nothing more. So from a blank page you should get to this:

It’s not much, and of course there’s a good chance that you actually have a decent idea of where to go next, but let’s take it step by step.

By the way, I’m going to use colors to code the status of a task:

The next natural thing is of course to split the task into a first set of smaller, more manageable ones. When I started thinking about what components I needed for this project I came up with four main tasks, using a bottom up approach.

1. At some point, I expect to have a structured representation of the Sudoku, that is, a 9 x 9 array representing the 81 cells, each of which will either contain a digit from 1 to 9 or be blank. With the information in that format, I expect that an algorithm should be able to solve the puzzle, even though I may not necessarily know how to do that yet (that’s why that block has a red border).
2. To obtain that information, however, it will be necessary to scan the grid and identify the digit present in each cell (or detect that it’s blank). Luckily, if there is one thing that any data scientist is required by law to do, it’s the infamous tutorial on handwritten digit recognition, so that part should be covered (green border). But is it really? 🧙
Spoiler from part 3 (which I will publish in around 2 weeks, if all goes well): it’s actually a bit more complicated than that, but I found an elegant solution to the unexpected setback, if I may say so 🧐.
3. We want our system to be able to work with images of Sudoku grids taken in a “natural” setting, without overly strong constraints. Therefore, we’ll have to do some sort of preprocessing on the input image, in order to neatly crop and rotate the grid out of the full page of a magazine, plus potentially any other adjustment required to facilitate the work of the component responsible for digit recognition.
4. Finally, it would be a nice touch if we were able to present the work in a nice format, say a web app where you upload your image and it is displayed with another image on the side showing the solution to the puzzle. The border is yellow because, while I did something like that in the past, it’s not an aspect of projects that I consider myself an expert in, so I want to remember that this part might require more effort or research than the ones I’m more familiar with.
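To make the first point a bit more concrete, here is a minimal sketch of what that structured representation and a solver on top of it could look like. Everything in it is an assumption made for illustration: the `Grid` alias, the use of 0 for a blank cell, the helper names, and the choice of plain backtracking, which is just the simplest textbook approach and not necessarily the solver used later in this series.

```python
from typing import List, Optional, Tuple

# Assumed representation: 9 rows of 9 ints, with 0 standing in for a blank cell.
Grid = List[List[int]]


def is_valid(grid: Grid, row: int, col: int, digit: int) -> bool:
    """Return True if digit is not already in the row, column, or 3 x 3 box."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    box_row, box_col = 3 * (row // 3), 3 * (col // 3)
    for r in range(box_row, box_row + 3):
        for c in range(box_col, box_col + 3):
            if grid[r][c] == digit:
                return False
    return True


def find_blank(grid: Grid) -> Optional[Tuple[int, int]]:
    """Return the (row, col) of the first blank cell, or None if the grid is full."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                return r, c
    return None


def solve(grid: Grid) -> bool:
    """Fill the grid in place via backtracking; return True if a solution exists."""
    blank = find_blank(grid)
    if blank is None:
        return True  # no blanks left: the puzzle is solved
    row, col = blank
    for digit in range(1, 10):
        if is_valid(grid, row, col, digit):
            grid[row][col] = digit
            if solve(grid):
                return True
            grid[row][col] = 0  # undo the guess and try the next digit
    return False
```

Calling `solve(grid)` on a list of nine 9-element rows fills the blanks in place and returns `True` when the puzzle has a solution. The point is only that, once the digits are in this format, the "solve" block is a well-defined algorithmic problem.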

Further expanding the activities described at point 3, we get the final (so to speak) version of the breakdown:

THIS ISN’T EVEN MY FINAL FORM!
Image by author

This is good enough for now. Throughout the project you may find out that some of the tasks are more or less complicated than you anticipated, or that they have to be split further into even smaller ones. You may even discover that you forgot something while planning. That’s normal: it’s bound to happen when you go from the idea of a product to actually implementing it.
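If you prefer text to shapes, the same breakdown can be written down as a small task tree. The sketch below is only an illustration: the `Task` and `Status` names are hypothetical, the meaning of each color is my reading of the descriptions above, and tasks whose color is not stated in the text are left as `None`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterator, List, Optional


class Status(Enum):
    # Color meanings as inferred from the task descriptions (an assumption).
    GREEN = "should be covered"
    YELLOW = "done before, may need extra effort or research"
    RED = "not sure how to do it yet"


@dataclass
class Task:
    name: str
    status: Optional[Status] = None  # None: no color stated in the text
    subtasks: List["Task"] = field(default_factory=list)

    def leaves(self) -> Iterator["Task"]:
        """Yield the tasks that have not been split any further."""
        if not self.subtasks:
            yield self
        for sub in self.subtasks:
            yield from sub.leaves()


project = Task(
    "Picture of a Sudoku in, picture of the completed puzzle out",
    subtasks=[
        Task("Preprocess the input image", subtasks=[
            Task("Crop and rotate the grid out of the page"),
            Task("Other adjustments that help digit recognition"),
        ]),
        Task("Recognize the digit (or blank) in each cell", Status.GREEN),
        Task("Solve the 9 x 9 grid", Status.RED),
        Task("Present the result, e.g. in a web app", Status.YELLOW),
    ],
)

# Leaf tasks flagged as needing research or extra effort.
needs_attention = [
    task.name
    for task in project.leaves()
    if task.status in (Status.RED, Status.YELLOW)
]
```

Splitting a task further just means appending to its `subtasks`, which mirrors how the diagram is expected to keep changing during the project.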

What comes next

That’s it for now. We’re still at exactly 0 lines of code written, but what is arguably one of the most important parts of any project has been completed. Planning is like the bass player in a band: you only notice it when it’s missing¹. In the upcoming articles I will cover how to implement all the tasks that are listed in the diagram above, starting from data ingestion/preprocessing. See you in a week!

Your joy is soon going to be the memory of a distant past, aunt Karen.

[1] Apologies to all bass players who shared a stage with me over the years. You know it’s the truth.