I have been playing far too much Sudoku lately. Therefore, instead of taking the drastic step of kicking the habit, I decided to go a bit deeper into it by writing some code and make a Sudoku solver (or even a few of them). I believe that this project has worked in that I no longer want to solve some puzzles during my morning coffee anymore.
A long time ago I made a really bad recursive backtracking solver. I revisited that code and made some improvements. Also looked at some of the other algorithms out of curiosity. During this project I went a bit into the mathematics of sudoku, however I did not list any of those details here.
Before we begin, for those unfamiliar with the rules of sudoku, they are:
- The grid (or board) is divided into nine blocks, each containing nine squares
- Then each of the nine blocks has to contain all the numbers 1 to 9 within its squares
- Each number can only appear once in a row, column or box.
There are a lot of Sudoku solving algorithms out there. For now I have three implementations and perhaps will add more in the future (such as this one that is rather pythonic, or using the naked twins/triplets technique).
Perhaps recursive backtracking is the easiest to implement. Also having a bit of a computer science mindset, it might be the first technique that one would consider (or think of). The steps needed are:
- Find the first non-empty cell
- Test all values from 1 to 9 and take the first successful value
- If successful, repeat step 1.
- If not successful, go to the next value
- If all numbers have been tested, then go to the previous cell, and try its next candidate value
When implementing this on a computer, it can easily be done by calling a ‘Solve’ function recursively. The success state of a candidate value can be verified by another function, and the backtracking involves going ‘up’ the call stack. This video also explains the process quite well.
Constraint Based Solving
I came across this technique since it is similar to integer programming based techniques. I am not an expert at solving some of these problems but would like to gain more experience in the future. So with constraint satisfaction problems, three elements need to be defined:
- A set of variables that need to be solved for
- A set of domains, that is possible values that each variable can have
- A set of constraints, where each constraint controls what values a variable can take
For sudoku puzzles, one variable is specified for each cell. To keep things orderly, the convention of using the numbers and letters for the rows and columns was used (for example, using A1 for the top left cell). Note that in there are a total of 81 variables.
The domain of each variable is any integer from 1 to 9. For cells with given value, the domain is then just that value.
The constraints are how the sudoku rules are encoded. There are 9 constraints for each row, column, zone/region and are in the form:
To solve the constraints, usually a backtracking search technique that has a lot of heuristics used. Python has a package called python-constraint which can solve such problems. One then simply makes an instance of the solver, then calls addVariable to add variables and their domains, and addConstraint to specify the AllDifferentConstraint.
Crooks algorithm is technique developed by James Crook from Winthrop University a little while ago (2008-2009?). It perhaps is the most intuitive for manual puzzle solving and hence was labeled a ‘Pencil-and-Paper’ algorithm in the original paper.
These steps are taken from the original paper:
- Find all forced numbers in the puzzle.
- Mark up the puzzle.
- Search iteratively for preemptive sets in all rows, columns, and boxes—taking appropriate crossout action as each new preemptive set is discovered —until
(a) a solution is found or
(b) a random choice must be made for continuation.
- (5) If 4(a), then end; if 4(b), then go to step 3.
Note that the algorithm is a fair bit of work to code (much more than the previous two).
Out of curiosity a few different comparisons of the various algorithms and implementations were made. All test cases were taken from websoduku.com using the a few puzzles from the Easy, Medium, Hard, and Evil levels. Note that all cases take microseconds to execute on a 2.4 GHz dual core I-5 from 2013. Even the slower algorithms are fast enough to allow for real time execution.
First the C++ and Python implementations of the recursive backtracking algorithms were compared. One could assume that the C++ would be faster, however it was approximately 20 times faster. I don’t know why the Python version was so much slower, perhaps it was with memory copying issues.
Next, the C++ recursive backtracking version was compared to Crooks algorithm, and the constraint based technique. For very difficult puzzles Crook’s algorithm is doesn’t behave as well as the others. Recursive backtracking has a problem with the order of how empty cells are processed; some empty cell orders are better than others, so sometimes it’s better to start from a different position. Finally, the constraint based solver provided surprising results.
Click here to view the above solver in its own tab..
There is some ‘interesting’ Sudoku stuff out there; including:
- The SudokuProfessor… basically this guy has a website and he offers training courses to solve sudoku puzzles. He has everything from a 3 hour DVD (for $30) to the ‘Sudoku Professors’s Insider’s Club – All Access’ membership for ($500 per year). He also has masters and doctorate level courses as well. I am really hoping this is an April fools day joke or something like that.
- More courses… one called ‘Sudoku Made Easy’ via Udemy and another called the ‘Mathematics of Games and Puzzles: From Cards to Sudoku’. Again this should be free but it isn’t.
- There is a Sudoku World Championship; if there was only a movie about it.
- There are many books on the Sudoku solving. I think such a book would detract from someones bookshelf.
- There are various patents (like this one) on what appears to be a Vegas style Sudoku based casino game.