I downloaded a game but ended up learning about search algorithms

Performing a breadth first search in Python and optimizing the solution

Ji Wei Liew
Towards Data Science


Traversing a search tree. Image by the author.

I was watching a sequel of a Hong Kong drama series (使徒行者3) and one of the lead actresses was shown playing a Klotski puzzle (华容道). That got me intrigued, as I love solving puzzles and, in particular, I enjoy finding their optimal solutions.

With some effort, I managed to find a similar game on the Google Play Store, downloaded it, and started playing. The objective of the game is simple: move the 2x2 block out from the bottom of the frame. It turns out the game is really complicated and not that easy to solve, and finding the optimal solution is harder still. Each stage has an optimal solution given by the minimum number of moves. The diagram below shows that the optimal solution for Stage 50 involves 138 moves.

Stage 50 replicated from 华容道: Kloski game. Image by the author

As determined as I was, I decided that I could (perhaps) solve this in Python, as I could probably model the state of the puzzle in a pandas DataFrame, plug it into a function that recursively moves each block, and stop when the win condition is met (i.e. the 2 by 2 block is at the bottom-center position). Little did I know that it would sap up the bulk of my personal time for 3 weeks, first figuring out the solution and then optimizing it. At the time I had little knowledge of graph theory, breadth first search, depth first search, or the performance of list versus dict.

The rest of this article will be broken down into the following sections:

  1. Attributes of Block
  2. Attributes of Stage
  3. Basic Methods of a Stage
  4. Moving Blocks within a Stage
  5. Solving the Stage
  6. Improvements to the solver
  7. How to use the full code
  8. Afterthoughts

The following section describes how to model the 4 types of blocks. Code snippets have been simplified. Full code here.

1. Attributes of Block

btype (short for type of block) takes the values 1, 2, 3 and 4, which refer to the unit, horizontal, vertical and 2 by 2 blocks respectively.

row & col refer to the number of rows and columns the block occupies. E.g. a horizontal block occupies 1 row and 2 columns.

general is used to distinguish similar blocks. E.g. the guanyu and zhuge blocks have the same btype attribute but different general attributes.¹

position specifies the location of the block on the Stage. If a block is made up of more than one unit, the position references the top-left corner of the block.

coords is a list of the coordinates that the block occupies on the Stage.

[1] For those who are not familiar with the Romance of the Three Kingdoms, guanyu is a general in the military, hence the attribute is named as such. Strictly speaking, Liubei and CaoCao were not generals but more likely warlords.
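To make these attributes concrete, here is a minimal sketch of how a Block could be modelled. The attribute names follow the descriptions above, but the way row, col and coords are derived here is my own illustration, not a copy of the class in the full code.

# Sketch of a Block; illustrative only, the full code is the reference.
class Block:
    def __init__(self, btype, general, position):
        self.btype = btype          # 1=unit, 2=horizontal, 3=vertical, 4=2x2
        self.general = general      # e.g. 'CC' or 'Z01', used to tell blocks apart
        self.position = position    # (row, col) of the top-left corner
        # rows and columns occupied, derived from the block type
        self.row = 1 if btype in (1, 2) else 2
        self.col = 1 if btype in (1, 3) else 2
        # every (row, col) coordinate the block covers on the Stage
        self.coords = [(position[0] + r, position[1] + c)
                       for r in range(self.row) for c in range(self.col)]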

Next, we need a way to represent the Stage.

2. Attributes of Stage

  • d1 represents the current state of the Stage as a nested dictionary where the level 1 keys represent the rows and the level 2 keys represent the columns. E.g. d1[1][2] refers to row 1, column 2.
  • d_blocks is a dictionary of the blocks on the stage. The general attribute of the block is being used as the key, while the block itself is the value.
  • moves is a counter for the number of moves that we have executed, while max_moves is the maximum number of moves that we want to try.
  • l_moves stores a list of moves that the Stage has been through.
  • states is a set that stores all previous states of the Stage as the blocks are being moved.
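As a rough illustration of this bookkeeping, a Stage constructor might start out along the following lines; this is only a sketch, and the real class also loads the block layout for the chosen stage number (omitted here).

# Sketch of the Stage attributes described above; illustrative only.
class Stage:
    def __init__(self, max_moves=200):
        # 5 rows x 4 columns, 0 denotes an empty cell
        self.d1 = {r: {c: 0 for c in range(1, 5)} for r in range(1, 6)}
        self.d_blocks = {}          # general -> Block
        self.moves = 0              # moves executed so far
        self.max_moves = max_moves  # give up beyond this many moves
        self.l_moves = []           # history of (general, direction) moves
        self.states = set()         # previously seen states, as tuples of tuples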

3. Basic Methods of Stage

  • check_boundary checks if the block that is being added will exceed the boundaries of the Stage (e.g. cannot add a horizontal block in row 1 column 4 as it takes up 2 columns.)
  • check_clash checks if a block is being added on a spot that is occupied by another block.
  • add_block does the aforementioned checks and then updates d1 , d_blocks and block.position.
  • image provides a more visual representation of the stage, making it easy to check whether the Stage is set up correctly. (Recall that d1 is a dict of numbers.)
  • to_tuple converts d1 to a tuple of tuples so that it can be hashed and stored in self.states, which is a set. Membership tests ( i in setA ) are much faster on sets than on lists. Later on, we will see that by storing each d1 we can eliminate possible moves that would only lead back to a previous state.
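For instance, a minimal to_tuple and its typical usage could look like the sketch below (illustrative; the actual method in the full code is the reference).

# Sketch of to_tuple: flatten the nested dict d1 into a hashable
# tuple of tuples, so the whole board state can be kept in a set.
def to_tuple(self):
    return tuple(tuple(self.d1[r][c] for c in sorted(self.d1[r]))
                 for r in sorted(self.d1))

# Typical usage inside the solver:
#   state = stage.to_tuple()
#   if state in stage.states:   # O(1) membership test on a set
#       ...                     # skip: this configuration was seen before
#   stage.states.add(state)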

At this juncture, it is probably worthwhile to explain why there is a need for the 2 block attributes (btype & general). In my initial attempts, I stored each state as a DataFrame of general attributes in self.states and found that the script took a very long time. Upon investigation, I discovered that the script got stuck trying to exhaust all the downstream possibilities whenever such cyclic symmetry exists, because it saw each configuration as a distinct state:

Cyclic symmetry. Image by the author.

When there are many unit blocks, the number of such possibilities increases manyfold! Therefore, there is a need to let similar blocks share a common value (btype), so that these states can be identified as the same and such moves can be avoided.

The general attribute continues to be useful: for one, it is used as a key in the d_blocks dictionary. Also, once the solver is complete, it is hard to check the solution without a way to distinguish similar blocks. Using Stage(2) as an example:

>>> import pandas as pd
>>> import hrd as h
>>> s = h.Stage(2)
>>> pd.DataFrame(s.to_tuple(), columns=range(1,5), index=range(1,6))
1 2 3 4
1 1 4 4 1
2 1 4 4 1
3 1 1 1 1
4 1 1 1 1
5 1 0 0 1

The first 3 steps of the optimal solution for this puzzle are:

  Move no. General Direction
0 1 Z13 DR
1 2 Z11 DD
2 3 Z12 LD

One will need to guess which block can move DR, DD and LD! *gasps*
s.image() removes the guesswork:

>>> s.image()
1 2 3 4
1 Z01 CC CC Z02
2 Z03 CC CC Z04
3 Z05 Z11 Z12 Z06
4 Z07 Z13 Z14 Z08
5 Z09 0 0 Z10

Also, one might ask:

Why not list of lists, numpy array or pandas dataframe?

I’ve tried different approaches and found dictionaries to be the fastest. You may be shocked to know that a list of lists is the slowest of the three. There are some answers addressing why this is so, but the Python documentation on Time Complexity states it best.
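If you want to sanity-check this on your own machine, a quick (and admittedly crude) comparison of cell lookups could look like the snippet below. The relative timings depend on the access pattern, the container sizes and the Python version, so treat it as a starting point rather than proof.

import timeit

# Crude comparison of reading one cell from a nested dict versus a
# nested list; adjust the statements to mirror your own access pattern.
setup_dict = "d = {r: {c: 0 for c in range(1, 5)} for r in range(1, 6)}"
setup_list = "l = [[0] * 4 for _ in range(5)]"

print("dict:", timeit.timeit("d[3][2]", setup=setup_dict, number=1_000_000))
print("list:", timeit.timeit("l[2][1]", setup=setup_list, number=1_000_000))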

4. Moving blocks within a Stage

Moving a block is somewhat complicated. It involves:

  1. Getting the block’s current position
  2. Deleting the block
  3. Calculating the new position of the block, which depends on the direction of the move
  4. Adding the block to the new position

A block can move 1 step: up (U), down (D), left (L) or right (R). This is trivial.

A block can also move 2 steps (this counts as 1 move in the game). Horizontal blocks can move RR or LL. Unit blocks can make 12 possible 2-step moves: UU, DD, RR, LL, UR, UL, DR, DL, RU, RD, LU, LD. Moves in the reverse direction (e.g. RL, LR, UD, DU) are not allowed, as they only waste moves!

`Stage` methods to move a block
  • del_block deletes a block from d1 by changing the values to zero and removing the corresponding general from the d_blocks dictionary.
  • move moves a block in a specific direction, by deleting the block, and then adding it on the new position. This method is also responsible for updating the Stage attributes: moves, l_moves, last_moved_general, last_moved_direction and states.
  • position_block calculates the new position of a block (from the original block position and the direction of the move) and then adds the block at the new position.
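As a sketch of the direction handling, the new position can be computed by applying a per-letter offset. The function name and offset table below are my own illustration; the real position_block also re-adds the block via add_block, with its boundary and clash checks.

# Offsets for one-step moves: (row delta, col delta).
OFFSETS = {'U': (-1, 0), 'D': (1, 0), 'L': (0, -1), 'R': (0, 1)}

def new_position(position, direction):
    # Apply a move like 'R' or 'DR' letter by letter to (row, col).
    row, col = position
    for step in direction:          # 'DR' -> apply 'D', then 'R'
        dr, dc = OFFSETS[step]
        row, col = row + dr, col + dc
    return (row, col)

# e.g. new_position((4, 2), 'DR') -> (5, 3)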

Deriving and eliminating moves

The next code block could probably benefit from some refactoring, but that might make it rather abstract and harder to understand.

  • derive_moves derives all the possible moves for a given d1. This is memoryless; it doesn’t consider the last_moved_general or whether a move will result in a state that is the same as a previous state. The output is a dictionary where the key is the general and the value is a tuple of the block and a list of possible directions.
  • available_moves eliminates the derived moves if:
  1. win-condition is met; OR
  2. max_moves has been reached; OR
  3. the move would result in a previous state; OR
  4. the move involves moving the last_moved_general.

(4) deserves some explanation. Imagine a scenario where a block (e.g. Z10 of Stage 2) is moved L, and then moved L again. Such an approach is, once again, wasting moves and will not be part of the optimal solution. Besides, the exhaustive search already covers the scenario where the block moves LL directly. Similarly, a block which can move U or L could move U and then DL, but this would not be optimal either. Hence we can simply forbid all moves that involve the last_moved_general.

  • is_complete checks if the win-condition is met
  • is_previous_state checks if the state self.to_tuple() exists in states
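Putting conditions (1) to (4) together, the pruning can be sketched roughly as follows. The helper would_repeat_state is hypothetical shorthand for "simulate the move and call is_previous_state on the result"; the actual bookkeeping in the full code differs.

# Sketch of available_moves: start from everything derive_moves found,
# then prune according to the four conditions above.
def available_moves(self):
    if self.is_complete() or self.moves >= self.max_moves:
        return {}                                    # conditions (1) and (2)
    moves = {}
    for general, (block, directions) in self.derive_moves().items():
        if general == self.last_moved_general:       # condition (4)
            continue
        allowed = [d for d in directions              # condition (3); hypothetical helper
                   if not self.would_repeat_state(block, d)]
        if allowed:
            moves[general] = (block, allowed)
    return moves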

5. Solving the Stage

Conceptually, the idea is simple:

  1. With a Stage object, create a dictionary of available_moves.
  2. Execute each of the available_moves.
  3. Store this in a dictionary. Key: (general, direction); Value: Stage
  4. Repeat Step 1,2,3 for each Stage generated in Step 3.

Initially, I stumbled upon the depth first search approach, which is extremely slow, and that led me to reconsider solving the problem via breadth first search. It was only much later that I learnt that depth first search can solve the problem eventually (assuming I don’t run out of memory), but the first solution it finds is unlikely to be optimal, whereas breadth first search will find the optimal solution.

  • d_path executes all the available moves for a given Stage and creates a childStage for each available move:
`d_path` explained in a diagram. Image by the author.
>>> for k, v in s.d_path().items():
...     print(k, v)
...
('Z09', 'RR') <hrd.Stage object at 0x07B3E9E8>
('Z09', 'R') <hrd.Stage object at 0x07B3E2F8>
('Z13', 'DR') <hrd.Stage object at 0x07B3EFA0>
('Z13', 'D') <hrd.Stage object at 0x07B3E610>
...(4 more lines)
  • __iter__ creates an iterator object so that the Stage can later be iterated over using for k, v in self:
  • BFS_solver sets up a queue containing the initial Stage, then calls d_path to generate a child Stage for each available move (recording it as Move №1 in self.l_moves). The rest of the code adds each child Stage to the queue, provided its state has not appeared among the previous states, repeats the process, and returns the solution once the win condition is met. Systematically trying all available moves for all Stage(s) at a given Move No. before moving on to the next is the defining feature of breadth first search. Contrast this against depth first search here.
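Here is a stripped-down sketch of such a breadth first search loop. The function name and details are simplified (the actual BFS_solver is a generator and also prints progress), but the structure, a FIFO queue of Stages plus a set of seen states, is the core idea.

from collections import deque

def solve_bfs(start_stage):
    # Sketch of breadth first search over Stage objects. Every Stage at
    # move N is expanded before any Stage at move N+1, so the first
    # solution found uses the minimum number of moves.
    seen = {start_stage.to_tuple()}
    queue = deque([start_stage])
    while queue:
        stage = queue.popleft()                  # FIFO: oldest Stage first
        if stage.is_complete():
            return stage.l_moves                 # optimal list of moves
        for (general, direction), child in stage.d_path().items():
            state = child.to_tuple()
            if state not in seen:                # skip configurations seen before
                seen.add(state)
                queue.append(child)
    return None                                  # no solution within max_moves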

6. Improvements to the solver

(I) Reduce search time by removing symmetric moves

By inspecting the attempts made by the script to solve the puzzle, I noticed that the number of moves to be searched increases at an increasing rate (rather alarmingly). This led me to think about how to further reduce the number of moves to be searched.

One thought occurred to me:

If the Stage is symmetrical length-wise, it would suffice to try only the moves involving blocks that lie either to the left or to the right of the axis of symmetry, and not both.

This is important because if a stage starts out symmetric, the total number of searches is roughly halved. Furthermore, as the search progresses, there could be symmetric child stages as well, giving further reductions.

I was hoping this would reduce the time by 50% for symmetric Stages, since I’d have reduced the number of searches by at least 50%. It turns out there is only a 10% reduction in the time taken, so there is (probably) a substantial cost to checking for symmetry and removing symmetric moves.

Symmetry in the puzzle. Image by the author.
  • is_symmetric checks whether the stage is symmetrical. Note that every position in the left half (i.e. columns 1 and 2) must have the same value as the corresponding position in the right half (i.e. columns 4 and 3 respectively) for a Stage to be symmetric. Conversely, as soon as one pair of positions does not match, we can immediately break out of the for loop and conclude that the Stage is not symmetric.
  • d_path_remove_sym groups the available moves into 2 sets and removes all symmetric moves. Once again, there are some intricacies worth mentioning. Given a symmetric Stage:
  1. For each unit block and vertical block, there exists a similar block in a symmetric position. (Otherwise it would be a contradiction and symmetry wouldn’t hold!) One can simply drop the moves of the mirrored block, since the moves are mirror images of each other as well.
  2. For horizontal blocks, there is only a symmetric move if the block is in the middle (occupying columns 2 and 3) AND there is a void to its left (column 1) and to its right (column 4). I guessed this would be rare and did not cater for it.
  3. For the 2x2 block, there will never be a symmetric move, because there is only one such piece and it cannot move L and R at the same time. (That would require 4 empty spaces!)

This was written as a standalone method initially and I’ve added it to the d_path method in the final code.
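A minimal sketch of the symmetry check, assuming the 5x4 nested-dict layout described earlier (the mirror pairs are columns 1 and 4, and columns 2 and 3):

# Sketch of is_symmetric: compare each cell in columns 1-2 with its
# mirror in columns 4-3 and bail out on the first mismatch. This works
# because similar blocks share the same btype value in d1.
def is_symmetric(self):
    for r in self.d1:
        for c in (1, 2):
            if self.d1[r][c] != self.d1[r][5 - c]:   # mirror column: 1<->4, 2<->3
                return False
    return True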

(II) Timing the solver and counting the number of Stages in the queue

It can get a bit worrying if the script takes a bit too long to run, and oftentimes it is very comforting to have some output messages so that the script can inform you that it is still progressing (in a sane fashion).

Hence, I added some print commands to indicate which Move No. the script is currently at and to count the number of Stage(s) in the queue. This provides a sense of how rapidly the search tree expands (or contracts) and serves as motivation to come up with innovative ways to limit the search options. Removing symmetric moves came about because of this. :D

I’ve also added a timer function which prints the time taken to process all moves of the same move number. In addition, I’ve converted BFS_solver into a generator, so that those who are keen can more easily examine the Stage being yielded at any given point using solve_BFS.
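For reference, a timing wrapper in this spirit can be written as a small decorator; this is a sketch rather than the exact function used in the full code.

import time
from functools import wraps

# Sketch of a timing decorator that prints how long a function took.
def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = (time.perf_counter() - start) * 1000
        print(f'Function "{func.__name__}" done. Took {elapsed:.3f} ms')
        return result
    return wrapper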

7. How to use the full code

For those who are keen to try out how the code runs, here’s how to use it.

Solving Stage 1 (via Breadth first search approach):

>>> import hrd as h; s = h.Stage(1); s.solve_BFS()
Processing step 1, No. of stages: 1 1.034 ms
Processing step 2, No. of stages: 2 2.986 ms
Processing step 3, No. of stages: 2 3.959 ms
Processing step 4, No. of stages: 3 4.987 ms
Processing step 5, No. of stages: 4 1.1 ms
Processing step 6, No. of stages: 5 1.194 ms
Processing step 7, No. of stages: 6 1.496 ms
Processing step 8, No. of stages: 7 1.895 ms
Processing step 9, No. of stages: 8 1.995 ms
Processing step 10, No. of stages: 7 5.983 ms
Function "solve_BFS" done. Took 10.07 ms
>>> s.solution # prints the solution
Move no. General Direction
0 1 LB R
1 2 GY U
2 3 ZG L
3 4 ZF U
4 5 ZY U
5 6 Z01 U
6 7 Z02 L
7 8 Z04 U
8 9 Z03 R
9 10 CC D

Here’s the depth first search approach (works well only for Stage 1 and Stage 12).

>>> import hrd as h; s = h.Stage(1); s.solve_DFS()
1 2 3 4
1 GY GY LB LB
2 ZF ZG ZG ZY
3 ZF 0 0 ZY
4 Z01 CC CC Z04
5 Z02 CC CC Z03
Function "solve_DFS" done. Took 36.335 ms
Move no. Direction General
0 0 None None
1 1 R LB
2 2 U GY
3 3 L ZG
4 4 U ZF
5 5 U ZY
6 6 U Z01
7 7 L Z02
8 8 U Z04
9 9 R Z03
10 10 D CC

Stage 12 (DFS is surprisingly faster than BFS):

>>> import hrd as h; s = h.Stage(12); s.solve_DFS()
1 2 3 4
1 ZF ZY WY 3
2 ZF ZY WY 3
3 Z01 Z02 Z03 HZ
4 0 CC CC HZ
5 0 CC CC Z04
Function "solve_DFS" done. Took 19.95 ms
Move no. Direction General
0 0 None None
1 1 D Z01
2 2 L Z03
3 3 L Z04
4 4 D WY
5 5 D ZY
6 6 LL Z02
7 7 U 3
8 8 U HZ
9 9 U ZY
10 10 U WY
11 11 RR Z04
12 12 RR Z03
13 13 D CC
14 14 DL Z01
15 15 DD Z02
16 16 L ZY
17 17 UU WY
18 18 UU Z03
19 19 R CC
>>> s.solve_BFS()
Processing step 1, No. of stages: 1 ~negligible
Processing step 2, No. of stages: 2 ~negligible
Processing step 3, No. of stages: 2 ~negligible
Processing step 4, No. of stages: 2 ~negligible
Processing step 5, No. of stages: 2 1.565 ms
Processing step 6, No. of stages: 3 ~negligible
Processing step 7, No. of stages: 4 1.559 ms
Processing step 8, No. of stages: 4 ~negligible
Processing step 9, No. of stages: 4 1.562 ms
Processing step 10, No. of stages: 3 6.507 ms
Processing step 11, No. of stages: 3 ~negligible
Processing step 12, No. of stages: 3 1.563 ms
Processing step 13, No. of stages: 4 ~negligible
Processing step 14, No. of stages: 3 1.562 ms
Processing step 15, No. of stages: 5 1.565 ms
Processing step 16, No. of stages: 7 1.559 ms
Processing step 17, No. of stages: 6 1.563 ms
Processing step 18, No. of stages: 7 3.563 ms
Processing step 19, No. of stages: 15 3.328 ms
Function "solve_BFS" done. Took 20.141 ms

The depth first search approach returns the last attempted move when one breaks out of the code (e.g. by pressing Ctrl+C in PowerShell).

The hardest stage (Stage 50) takes a considerable amount of time to solve using the breadth first search approach. An interesting observation: After 24 moves, there is only 1 available Stage in the queue, which means the rest are likely to be dead-ends.

>>> import hrd as h; s = h.Stage('h50'); s.solve_BFS()
...
Processing step 25, No. of stages: 1 ~negligible
...
...
Processing step 131, No. of stages: 1071 19.3 s
Processing step 132, No. of stages: 1024 17.7 s
Processing step 133, No. of stages: 977 17.2 s
Processing step 134, No. of stages: 944 16.8 s
Processing step 135, No. of stages: 877 15.7 s
Processing step 136, No. of stages: 848 14.8 s
Processing step 137, No. of stages: 775 13.1 s
Processing step 138, No. of stages: 676 1.26 s
Function "solve_BFS" done. Took 9.5 mins

self.playback() can be used to replay the solution step-by-step, after .solve_BFS() has been called.

>>> import hrd as h; s = h.Stage(2); s.solve_BFS()
>>> ...
>>> s.playback()
Starting stage:
1 2 3 4
1 Z01 CC CC Z02
2 Z03 CC CC Z04
3 Z05 Z11 Z12 Z06
4 Z07 Z13 Z14 Z08
5 Z09 0 0 Z10
Move 1: Z13 DR
1 2 3 4
1 Z01 CC CC Z02
2 Z03 CC CC Z04
3 Z05 Z11 Z12 Z06
4 Z07 0 Z14 Z08
5 Z09 0 Z13 Z10
Move 2: Z11 DD
1 2 3 4
1 Z01 CC CC Z02
2 Z03 CC CC Z04
3 Z05 0 Z12 Z06
4 Z07 0 Z14 Z08
5 Z09 Z11 Z13 Z10
... (16 more steps)

Additional stages can be added by creating a list of blocks and positions and adding it to the d_stage dictionary.

Full code is available here (Github) and here (PythonAnywhere).

Some unexplained subtleties

derive_moves: A subtle note here is that the 2-step moves (e.g. RR, RD, RU) are added before the 1-step moves (e.g. R). There is a minor advantage to this. Imagine a Stage where the optimal solution comprises N moves. Now consider the two ends of the queue when BFS is processing the N-th move. The very first Stage at the start of the queue would have taken many 2-step moves from Move 1 to Move N-1. On the other hand, the Stage at the end of the queue would have taken all 1-step moves from Move 1 to Move N-1. Given that we are trying to minimize moves, it may make more sense to try the stages which have taken more 2-step moves first. I guess we’ve achieved some sort of prioritization here?

deque: I have been aware of this container type for some time, but never really found a situation where a regular list object might be inefficient. It turns out deque is very well suited for setting up the queue for the breadth first search, as it is a “list-like container with fast appends and pops on either end”.
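As a toy illustration of why deque fits here: the BFS queue only ever appends on the right and pops from the left, which is exactly the pattern deque is optimised for.

from collections import deque

# The BFS queue pattern: append on the right, pop from the left.
queue = deque(["stage-0"])     # placeholder strings instead of Stage objects
queue.append("stage-1")        # O(1) enqueue
first = queue.popleft()        # O(1) dequeue; a plain list's pop(0) is O(n)
print(first)                   # stage-0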

8. Afterthoughts

While I am very pleased to have solved this problem, what I find extremely valuable is the problem-solving process and learning about the intricacies of python.

At work, I have the luxury of not having to deal with problems where the level of complexity is so great that the performance of set, list and dictionary matters.

Need more processing power? Sure, we can equip you with a computer with a faster processor.

Need more memory? Sure, we can migrate you to workspace with 64GB shared among 4 other users, who use about 2–3GB of RAM on average.

If you’ve enjoyed this article, do drop me a comment and check out my other articles.


Business consultant for banks, passionate about scripting and automating work using python. www.linkedin.com/in/ji-wei-liew