Coding an Intelligent Battleship Agent

Example Battleship game using probabilistic search. Image by author.

Battleship seems to be a pretty popular format to test a variety of AI strategies. Online you can find a wide array of approaches, from reinforcement learning to genetic algorithms to Monte Carlo methods. One of the most viewed pieces of media on the topic is a Youtube video by the channel Vsauce2. In the video, they describe a Probability-based algorithm that leads to strategies far more sophisticated than a human player could reasonably use. That video led me to this blog post by Nick Berry of DataGenetics, who describes the algorithm in great detail. His post begins with a relatively weak algorithm, but he builds it into a powerful solution that can dramatically outperform many other methods. His explanation of the algorithm is brilliant, but I was unable to find any source code for his implementation. I decided to recreate his strategies and to post the source code publicly for anyone who wants to play around with or modify it. For those who want it, everything is available at my Github.

Rules of Battleship

I assume anyone reading this is familiar with the basic rules of Battleship, but I’ll give a quick recap of some rules that are particularly important for the AI implementation. First is that no ships can overlap: while ships can be adjacent to other ships, they can not be stacked on top of each other. Second, once all squares of an opponent’s ship have been hit, the opponent must declare that their ship has been sunk. This is crucial information since it reveals that we no longer need to continue firing in this particular area. Lastly, there are five ships total on the board, with lengths of five, four, three, three, and two.

An example Battleship board. Red X signifies a hit, Black X signifies a miss. Image by author.

Initial Strategies

An obvious, if completely awful, strategy that comes to mind is to just try guessing completely randomly. As you might expect, this results in very poor performance. If nothing else, this strategy will give us a lower bound on performance that we can use a benchmark against other strategies.

Example game using random guesses. Image by author.

I simulated games with this strategy 1,000,000 times in order to generate a distribution for the length of a game. For the random strategy, the median length of a game was 97, meaning most of the time it was very close to covering all 100 squares by the time it sunk the final ship. The graphic below displays the distribution of the number of shots needed to complete a game.

The first step to improving our strategy will be to take hits into account. When the algorithm strikes a square with a ship inside, it shouldn’t just move on to the next random square. It should fire on squares adjacent to the hit square so that it will sink the ship. Borrowing terminology from Nick Berry, we will call this the Hunt / Target strategy. When the algorithm is in Hunt mode, it will fire randomly like it did before. However, when it hits a square with a ship in it, the strategy shifts to Target mode. We will maintain a list of "target squares" that the algorithm should prioritize above all others. When a hit is made, all squares adjacent to the hit square are added to the targeting list. On subsequent turns, the algorithm will pop these squares out of the list one at a time and fire on them. When the list is empty, the algorithm switches back to Hunt mode.

	def hunt_target(self):
	# enter hunt mode when no more targets left
	if not self.targets:
	guess_row, guess_col = self.guess_random()
	else:
	guess_row, guess_col = self.targets.pop()

	if self.SHIP_MAP[guess_row][guess_col] == 1:
	# add all adjacent squares to list of potential targets where possible
	potential_targets = [(guess_row + 1, guess_col), (guess_row, guess_col + 1),
	(guess_row - 1, guess_col), (guess_row, guess_col - 1)]
	for target_row, target_col in potential_targets:
	if (0 <= target_row <= 9) and \
	(0 <= target_col <= 9) and \
	(self.SHOT_MAP[target_row][target_col] == 0) and \
	((target_row, target_col) not in self.targets):
	self.targets.append((target_row, target_col))

	return guess_row, guess_col

view raw hunt_target.py hosted with

by GitHub

This is a simple algorithm, but there is already massive improvement over the random guessing strategy. As you can see below, once the algorithm hits a ship, it peppers the adjacent squares with shots. This is more reminiscent of how a (bad) human opponent would play.

Hunt / Target strategy example game. Image by author.

The median game length for the Hunt / Target strategy was 65 moves. Not great, as we’re still covering well over half the board in the search for all enemy ships. However, the algorithm does get lucky on occasion. The shortest game length in the 1,000,000 simulations was 24.

Next, we will seek to improve how the algorithm searches the board for ships. We can do so by imagining that there’s only one ship on the board. Let’s pretend that all we want to do is hunt down the Carrier (length 5), the longest ship in the game. We don’t have to fire on every square to find it, instead we can realize that there are certain squares that the Carrier must be touching while it’s on the board. Take a look at the image below. It is impossible to fit the Carrier on the board without it touching one of the squares marked with an X.

Shot pattern for hunting the length five Carrier. Image by author.

When the algorithm is hunting for the Carrier, it should randomly select shots from the pattern above. Creating this pattern is straightforward. The board is zero-indexed, meaning the rows and columns range from 0–9. We take the sum of the row and column indices and check if they’re divisible by five. If they are, the square is in our shot pattern. We can generalize this idea to any ship length.

	def guess_random(self, length=None):
	while True:
	guess_row, guess_col = random.choice(range(10)), random.choice(range(10))
	if length:
	if (guess_row + guess_col) % length != 0:
	continue
	if self.SHOT_MAP[guess_row][guess_col] == 0:
	break

	return guess_row, guess_col

view raw shot_pattern.py hosted with

by GitHub

Note that the smaller the length of a ship, the less squares this approach will eliminate. However, even for the Patrol Boat (length two), this new shot pattern eliminates half the squares on the board from consideration.

Shot pattern for hunting the length-two Patrol Boat. Image by author.

The algorithm will need to use the pattern for the smallest boat still on the board. At the start of the game, the shot pattern will be the Patrol Boat’s shot pattern. However, when the Patrol Boat is eliminated, the shot pattern will switch to the pattern of the next-smallest ship. For this strategy to have its maximum effect, we need to get lucky and hope that the algorithm hits smaller ships first.

There are some marginal gains between the original and improved Hunt / Target strategy. Notably, the worst case for the original strategy is much worse than for the improved one. At worst, the original strategy hit all 100 squares, while the improved version never went past 88 squares.

Probabilistic Strategy

To further improve the algorithm’s targeting strategy, we turn to a new approach. Instead of choosing random squares to guess in the targeting stage, we can use probability to inform our decision. If we assume that ships are placed randomly, it’s far more likely that we’d see a ship in the middle of the board than at an edge or corner. This is simply because there are many more possible ways to place a ship in the center of the board than on the outside. Consider a central square like the one at row four, column four. The Carrier can be placed on top of it in 10 different ways horizontally, and 10 different ways vertically. This gives the square a weight of 20. By contrast, the corner square at row zero, column zero has only two ways for a Carrier to be placed on it, giving it a weight of two. By doing these calculations for every square on the board, we can generate a "heat map" showing which squares are most likely to have a Carrier on them. We can do this for every ship and combine the probabilities to get a generalized heat map for all the ships on the board. To find which square to fire on next, we choose the square in the heat map with the highest value.

Initial probability heat map for the Carrier (left) vs. heat map for all ships combined (right). Image by author.

Once the computer has fired a shot, it receives information about whether the shot was a hit or a miss. We can use this information to recalculate and update the heat map. After a miss there are fewer possible ways a ship could be placed on the squares near the miss, so squares around the miss have reduced probability.

	def gen_prob_map(self):
	prob_map = np.zeros([10, 10])
	for ship_size in self.SHIP_SIZES:
	use_size = ship_size - 1
	# check where a ship will fit on the board
	for row in range(10):
	for col in range(10):
	if self.SHOT_MAP[row][col] != 1:
	# get potential ship endpoints
	endpoints = []
	# add 1 to all endpoints to compensate for python indexing
	if row - use_size >= 0:
	endpoints.append(((row - use_size, col), (row + 1, col + 1)))
	if row + use_size <= 9:
	endpoints.append(((row, col), (row + use_size + 1, col + 1)))
	if col - use_size >= 0:
	endpoints.append(((row, col - use_size), (row + 1, col + 1)))
	if col + use_size <= 9:
	endpoints.append(((row, col), (row + 1, col + use_size + 1)))

	for (start_row, start_col), (end_row, end_col) in endpoints:
	if np.all(self.SHOT_MAP[start_row:end_row, start_col:end_col] == 0):
	prob_map[start_row:end_row, start_col:end_col] += 1

	self.PROB_MAP = prob_map

view raw gen_prob_map.py hosted with

by GitHub

When a ship is hit, we artificially weight the squares adjacent to the hit square so that the algorithm will seek out the remaining ship squares and sink it. After the ship is sunk, we remove these artificial probability weights since they are no longer useful. This will stop the algorithm from firing at spaces near ships that are already destroyed.

Two potential realities: a miss (left) and a hit (right). When there's a miss, squares around the miss are weighted lower. When there's a hit, squares around the hit are weighted higher. Image by author. — Two potential realities: a miss (left) and a hit (right). When there’s a miss, squares around the miss are weighted lower. When there’s a hit, squares around the hit are weighted *higher. Image by author.*

One final improvement can be made to the way the algorithm selects squares after a hit. If the algorithm has scored hits on a ship horizontally, then it should give preferential weighting to horizontally-adjacent squares. The same is true with vertically-oriented ships.

Game length distributions for all strategies (left), cumulative distribution functions for all strategies (right). Image by author.

We see a massive increase in efficiency with this new strategy. The median length of a game is 43, and the shortest game of the 1,000,000 simulations was 18 moves. Below is a full example game played with the probabilistic strategy.

An example game played with the probabilistic algorithm. GIF by author.

For my Pygame visualization above, I had to code the heat map coloring system from scratch. If any readers have ideas or suggestions on how to improve the coloring to make it closer to plots produced by Matplotlib, I am absolutely open to hearing them.

Potential Improvements

I wanted to note that this algorithm is designed to perform optimally in the presence of randomness. However, most opponents do not randomly place their ships around the board. Some people might prefer to place their ships on the corners or edges of the board. This algorithm would perform poorly against such strategies since it presumes most ships will be centrally-located. To account for this, we might consider artificially weighting the outside squares of the board.

I also thought it would be interesting if the AI were to modify its strategy after many games. If opponents tend to employ similar strategies in ship placement game after game, then we could weight the areas of the board where the opponent prefers to place their ships.

Final Thoughts

I wanted to thank Nick Berry for his great post on Battleship, hopefully I was able to do it justice with my own implementation. If you haven’t read his explanation, I highly recommend it.

If you found this post interesting, you may want to check out my source code, which is publicly available on Github. Thanks for reading!