
Project Pendragon: An AI Bot for Fate Grand Order

I fine-tuned PyTorch CNNs and trained a custom siamese CNN to form the backbone of a bot that plays the mobile game Fate Grand Order

As a data scientist, all too frequently I do not actually end up doing anything with my models. I put in a good amount of time to build datasets, prepare data, train models, and tune their hyperparameters, and then I store them, never to be seen again. So I figured that this time I would try to integrate them into something.

Project Pendragon uses a backbone of three PyTorch neural networks to decide which cards to play, and when, within the mobile phone game Fate Grand Order (FGO). For some this might look like overengineering a solution, but for me it is a fun project that actually does save me time daily with in-game item farming, especially in the mornings, when it can be useful to have an extra 10–15 minutes as I let the bot run a few quests.

Here is a link to the GitHub repo.

A previously trained face detection model only detects the center character, Altria Pendragon, FGO’s version of King Arthur Pendragon, but not either of the other two characters. The other two characters are more in the background and their faces are partly obstructed. Also, the object detection model was trained on season 4 of The Flash… so it doesn’t actually have exposure to cartoon characters; I could build a specific network for that, though. I’m using that model (calling it Flashpoint for now) as part of a Raspberry Pi remote greeter, which could also be a good blog post…

Some background on FGO: it is a Japanese role-playing game for mobile devices where you control characters by selecting action cards out of a deck. At any given time you have 3 characters on the field, each of which has 5 of their own action cards. The action cards come in three varieties (arts, buster, and quick), and how many of each type a character has depends on the character. Every turn you are dealt 5 cards out of the deck of 15 (5 from each character) without replacement and select 3 of them to be your actions for the turn. An additional mechanic is that there are bonuses for building card "chains", i.e. selecting three of the same type of card. See the two images below for chain examples.

Arts chain: pick 3 arts cards, which can be from the same or different characters. The same works for any of the 3 card types.
Brave chain: pick 3 cards from the same character.

When I sat back and thought about things I could work on, I realized I could probably build a bot to play FGO using a series of neural networks and by sending input to the game through Python to pick cards. A basic but effective game strategy for the bot would be to play card chains whenever they are available. Card chains give damage or other bonuses when played, so using them makes turns more efficient: higher damage and therefore shorter battles.

Before I started coding, I had to figure out the general process I wanted the bot to follow. This is the logic structure I came up with.

  • Upon starting, the bot checks whether it can see the "attack" button. If it finds the button, it presses it and moves into the phase of the turn where it picks 3 of the 5 command cards.
  • The bot checks whether 3 of the 5 cards belong to the same character. If so, it plays those cards as a "brave chain".
  • If no brave chain is found, it looks at all the cards, checks their types, and looks for 3 cards of the same type to create a card chain.
  • If no card chain is found, it plays the available cards in the following order of priority: arts cards, buster cards, then quick cards. (I tend to prioritize arts cards in my own play because they charge the characters’ ultimate abilities, called "Noble Phantasms" in FGO, so I had the bot do the same.) This is also the order in which the bot tries to play cards within the different chains, for the same reason.
  • Once that is done, it starts checking for the reappearance of the attack button, and the bot’s turn starts again.
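The card-picking steps above can be sketched in plain Python. This is only an illustration of the stated priorities, not the code from the repo: the function name `pick_cards`, the string labels for card types, and the `character_ids` list (a stand-in for whatever the siamese network's matching produces) are all assumptions.

```python
from collections import Counter

# Priority order taken from the post: arts first, then buster, then quick.
PRIORITY = {"arts": 0, "buster": 1, "quick": 2}


def pick_cards(card_types, character_ids):
    """Return the indices (0-4) of the three cards to play.

    card_types    -- list of 5 strings: "arts", "buster" or "quick".
    character_ids -- list of 5 labels; equal labels mean the same character
                     (hypothetical stand-in for the siamese network's output).
    """
    # Card indices sorted by type priority; ties keep their on-screen order.
    by_priority = sorted(range(len(card_types)), key=lambda i: PRIORITY[card_types[i]])

    # 1. Brave chain: three cards from the same character.
    for character, count in Counter(character_ids).items():
        if count >= 3:
            return [i for i in by_priority if character_ids[i] == character][:3]

    # 2. Card chain: three cards of the same type.
    for card_type in PRIORITY:
        same_type = [i for i, t in enumerate(card_types) if t == card_type]
        if len(same_type) >= 3:
            return same_type[:3]

    # 3. Fallback: the three highest-priority cards.
    return by_priority[:3]


brave = pick_cards(["arts", "quick", "buster", "arts", "quick"], ["a", "a", "b", "c", "a"])
chain = pick_cards(["quick", "arts", "arts", "buster", "arts"], ["a", "b", "c", "b", "a"])
fallback = pick_cards(["quick", "buster", "arts", "quick", "buster"], ["a", "b", "c", "b", "a"])
```

Keeping the decision logic separate from the networks and the screen handling means it can be checked with hand-written inputs like the three calls above.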

So in this process, identifying the attack button and identifying the card types are handled by traditional CNNs, while looking for 3 cards by the same character is handled by a siamese CNN.

I built all of the networks using PyTorch and trained them on either a 1060 or a 1080 Nvidia GPU.

However, before all of that, the first step was getting FGO to run on my PC at all, which was harder than I thought it would be.

Bot identifies that the first, second, and fifth card are all of the same character and so it constructs a chain using those three cards.

Running FGO on a PC

While this was supposed to be nice and straightforward, some googling quickly let me know that at the moment no current emulators supported FGO easily and options like using developer features on Android were not compatible with FGO.

So after a short period of despair, I stumbled onto the idea of using TeamViewer to control my Android device. Since I normally use TeamViewer to access my computers remotely, it seemed like an elegant idea, and it turned out to work quite well. Using the Android TeamViewer Host app, I was able to add my phone to my existing account. The only downside for me is that my phone overheats after extended usage.

Once that was set up, I confirmed that I could start up FGO and tested sending mouse clicks to it via Python’s pyautogui. Thankfully this was successful, so the next step was to take screen captures of the game screen, isolate the sections that the neural networks would be looking at, and get the appropriate xy coordinates to crop those sections out of the images.

The first section to crop out was the "attack" button that appears on the screen at the beginning of the player’s turn. Once it appears, the player presses it and is then allowed to select their action "command cards". As of now, my strategy is to have the bot check for the attack button every second; if it detects the button, it executes the rest of its code.
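A once-per-second check like this can be written as a small polling loop. The sketch below is an assumption about how such a loop might look, not the repo's code: `wait_for_attack_button` and the `is_attack_visible` hook are hypothetical names, and in the real bot the hook would crop the screen capture and run the binary classifier.

```python
import time


def wait_for_attack_button(is_attack_visible, poll_seconds=1.0, max_checks=None, sleep=time.sleep):
    """Poll until the attack button is detected.

    is_attack_visible -- zero-argument callable returning True or False
                         (hypothetical hook around the button classifier).
    max_checks        -- optional cap so the loop can give up.
    sleep             -- injectable so the loop can be exercised without waiting.
    Returns the number of checks made, or None if it gave up.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if is_attack_visible():
            return checks
        sleep(poll_seconds)
    return None


# Simulated detector: the button shows up on the third check.
answers = iter([False, False, True])
checks_needed = wait_for_attack_button(lambda: next(answers), sleep=lambda seconds: None)
```

Passing the detector and the sleep function in as arguments keeps the loop independent of the screen capture, so it can be tried out with a fake detector as shown.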

clipping of the TeamViewer screen that gets fed into my bot for analysis

The next step after clicking the attack button was to identify where the "command cards" would be located on the next screen. I also noticed that the five command cards are always located in the same locations across the screen which meant I could also get by here with cropping out the command cards from the screen and feeding those as input into a network.
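Because the cards sit at fixed positions, the crop regions can be computed once up front. The sketch below assumes the five cards are evenly spaced and uses made-up pixel values; the real coordinates depend on the TeamViewer window size, and `card_boxes` and `box_center` are hypothetical helper names.

```python
def card_boxes(first_left, top, width, height, stride, n_cards=5):
    """Return (left, top, right, bottom) boxes for evenly spaced cards."""
    return [
        (first_left + i * stride, top, first_left + i * stride + width, top + height)
        for i in range(n_cards)
    ]


def box_center(box):
    """Center of a box, usable as the click target for that card."""
    left, top, right, bottom = box
    return ((left + right) // 2, (top + bottom) // 2)


# Made-up pixel values, for illustration only.
boxes = card_boxes(first_left=100, top=400, width=150, height=200, stride=180)
centers = [box_center(box) for box in boxes]
```

Each box is in the (left, upper, right, lower) form that Pillow's `Image.crop` accepts, and each center could be passed to `pyautogui.click(x, y)` to select that card.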

Showing placement of command cards on the screen

Now that I had a method to extract the inputs, I was ready to train the neural networks I would need.

Bot identifies that there are 3 cards of the same color (cards 2, 3, and 5) and selects them to make a card chain.

Leveraging Pretrained Pytorch CNNs

For the two networks, the first step was to acquire image data for the classes that I cared about.

The first was a binary classification problem: detecting the "attack" button. I went through gameplay footage and snipped out a few dozen attack buttons, along with examples where the button was not present, which is most of the time.

For the second classifier I went through game footage and snipped around 60 examples of the three different colors of action cards. I expected that although the faces on the cards are different, the colors are the same, so the network would learn to prioritize the card color.

Now that I had my two small datasets in hand, I was ready to move on to building models. It would be silly to try to train a network from scratch for this, so, as with most image classification problems, one of the first approaches I took was to fine-tune PyTorch’s pretrained networks for my classification problem. The base process for this is laid out in PyTorch’s documentation. Fine-tuning, or using a pretrained network as a feature extractor, lets us leverage networks that have already learned useful weights for image classification, in the hope that these weights will generalize to our specific use case.

For these two models I used variations of ResNet (ResNet-18 and ResNet-50). I could likely have gotten away with two ResNet-18 instances, but decided to see how long the larger ResNet-50 model would take to train on my graphics card. I trained each network for around 5 epochs, and both reached around 95% accuracy on their classification tasks.

The network that determines card type is the ResNet-50, and it forms the backbone of my bot: once I know the card types, I can form card chains, which are probably the most common type of chain to create. See below for examples of the bot identifying and then creating "Arts" chains.

Bot identifies cards 1, 3, and 5 as a chain.

Finding Brave Chains with Siamese Neural Networks

The second type of card chain is created by selecting 3 cards by the same character. So conceptually this is a little more difficult than the first type of card chain. The goal has to be to identify when at least 3 cards are by the same character out of the 5 that have been dealt.

There are a few ways that you could approach this problem. The first would be to build a classic CNN that has learned to identify all of the characters in the game. This is feasible, but gathering the data would be a pain: there are at least 100 characters, and it would be difficult to collect data at the scale needed for a network to converge, or even to fine-tune appropriately. So in this case I figured that I could apply something I talked about in a previous post: siamese neural networks.

With a siamese network I could use a fairly small dataset to train a network to get good at identifying the similarity of faces on different colored cards. It also means that I would be able to fairly heavily augment the dataset since I would now be doing pairwise comparisons.

To do this I leveraged the dataset function from this repository for facial similarity and modified the network structure to include max pooling. I did this partly because my computer would run out of memory, but I also found it improved results: the initial network maintained the original image size even as it got deeper, whereas max pooling lets the network learn more complex image features with additional layers.
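To make the idea concrete, here is a small siamese network with max pooling between convolutions. It is a sketch under assumptions, not the architecture from the repo: the layer sizes, the adaptive average pooling step, the embedding size, and the class name `SiameseNet` are all chosen for illustration.

```python
import torch
import torch.nn as nn


class SiameseNet(nn.Module):
    """Tiny siamese CNN: one shared encoder applied to both images."""

    def __init__(self, embedding_dim=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # halves height and width, cutting memory use
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size output for any input size
        )
        self.head = nn.Linear(16 * 4 * 4, embedding_dim)

    def embed(self, x):
        """Map a batch of images to feature vectors."""
        return self.head(torch.flatten(self.features(x), start_dim=1))

    def forward(self, image_a, image_b):
        # The same weights process both inputs; that sharing is what
        # makes the network "siamese".
        return self.embed(image_a), self.embed(image_b)


torch.manual_seed(0)
net = SiameseNet().eval()
cards = torch.randn(2, 3, 64, 64)  # random stand-ins for card crops
out_a, out_b = net(cards, cards)
```

Each `MaxPool2d(2)` quarters the number of activations the following layer has to hold, which is the memory saving described above.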

The siamese network here is trained to minimize the Euclidean distance between the feature vectors of two input images when they show the same character and to maximize it when they do not. To deploy the model, I test each command card against the other 4 command cards and look for similarity matches. One computational downside is that comparing every card against every other card takes O(N²) comparisons for N cards.
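A common way to express this objective is the contrastive loss, sketched below. The post does not spell out the exact loss, so treat this as an assumed formulation: the label convention (1 means different characters), the margin value, and the function name `contrastive_loss` are illustrative choices.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(emb_a, emb_b, is_different, margin=2.0):
    """Contrastive loss over a batch of embedding pairs.

    is_different -- float tensor, 0.0 for same-character pairs and 1.0 for
                    different-character pairs (assumed label convention).
    margin       -- different pairs are pushed apart until their distance
                    reaches this value (assumed hyperparameter).
    """
    distance = F.pairwise_distance(emb_a, emb_b)  # Euclidean distance per pair
    # Same-character pairs are penalized for being far apart.
    same_term = (1 - is_different) * distance.pow(2)
    # Different-character pairs are penalized only when closer than the margin.
    diff_term = is_different * torch.clamp(margin - distance, min=0).pow(2)
    return (same_term + diff_term).mean()


emb_a = torch.tensor([[0.0, 0.0], [0.0, 0.0]])
emb_b = torch.tensor([[0.1, 0.0], [3.0, 4.0]])
is_different = torch.tensor([0.0, 1.0])
loss = contrastive_loss(emb_a, emb_b, is_different)
```

In this toy batch the first pair is a near-identical "same" pair and the second is a "different" pair already farther apart than the margin, so the loss comes out close to zero.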

See below for an illustration of how this would be done. The image pairs can be fed through the network, and then by taking the Euclidean distance between the two output vectors you get the dissimilarity of the two images as the network sees them (lower scores mean more similar). In this illustration the network does well but leaves room for improvement. I found that a lot of characters end up looking fairly similar in the side-view portraits, so I would likely need to include more examples of these harder comparisons to help strengthen the network.

These are two cards by the same character, which is really the comparison that this network was designed to make. By finding 3 of these matches, the bot can create a brave chain and get an extra attack as a result (this is how the game rewards playing 3 cards by the same character).
Example of a dissimilar pair; a score of 1.17 is quite high.
another more dissimilar pair
An example where the network marks two characters as being similar even though they are different. As discussed above, this is probably the main area for improvement. The network was only trained on around 40 characters with 3 or so cards from each.

While there are improvements to be made, the siamese network does perform well in the field. See below for some examples of it cycling through the 5 cards and comparing each card against all the others. When it finds a triplet that triplet is then fed to the bot and the bot selects the appropriate cards to create the chain.
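The triplet search can be sketched without any network code by treating the siamese model as a function that returns a dissimilarity score for a pair of card positions. This is an illustration rather than the repo's implementation: `find_brave_chain`, the `distance` callable, and the threshold value are all assumptions.

```python
from itertools import combinations


def find_brave_chain(distance, n_cards=5, threshold=0.5):
    """Return three card indices that all match each other, or None.

    distance  -- callable (i, j) -> dissimilarity score for cards i and j
                 (hypothetical wrapper around the siamese network).
    threshold -- scores below this count as "same character" (assumed value).
    """
    # Score every unordered pair once.
    similar = {
        pair for pair in combinations(range(n_cards), 2)
        if distance(*pair) < threshold
    }
    # A brave chain is a triple in which all three pairs match.
    for triple in combinations(range(n_cards), 3):
        if all(pair in similar for pair in combinations(triple, 2)):
            return triple
    return None


# Toy scores: cards 0, 2 and 4 sit close together on a 1-D "embedding".
positions = [0.0, 5.0, 0.1, 9.0, 0.2]
triple = find_brave_chain(lambda i, j: abs(positions[i] - positions[j]))
```

For five cards this scores 10 pairs, and requiring all three pairs in a triple to match guards against chaining cards through a single false match.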

Bot uses the siamese network to identify cards 3, 4, and 5 as all being the same character, so it constructs a brave chain.

Closing Thoughts

This bot is a good start, and it can play FGO fairly competently in the sense that it can clear the high-level missions that I put it through. However, it does not make use of many additional game mechanics that could increase its effectiveness.

  • Intelligent ways to use the characters’ ultimate abilities, called "Noble Phantasms" (NPs) in FGO. As of now I have the bot just try to use the characters’ NPs after a certain number of turns, with the idea that they can then be used against the boss of that battle or against harder enemies rather than wasted on earlier, weaker enemies.
  • Make use of the class affinity system in FGO. This is essentially a rock-paper-scissors setup where every character has a given class with strengths and weaknesses against other classes. Using it to your advantage gives you a damage bonus, while a disadvantageous pairing is a detriment.
  • Let the bot use character "skills". In FGO, character skills are quite helpful when used at the correct time to boost damage, heal allies, debuff enemies, etc. The bot currently does not make use of these and thus misses out on a lot.

A next stage for the bot could be an application of reinforcement learning, but it is made difficult by the fact that FGO has an action point system in place: every battle costs X action points, and when you are out of them you have to let them recharge… which means I would have a hard time letting the bot explore for thousands of iterations. The other option would be to build a simulated FGO environment for it to explore, which is possible but would be quite time consuming.

Edit: As a follow-up post, I built out the reinforcement learning bot for FGO that I mentioned above, and you can check it out here. I built a custom game environment, let the bot learn by playing thousands of games, and found it learned some cool behavior!

Here is a link to the GitHub repo.

