DaiLE the Self-Driving Racecar

Josh Cardosi
Towards Data Science
7 min read · Jan 25, 2021


Photo by MiRo Am on Unsplash

Overview

  • Introduction
  • What is DaiLE?
  • Challenges
  • What’s Next?
  • Closing Thoughts
  • About Me

Introduction

In the words of Ayrton Senna, “If you no longer go for a gap that exists, you are no longer a racing driver.” As a machine learning engineer, occasional racing fan, and huge nerd, I’ve identified a very interesting gap: there are shockingly few implementations of virtual self-driving racecars. I’ve found a couple of videos and brief articles about it, but not nearly the number I expected when I first dreamed of building my own.

It makes sense why, though. Race tracks are very dynamic environments, and the act of racing itself, even when you are not competing with other drivers, is anything but simple. I myself, a human man and not a robot, am not particularly quick, despite having a couple hundred hours in driving simulators. But it seemed like a fun project that, even if it failed miserably, would be a chance to learn some new technologies and perhaps be a starting point for someone else’s awesome self-driving racecar idea.

Enter my attempt to go for the gap: DaiLE.

What is DaiLE?

Put simply, DaiLE (named after Dale Earnhardt) is a PyTorch model that can play the racing simulator Assetto Corsa (AC) using controller inputs. He looks at the past 60 frames of gameplay, analyzing the images on screen and corresponding telemetry data, and decides whether to turn, accelerate, or brake. While he is only trained on several hours of my own driving data, he frequently reacts to situations he’s seen before in novel ways. To demonstrate this, you can see him trying out a favorite Mario Kart track of mine, Baby Park.

DaiLE trying Baby Park

The less simplified version is that DaiLE is a Convolutional Recurrent Neural Network (CRNN), with ResNet18 used as the image encoder, a 2-layer LSTM which processes encoded images and telemetry data, and an output layer which decides what controller inputs should be used on the next frame. DaiLE then sends these inputs to vJoy using pyvjoy bindings for Python, which goes through Steam’s Big Picture controller settings and into AC.
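
To make that more concrete, here is a simplified sketch of what a model along those lines can look like; the layer sizes, telemetry feature count, and the three-channel output (steer, throttle, brake) are illustrative choices, not the exact settings in my repo:

```python
import torch
import torch.nn as nn
from torchvision import models

class DaiLENet(nn.Module):
    """Rough CRNN sketch: ResNet18 encodes each frame, a 2-layer LSTM reads the
    sequence of (image features + telemetry), and a linear head predicts the
    controller inputs for the next frame."""

    def __init__(self, n_telemetry=8, hidden_size=256, n_outputs=3):
        super().__init__()
        resnet = models.resnet18(pretrained=True)
        # Drop the classification layer; keep the 512-dim feature extractor.
        self.encoder = nn.Sequential(*list(resnet.children())[:-1])
        self.lstm = nn.LSTM(512 + n_telemetry, hidden_size,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_size, n_outputs)  # steer, throttle, brake

    def forward(self, frames, telemetry):
        # frames:    (batch, seq_len, 3, H, W)
        # telemetry: (batch, seq_len, n_telemetry)
        b, t, c, h, w = frames.shape
        feats = self.encoder(frames.view(b * t, c, h, w)).view(b, t, -1)
        seq = torch.cat([feats, telemetry], dim=-1)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])  # prediction for the final frame only
```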

Of course, before DaiLE is capable of making decent decisions, he must be trained. I’ve created two custom tools for this purpose:

  1. play_together.py: a data collection routine that allows for control of the car to be passed between DaiLE and a human player (inspired by sethbling’s MariFlow).
  2. record_data.py: an (optionally asynchronous) data acquisition (DAQ) system that allows for a custom sampling rate. Synchronous data collection blocked DaiLE from making inferences during the play_together routine, resulting in much poorer performance; capturing data in a non-blocking way proved to be a direct improvement in nearly every respect (a rough sketch of the sampling loop follows below).
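
For a flavor of what fixed-rate sampling looks like, here is a toy version of the loop; the capture and controller-reading helpers are stand-ins for whatever DAQ calls you would actually use, not the real functions in record_data.py:

```python
import time

TARGET_HZ = 15  # DaiLE's training data was captured at 15 frames per second

def record_loop(grab_frame, read_controller, stop_flag, hz=TARGET_HZ):
    """Toy fixed-rate sampling loop: grab_frame and read_controller are
    placeholder callables, stop_flag is a threading.Event."""
    period = 1.0 / hz
    samples = []
    next_tick = time.perf_counter()
    while not stop_flag.is_set():
        samples.append((time.perf_counter(), grab_frame(), read_controller()))
        next_tick += period
        # Sleep only for the time remaining in this tick so the rate stays steady.
        time.sleep(max(0.0, next_tick - time.perf_counter()))
    return samples
```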

DaiLE, at this point in time, is trained in a purely supervised manner; he is shown random batches of 60-frame sequences (spaced 1/15 of a second apart) and asked to predict what input should be used for frame 60. Accordingly, he only shows signs of intelligence on tracks he has seen during training. One might argue that this means DaiLE is simply overfitting to the tracks he’s seen before, but I don’t believe that’s the case: the space of possible on-track situations is enormous, and he shows novel behavior even in situations very similar to ones he should be familiar with.
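
In simplified PyTorch terms, that supervised setup looks roughly like the following (tensor shapes and the storage format are illustrative, not how the data is actually stored on disk):

```python
import torch
from torch.utils.data import Dataset

class DrivingSequenceDataset(Dataset):
    """Toy version of the supervised setup: return a 60-frame window of
    (images, telemetry) and the controller input recorded at the last frame."""

    def __init__(self, frames, telemetry, controls, seq_len=60):
        # frames:    (N, 3, H, W) tensor of screenshots sampled at 15 fps
        # telemetry: (N, n_telemetry) tensor
        # controls:  (N, 3) tensor of steer / throttle / brake
        self.frames, self.telemetry, self.controls = frames, telemetry, controls
        self.seq_len = seq_len

    def __len__(self):
        return len(self.frames) - self.seq_len + 1

    def __getitem__(self, idx):
        end = idx + self.seq_len
        return (self.frames[idx:end],
                self.telemetry[idx:end],
                self.controls[end - 1])  # target: the input used at frame 60
```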

Once DaiLE has been trained, it’s time to test him on track. I’ve created a DaiLE class that allows for instantiation of a specific version of the model (in case you have trained several different ones) and makes on-track, real-time inference painless. The inference runtime uses an OpenCV window to show the user DaiLE’s inputs at all times.
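
A stripped-down version of that inference loop might look like the code below; the capture, telemetry, and vJoy helpers are placeholders for what the real DaiLE class wires up, and the HUD is just a toy stand-in for the OpenCV overlay:

```python
import cv2
import numpy as np
import torch

def run_on_track(model, grab_frame, read_telemetry, send_to_vjoy, history):
    """Skeleton real-time inference loop; the helper callables are hypothetical."""
    model.eval()
    with torch.no_grad():
        while True:
            history.append((grab_frame(), read_telemetry()))
            history = history[-60:]  # keep only the last 60 frames
            frames = torch.stack([f for f, _ in history]).unsqueeze(0)
            telem = torch.stack([t for _, t in history]).unsqueeze(0)
            steer, throttle, brake = model(frames, telem).squeeze(0).tolist()
            send_to_vjoy(steer, throttle, brake)

            # Simple HUD so the user can watch DaiLE's inputs in real time.
            hud = np.zeros((120, 420, 3), dtype=np.uint8)
            cv2.putText(hud, f"steer {steer:+.2f}  thr {throttle:.2f}  brk {brake:.2f}",
                        (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 1)
            cv2.imshow("DaiLE", hud)
            if cv2.waitKey(1) == ord("q"):
                break
    cv2.destroyAllWindows()
```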

DaiLE making a pass at the final and first corners of Red Bull Ring

Challenges

There have been several key challenges throughout DaiLE’s development, most of them to do with performance. Getting a functional prototype ready was fairly straightforward, but optimization took up far and away the majority of my time on this project. As a college student with a five-year-old gaming desktop (nice for the time I bought it) and no money to rent GPUs online, I had to work within tight constraints.

Framing the problem is difficult.

Should this be a classification or a regression problem? There are many ways you could approach this, but ultimately I chose to frame it as a regression task. I’m aware that you could just as easily bin the different steering, braking, and acceleration values and frame it as a classification problem, but regression seemed more natural to me. Which is better overall? While I have the code implemented to train and test either framing, I unfortunately lack the resources to effectively study both.
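
For reference, the two framings really only differ in the output head and loss. The sketch below shows both; the hidden size and bin count are just examples, not settings from the project:

```python
import torch.nn as nn

hidden_size = 256

# Regression framing: predict continuous steer / throttle / brake values.
regression_head = nn.Linear(hidden_size, 3)
regression_loss = nn.MSELoss()

# Classification framing: bin each control into, say, 21 discrete levels and
# predict a class per control (reshape the logits to (batch, n_bins, 3)).
n_bins = 21
classification_head = nn.Linear(hidden_size, 3 * n_bins)
classification_loss = nn.CrossEntropyLoss()
```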

Real-time data acquisition is complicated.

I found that creating a consistent stream of data is quite difficult in the presence of a resource-heavy simulation like Assetto Corsa. My initial data sampling technique worked flawlessly when tested in a vacuum: it captured screenshots and my controller inputs without any delay or errors when the game was not running. But it completely fell apart when it was time to actually acquire data for this project. Controller inputs were particularly troublesome; a quick survey of the data after a test run revealed that the recorded controller inputs did not reflect my true inputs.

The culprit turned out to be the way I was capturing screenshots. Every time I grabbed the screen, everything else in my data acquisition script stopped and waited until it was done. I addressed this by making a data_recorder class that asynchronously captures all the data it needs, meaning it no longer blocks the rest of the data acquisition process.
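
The core of the fix is simple: move the screen grab onto its own thread and hand frames back through a queue so the main loop never waits. A minimal sketch of the idea, with the actual screenshot call abstracted away:

```python
import threading
import queue

class AsyncRecorder:
    """Minimal take on the non-blocking idea: screenshots are grabbed on a
    background thread and pushed to a queue."""

    def __init__(self, capture_screen, hz=15):
        # capture_screen is a placeholder for whatever screenshot call is used.
        self.capture_screen = capture_screen
        self.period = 1.0 / hz
        self.frames = queue.Queue()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._worker, daemon=True)

    def _worker(self):
        while not self._stop.is_set():
            self.frames.put(self.capture_screen())
            self._stop.wait(self.period)  # doubles as an interruptible sleep

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```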

Training on video is time-consuming.

Video data is notoriously expensive to train on. The medium is data-rich, cumbersome to pre-process (in comparison to tabular data), and chews up memory very quickly. In addition, creating a PyTorch DataLoader is more complicated for this format than for the tabular data I typically work with. I felt the effects of all these things quickly: as my data set grew, the training time ballooned with it. I worked to optimize this process as much as possible, but I was eventually forced to start training DaiLE overnight. I have ideas for making training more efficient than it currently is, but have not had time to implement them yet.
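
Once a sequence dataset like the one sketched earlier exists, the DataLoader and training step are standard PyTorch; the batch size, worker count, and optimizer settings below are illustrative guesses rather than the project’s actual configuration:

```python
import torch
from torch.utils.data import DataLoader

# dataset and model refer to the sketches above, not the repo's exact classes.
loader = DataLoader(dataset, batch_size=8, shuffle=True,
                    num_workers=4, pin_memory=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()  # regression framing

for frames, telemetry, target in loader:
    optimizer.zero_grad()
    loss = criterion(model(frames, telemetry), target)
    loss.backward()
    optimizer.step()
```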

Real-time inference is demanding.

My computer needed to run Assetto Corsa and DaiLE simultaneously at high frame rates. Because DaiLE was trained on data captured at 15 frames per second, any significant dip below that rate at inference time results in erratic behavior. At the same time, I found that training DaiLE on data captured at less than 15 frames per second also resulted in erratic behavior, regardless of Assetto Corsa’s frame rate at inference time.

What’s Next?

Supervised DaiLE

I am far from done with DaiLE. Even in his current supervised form, I feel there is a lot to optimize and improve. I’d like to collect significantly more training data and truly test the limits of what the architecture I’ve designed can learn. DaiLE has demonstrated glimpses of good decision-making, even with the rather limited data set I have collected so far, and I feel that with more data he will improve further.

Unsupervised DaiLE

However, once I’ve exhausted my ideas for this iteration of the project, I want to move on to a reinforcement learning approach. I’ve seen incredible results from sethbling’s MarIQ and Yosh’s Trackmania AI. I know this stage will require even heavier compute than supervised DaiLE, but I am hoping to have access to better training resources by the time I am ready to take things to the next level!

Closing Thoughts

Overall, DaiLE has been an exciting test of my skills, and this project incorporates many of the things I enjoy: machine learning, software engineering, racing, and project management. It’s forced me to design and redesign and re-redesign systems to make them consistent, modular, and efficient. At the same time, it’s been a joy to work on; seeing DaiLE drive successfully for the first time felt like magic. I hope to spend a lot more time on projects like these.

Thank you for taking the time to read this article! If you have any questions about this project, please drop a message in the comments or contact me on LinkedIn. You can find the freely distributable and modifiable code for my work here in my DaiLE GitHub repo.

About Me

A year and a half ago, at the end of a long internship with Honda, I decided to completely switch my career path from design-focused mechanical engineering to machine learning engineering. I’ve always been passionate about computer science, despite not studying it formally, and have been coding for years, so the switch was more a change in focus than a total paradigm shift. Still, I needed to dedicate some serious self-study time to machine learning. I have poured many weekends and weeknights into courses on Udacity and Udemy, ranging from Computer Vision to Web Design (since many ML products are web-based), which led to my first internship as an ML researcher at a local hospital.

I’ve since begun my master’s degree, submitted my first paper for publication, and am now formally taking courses in computer science at The Ohio State University. With DaiLE, I’ve created my first openly available ML project from scratch (i.e., no pre-made project outlines or hand-holding guides).
