Diving into the world of self-driving cars

Michael Virgo
Towards Data Science
10 min read · Apr 16, 2017


A year ago, I left my job at one of the Big Four accounting firms and moved to the Bay Area to start shifting towards the industry I really wanted to be a part of: technology. In the last few months of my job, I had made a list of all the potential tech areas I wanted to get involved in, and at the top of the list was self-driving cars. Of course, since I only had a background in accounting at the time, this seemed like a far-away goal.

I still stuck with an accounting job when I moved to the Bay, but I specifically chose a firm with a greater emphasis on work-life balance, since I knew that learning programming was going to take up a lot of time each week. But where to start? At first, not yet knowing self-driving cars would definitely be my target shortly thereafter, I actually began with some basic web design (think HTML and CSS) through Codecademy. Then, I picked up a book on Ruby, which got me further into writing real programs.

Finding Udacity

It was at this point in mid-2016 that I began reading more and more articles on self-driving cars, while continuing to believe that was years down the road for me. I saw that the Python programming language was very popular in the field, and soon read that Udacity had a great programming class for learning Python (“Intro to Computer Science” for those interested — I highly recommend it if you want to learn Python basics; you’ll create a basic search engine like Google’s!). As I neared the end of this course, Udacity announced one of the most exciting things I could think of, given my interests: a Self-Driving Car Nanodegree program.

For those who have never heard of Udacity’s Nanodegree programs, they teach you the skills necessary for jobs in a given field and include projects that can help bolster your portfolio — many recruiters look at an individual’s GitHub repositories, for instance, as evidence of their abilities. These Nanodegree programs cover such areas as Data Analysis, Machine Learning, iOS and Android development, and much more. The programs also include various career workshops to help improve your resume, interviewing skills, etc.

Machine Learning

As excited as I was to have a Nanodegree program entirely focused on getting someone ready to be a Self-Driving Car Engineer, I was definitely not yet ready for it. Luckily, Udacity was fairly open about the courses and Nanodegrees that would be important preparation for the program, which was to have its first cohort of students only a couple months after the announcement, in October 2016. One of these was the Machine Learning Nanodegree. Although I was likely a bit in over my head when I first started it, that program truly helped me with the eventual first term of the SDC Nanodegree, as it was largely focused on various machine learning techniques.

The Machine Learning Nanodegree has some great projects if you are interested in the field. I used supervised learning to predict such things as whether a certain individual would survive on the Titanic or needed additional assistance at school, unsupervised learning to divide various customer segments into groups, and reinforcement learning to teach a smartcab to arrive at its destinations both safely and efficiently. There are a ton of resources out there that can explain those techniques better than I have the space to do here. The MLND ends with a Capstone Project, focusing on a deeper look into areas such as Deep Learning — I am still finishing mine up in tandem with the SDCND, but if you’d like a preview, see my proposal or my in-progress Capstone repository. I am taking a Deep Learning-based approach to detecting road lanes.
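To give a flavor of the supervised learning side, here is a minimal scikit-learn sketch in the spirit of the Titanic project (not my actual project code; the toy data below is invented purely for illustration):

```python
# A toy supervised learning example in the spirit of the Titanic project.
# The data below is invented for illustration, not the real dataset.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Each row: [passenger class, age, fare]; label: 1 = survived, 0 = did not.
X = [[1, 29, 100.0], [3, 40, 7.3], [2, 4, 30.0], [3, 25, 8.1],
     [1, 58, 120.0], [2, 34, 26.0], [3, 19, 7.9], [1, 45, 80.0]]
y = [1, 0, 1, 0, 1, 1, 0, 1]

# Hold out a quarter of the data to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
model = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))
```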

The Self-Driving Car Nanodegree

Unlike the Machine Learning Nanodegree, Udacity decided to limit the number of incoming students for the Self-Driving Car Nanodegree. The first cohort, in October 2016, was to accept only around 500 students (which I believe was actually an increase from their initial number). Thousands applied, including me, although I was still only a few projects into the MLND. I was accepted (!) — but not for October. I would be joining the third cohort, beginning in December. This actually worked out perfectly, as I finished the final pre-Capstone project of the MLND just as the December cohort was set to begin. I was incredibly excited, while also worried I was going to be getting in over my head. Not only was I truly getting into some advanced material, but my accounting busy season was coming (January through April are not very fun in the accounting profession).

Project 1

And so, I dove right in, hoping to get ahead of Udacity’s schedule for my cohort. In the first project, we learned how to use various computer vision techniques, including canny-edge detection, masking, and Hough transformation, in order to detect lane lines on a road. Although my project looked great on straight road lines, the method I used was too simple to work on curved lines, something I would soon learn to correct.
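For anyone curious, a condensed sketch of that kind of pipeline in OpenCV might look like the following; the parameter values are illustrative guesses rather than the ones from my submission:

```python
# A condensed lane-finding sketch: Canny edges, a region mask, then a
# Hough transform to pick out line segments. Parameters are illustrative.
import cv2
import numpy as np

def detect_lane_lines(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)  # Canny edge detection

    # Mask everything outside a trapezoid covering the road ahead.
    h, w = edges.shape
    region = np.array([[(0, h), (w // 2 - 50, h // 2 + 50),
                        (w // 2 + 50, h // 2 + 50), (w, h)]], dtype=np.int32)
    mask = np.zeros_like(edges)
    cv2.fillPoly(mask, region, 255)
    masked = cv2.bitwise_and(edges, mask)

    # Hough transform: find line segments among the remaining edge pixels.
    lines = cv2.HoughLinesP(masked, 2, np.pi / 180, 20,
                            minLineLength=40, maxLineGap=100)
    output = image.copy()
    if lines is not None:
        for x1, y1, x2, y2 in lines.reshape(-1, 4):
            cv2.line(output, (x1, y1), (x2, y2), (0, 0, 255), 5)
    return output
```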

[Image: Project 2’s unbalanced traffic sign data]

The next project, Classifying Traffic Signs, was my first true exposure to building a Deep Learning model. Here, I learned how to use TensorFlow to create a deep neural network capable, after training, of classifying road signs with high accuracy. An interesting challenge for this project was that the data given to us was highly imbalanced: certain signs had thousands of training images while others had only a couple hundred. This presents a problem for neural networks — given that they attempt to minimize loss, an imbalanced training set could mean the network learns to minimize loss by simply always getting certain signs wrong. So, if you do not check your data, you could see your model achieve 90% validation accuracy (fairly robust at face value) while still getting 100% wrong a few signs that each made up only a couple percent of the training images. By augmenting images from the original training data, but only for the sign classes with below-average image counts, I created supplementary traffic sign images to balance out the dataset, and ended up with a model that was far more accurate on images it had never seen.
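Here is a sketch of that balancing idea, assuming the training set is already loaded as NumPy arrays; the jitter function is a simple stand-in for the richer augmentations (rotations, translations, brightness shifts) you might actually use:

```python
# Augment only the sign classes with below-average image counts, so the
# network cannot minimize loss by ignoring rare signs.
import numpy as np

def jitter(image):
    # Placeholder augmentation: a small random brightness shift.
    shift = np.random.randint(-25, 26)
    return np.clip(image.astype(np.int16) + shift, 0, 255).astype(np.uint8)

def balance_dataset(X_train, y_train):
    classes, counts = np.unique(y_train, return_counts=True)
    target = int(counts.mean())  # bring sparse classes up toward the average
    new_images, new_labels = [], []
    for cls, count in zip(classes, counts):
        if count >= target:
            continue  # only augment below-average classes
        idx = np.where(y_train == cls)[0]
        for i in np.random.choice(idx, target - count):
            new_images.append(jitter(X_train[i]))
            new_labels.append(cls)
    if not new_images:
        return X_train, y_train
    return (np.concatenate([X_train, np.array(new_images)]),
            np.concatenate([y_train, np.array(new_labels)]))
```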

[Image: Smooth sailing in Project 3]

The third project of the first term upped the coolness factor, especially because Udacity made a simulator specifically designed for it. This project, using the Keras deep learning framework (which is built on top of TensorFlow), centered on the concept of Behavioral Cloning. With Behavioral Cloning, again using Deep Learning, you teach a certain behavior to a deep neural network. In this case, the network was fed images from the simulated car’s cameras (similar to the image above), with each image labeled with the car’s steering angle — the value the network minimizes loss against and therefore learns to predict.

Although Udacity eventually released some of their own training data, I chose to collect my own, which created some interesting issues. On the track, the steering angle is zero degrees most of the time, as you are driving straight. As you may have guessed from what I mentioned for Project 2 above, this can create problems with unbalanced data, so I had to be careful to collect sufficient data with non-zero steering angles, or else the car might just always drive straight. Another issue is that data cannot just be collected from the center of the lane — otherwise, when the car inevitably finds itself slightly off to the side, it will probably default back to its most likely outcome: driving straight. Again, I had to make sure I had sufficient recovery data taken from the sides, including very sharp turn angles from odd positions nearly off the side of the road. Perhaps the most rewarding experience of all of Term 1 was watching my simulated car, driving based on my trained neural network, make its way around the entire track on its own.
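As a rough illustration, a minimal behavioral cloning model in Keras could look like the sketch below. This is a simplified stand-in rather than my actual architecture; the key point is that the network regresses a single steering angle from each camera image:

```python
# A minimal behavioral cloning model: a small convolutional network that
# regresses one steering angle per camera image. Simplified for illustration.
from keras.models import Sequential
from keras.layers import Lambda, Conv2D, Flatten, Dense

model = Sequential([
    # Normalize pixel values to roughly [-0.5, 0.5].
    Lambda(lambda x: x / 255.0 - 0.5, input_shape=(160, 320, 3)),
    Conv2D(24, (5, 5), strides=(2, 2), activation='relu'),
    Conv2D(36, (5, 5), strides=(2, 2), activation='relu'),
    Conv2D(48, (5, 5), strides=(2, 2), activation='relu'),
    Flatten(),
    Dense(100, activation='relu'),
    Dense(1)  # single output: the predicted steering angle
])
# Mean squared error, since this is regression rather than classification.
model.compile(optimizer='adam', loss='mse')
# model.fit(images, steering_angles, validation_split=0.2, epochs=5)
```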

Luckily, I had completed the first three projects in the first month or so of Term 1 — I had accomplished my goal of getting out in front before busy season! It then took me nearly the full remaining two months of the term to complete the last two projects.

Project 4 — Advanced Lane Detection

Remember the issues with curved lines in Project 1? Project 4, Advanced Lane Lines, presented a more in-depth technique. Again using computer vision, this time I learned how to use different gradient and color thresholds to create binary-activated images, where only specific areas of the road video image would be left activated. Then, using techniques to undistort the image (all cameras naturally produce a certain amount of distortion) and perspective transform it (think a bird’s eye view of the road), my model would calculate a polynomial function fitting each lane line. Given these polynomial functions, the detected lane — along with information like the road’s curvature and where in the lane the car sat with respect to center — can then be drawn back onto the original image, as shown above. See my final project video with lane detection here.
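A compressed sketch of those steps follows; the threshold values and the source/destination points for the perspective transform are illustrative only, mtx and dist are assumed to come from a prior cv2.calibrateCamera call, and splitting pixels at the image midline is far cruder than the sliding-window search a real pipeline would use:

```python
# Undistort -> color threshold -> bird's-eye warp -> polynomial fit.
# src and dst must be float32 arrays of 4 (x, y) points each.
import cv2
import numpy as np

def find_lane_polynomials(image, mtx, dist, src, dst):
    undistorted = cv2.undistort(image, mtx, dist, None, mtx)

    # Binary activation: keep pixels passing a simple S-channel threshold.
    hls = cv2.cvtColor(undistorted, cv2.COLOR_BGR2HLS)
    s_channel = hls[:, :, 2]
    binary = np.zeros_like(s_channel)
    binary[(s_channel > 170) & (s_channel <= 255)] = 1

    # Perspective transform to a bird's-eye view of the road.
    h, w = binary.shape
    M = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(binary, M, (w, h))

    # Fit a second-order polynomial (x as a function of y) to the
    # activated pixels in each half of the warped image.
    ys, xs = np.nonzero(warped)
    left, right = xs < w // 2, xs >= w // 2
    left_fit = np.polyfit(ys[left], xs[left], 2)
    right_fit = np.polyfit(ys[right], xs[right], 2)
    return left_fit, right_fit
```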

[Image: Car detection heatmaps for Project 5]

The final project of Term 1 focused on Vehicle Detection. My knowledge from the Machine Learning Nanodegree came in handy here, as we used a Support Vector Machine to help determine where in an image a car appeared — I had already used SVMs multiple times before! Of course, there were still some important concepts to learn. Histogram of Oriented Gradients (HOG) features helped train the SVM by capturing the differences in gradients (changes in pixel values across an image, to simplify somewhat) between images with a car and images without one. Next, the trained SVM was run across the given road image — but its detections were not 100% accurate. As such, I needed to remove false positives, while also accounting for potential misses caused by how far away in the image a car might appear. The heatmaps above show part of the solution: by discarding spots with fewer than a certain number of detections, the above heatmaps were produced and then labelled with bounding boxes to show the true vehicles detected. See my finished product here.
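Here is a sketch of those ingredients: HOG features feeding a linear SVM, then a thresholded heatmap labelled into bounding boxes. The sliding-window search and feature scaling are omitted for brevity, and none of this is my exact project code:

```python
# HOG + linear SVM for car/not-car classification, plus heatmap filtering
# of accumulated detections to suppress false positives.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC
from scipy.ndimage import label

def hog_features(gray_image):
    # Gradient-orientation histograms summarizing local shape.
    return hog(gray_image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def train_classifier(car_grays, notcar_grays):
    # car_grays / notcar_grays: lists of same-sized grayscale images.
    X = np.array([hog_features(img) for img in car_grays + notcar_grays])
    y = np.array([1] * len(car_grays) + [0] * len(notcar_grays))
    return LinearSVC().fit(X, y)

def boxes_from_heatmap(heatmap, threshold=2):
    # Zero out weak detections, then label the surviving hot regions.
    heat = heatmap.copy()
    heat[heat <= threshold] = 0
    labelled, n = label(heat)
    boxes = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labelled == i)
        boxes.append(((xs.min(), ys.min()), (xs.max(), ys.max())))
    return boxes
```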

What’s Next?

And so, I completed Term 1 of the Self-Driving Car Nanodegree program, learning skills that just a year ago I would not have dreamed of having for years to come. If anyone is really interested in the area, I highly encourage them to apply (the cohort sizes have been continuing to increase). It does take a lot of time — I am still amazed that I managed to get it done before the official end of the term, given the commitments I had with my regular job. Most people will likely need 20 hours a week to do it. It is certainly less expensive than a college education, although it will still set you back $800 per term (of which there are three).

But where do I go from here? My first goal is to get a job in the field. There are lots of jobs out there in this quickly expanding field, with established carmakers, large technology companies, and the startups that are so prevalent here in the Bay Area. Most of these, of course, still require multiple years of experience in areas like robotics or deep learning. So while I still lack the years of experience many companies want, I’ll keep working hard to hopefully make a model that can blow someone away. Here’s a preview of where I’m at on my deep-learning-based model for lane detection, which can skip most of the computer vision techniques I learned above; I’m hoping that’s a start. Luckily, in Silicon Valley many people are more focused on what you can do than simply how many years you’ve been doing it. For Udacity’s part, they’ve provided me with a mentor, lots of career content, and access to events with some great hiring partners, all of which gives me great hope that I’ll be able to make the jump to working directly on self-driving cars.

For now though, I am getting into Term 2 of the SDCND. Where Term 1 was Computer Vision and Deep Learning, Term 2 is about Sensor Fusion (using radar and lidar data to track objects around you), Localization, and Control. I’m expecting a tough term, given my lack of experience in each of these areas, but I’m also ecstatic to move another step along the way. At the end of the day, while my first goal is getting a job, the true end goal is to actually see an autonomous vehicle on the road, fully available to all who want to ride. That’s the real next step.
