CarND Project 1: Lane Lines Detection — A Complete Pipeline
Some of you may know that I’ve recently started Udacity’s Self-Driving Car Engineer Nanodegree. This is an amazing and unique program, which consists of a number of projects on major aspects of autonomous driving. Here’s my solution to the first project: Lane Lines Detection. First of all, watch the video below — I’m a big believer in getting inspired first and going into details afterwards.
Now that you've seen how cool this project is, let's dig into the details of the implementation.
The goal of this project is to build up a simple image pipeline (take a frame from video as an input, do something, return a modified version of the frame), which allows detecting lane lines in simple conditions: sunny weather, good visibility, no cars in sight, only straight lanes. One more thing: our lane line detector should be linear. No polynomials are allowed yet!
Of course, this is a toy project and it is not intended for production use, but it does help to get some intuition for the problems which self-driving car engineers solve.
My achievements:
— Completely dropped hardcoded region of interest (except for the first frame when we have to initialize the lane).
— The code works well on all Project 1 and Project 4 (advanced lane lines) videos (except for the last harder challenge) without any tuning. Mountain roads are out of scope of this project anyway.
Image Pipeline explained
Let me just copy and paste the signature and docstring of the image_pipeline function from the code:
The pipeline takes a single 3-channel RGB image, filters and transforms it, updates the internal state of the Lane and Line objects, and draws all the required elements on top of this image. We can visualize the whole pipeline this way:
We may notice that Stage 1 and Stage 2 are independent of each other. We can (and should) abstract away the details of each stage to be able to change the code in a modular way.
Stage 1: Preprocessing and vectorization
The first stage is well-known to data scientists and anyone who works with raw data: we need to preprocess it, munge it and turn it into a working dataset using any sort of vectorization procedure we find appropriate. The following code transforms the raw image into a vectorized dataset we can work with in Stage 2:
Let’s briefly discuss the code. Our project works on top of OpenCV, a terrific library for image manipulation on the pixel level using matrix operations. Since the quality of our pipeline depends on correct color selection, we need a way to efficiently select a range of similar colors (white and yellow in our case). The standard RGB color space is not suitable for this, so we have to convert our image to HSV first.
Next, we skip the recommended grayscaling and Gaussian blur phases. I found that we can go straight to binarization: combining the binary mask with a blurred image just adds unnecessary noise to our vectorizer. We select the yellow and white color ranges and get a binary mask of our frame:
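In OpenCV this is a cvtColor to HSV followed by two inRange calls combined with a bitwise OR. Here is a NumPy-only sketch of the masking step; the HSV bounds below are illustrative guesses, not the exact values from my code:

```python
import numpy as np

# Hypothetical HSV ranges for yellow and white; plausible values
# for illustration only, the tuned bounds in my code differ.
YELLOW_LO, YELLOW_HI = np.array([15, 80, 120]), np.array([35, 255, 255])
WHITE_LO, WHITE_HI = np.array([0, 0, 200]), np.array([180, 40, 255])

def in_range(hsv, lo, hi):
    """NumPy equivalent of cv2.inRange: 255 where all channels fall in [lo, hi]."""
    mask = np.all((hsv >= lo) & (hsv <= hi), axis=-1)
    return (mask * 255).astype(np.uint8)

def binarize(hsv):
    """Combine the yellow and white masks into a single binary image."""
    return in_range(hsv, YELLOW_LO, YELLOW_HI) | in_range(hsv, WHITE_LO, WHITE_HI)

# A tiny 1x2 "image": one yellow-ish pixel and one dark pixel.
hsv = np.array([[[25, 200, 200], [100, 50, 10]]], dtype=np.uint8)
print(binarize(hsv))  # [[255   0]]
```

The yellow-ish pixel survives the mask, the dark one does not, which is exactly the behavior we want from the binarizer.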
Finally, we are ready to vectorize this image. We apply two transformations:
- Canny edge detection: an algorithm which calculates the intensity gradients of the image and applies a double threshold to determine edges (we use (280, 360) as the thresholds in the canny function).
- Hough lines transform: once we have single edges from Canny detection, we can connect them using lines. We are not going to dig deeper into the details of this algorithm (check out the link above if you are curious, and here is another one), but the main takeaway is that we get an array of lines (each a Line instance with calculated slope, intercept, and more).
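In OpenCV terms, this stage boils down to cv2.Canny followed by cv2.HoughLinesP, which returns segments as (x1, y1, x2, y2) endpoints. A minimal sketch of how a Line object might derive its slope and intercept from such a segment (the class shape is illustrative, not my exact implementation):

```python
class Line:
    """A Hough segment with its slope and intercept (y = slope * x + intercept)."""
    def __init__(self, x1, y1, x2, y2):
        self.x1, self.y1, self.x2, self.y2 = x1, y1, x2, y2
        # Perfectly vertical segments would need special handling;
        # they are rare among lane line candidates in this setup.
        self.slope = (y2 - y1) / (x2 - x1)
        self.intercept = y1 - self.slope * x1

# A segment running from the bottom-left toward the image center.
line = Line(100, 540, 400, 330)
print(line.slope, line.intercept)  # -0.7 610.0
```

Note that in image coordinates y grows downward, so a line leaning to the right has a negative slope.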
To initialize lane lines on the first frame of the video, we have to use a region of interest masking out the rest of the image. Further on, we drop this region and only use the domain logic in the Lane and Line classes to filter noisy data and decide whether to consider the output of the Hough transform a candidate for being a segment of a lane line.
Stage 2: Lane Lines Updates
Such an update is performed by a single function, update_lane(segments), in image_pipeline. As we see, we get segments objects from the last stage (which are just Line objects from the Hough transform).
To facilitate all the housekeeping, I decided to use an OOP approach and represent lane lines as properties of the Lane class: Lane.left_line and Lane.right_line. I could have used global objects, but I’m not a fan of that approach: polluting the global namespace is definitely a bad practice and adds chaos and uncertainty to the code.
Let’s look closer at the Line and Lane classes and their instances.
A Line instance represents a single line: a segment of a lane line or just any random line we got from the Hough transform. The main purpose of a Line object is to calculate its relation to the lane lines: can we consider it a candidate for being a segment of a lane line? To do so, we use the following domain logic:
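The exact thresholds are not reproduced in this post, but the check might look roughly like this hedged sketch: a segment is accepted only if its slope and intercept are close enough to those of the current lane line (the tolerance values here are made up for illustration):

```python
# Hypothetical tolerances; the real values were tuned on the project videos.
SLOPE_TOL = 0.15
INTERCEPT_TOL = 50.0

def is_candidate(seg_slope, seg_intercept, line_slope, line_intercept):
    """Accept a Hough segment only if it looks similar to the current lane line."""
    return (abs(seg_slope - line_slope) < SLOPE_TOL
            and abs(seg_intercept - line_intercept) < INTERCEPT_TOL)

print(is_candidate(-0.72, 615.0, -0.7, 610.0))  # True: close to the lane line
print(is_candidate(-0.10, 400.0, -0.7, 610.0))  # False: a noisy outlier
```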
Such a conservative and picky candidate selection process allowed me to drop the ROI. We may still get noisy data, but the process accepts only the fraction of lines which are close to the current lane lines and look similar to them.
We use trivial logic to determine which lane line a Line instance belongs to: it is determined by the slope of the line. There is a lot of room for improvement here.
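Since y grows downward in image coordinates, the left lane line has a negative slope and the right one a positive slope, so the assignment can be as simple as this sketch (not my exact code):

```python
def assign_to_lane_line(slope):
    """Pick a lane line by slope sign; y grows downward in image coordinates."""
    return "left" if slope < 0 else "right"

print(assign_to_lane_line(-0.7))  # left
print(assign_to_lane_line(0.6))   # right
```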
The Lane class is a container for two lane lines (which are instances of this very Lane class — this part needs refactoring). It also provides a number of methods related to lane lines, the most important of which is fit_lane_line. To get a new lane line, I represent positive Line candidates as points and fit a 1st degree polynomial (which is just a line) with the numpy.polyfit routine.
Lane lines stability. This is an important issue to be addressed. There are several stabilization techniques I used:
- Buffers. My lane line objects memorize the N most recent states, updating the buffer with the line state from the current frame.
- Smarter lane line state updates. If we still get noisy data after our filtering efforts, line fitting can easily go wrong. If the estimated slope of the fitted line from the current frame differs too much from the buffer’s average, we need to treat this line more conservatively. For this very purpose, I created DECISION_MAT, a simple decision matrix on how to combine the current line position and the buffer’s average position.
For example, for DECISION_MAT = [[0.1, 0.9], [1, 0]] we have just two cases: an unstable line (the slope difference from the mean value is too high) and a stable line. For an unstable line we use a weighted average with weights 0.1 and 0.9 of the line’s current position and the buffer’s mean value. For a stable line we simply use its current position without any weighting with historical data. The lane line’s stability indicator for the current frame is held in the Lane.right_line.stable and Lane.left_line.stable boolean properties. If either of them becomes False, I visualize it as a red polygon between the two lane lines (you’ll see it a bit later in this post).
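Putting the buffer and the decision matrix together, the update might be sketched like this for a single parameter (slope); the buffer size and stability threshold are illustrative, not my tuned values:

```python
from collections import deque

DECISION_MAT = [[0.1, 0.9], [1.0, 0.0]]  # row 0: unstable, row 1: stable
SLOPE_THRESHOLD = 0.1                    # hypothetical stability threshold
buffer = deque([-0.70, -0.72, -0.68], maxlen=10)  # N most recent slopes

def update(current_slope):
    """Blend the current measurement with the buffer mean per DECISION_MAT."""
    mean = sum(buffer) / len(buffer)
    stable = abs(current_slope - mean) < SLOPE_THRESHOLD
    w_current, w_mean = DECISION_MAT[int(stable)]
    smoothed = w_current * current_slope + w_mean * mean
    buffer.append(smoothed)
    return smoothed, stable

r1 = update(-0.71)  # stable: the current value is used as-is
r2 = update(-1.50)  # unstable: pulled heavily toward the buffer mean
print(r1, r2)
```

The outlier slope of -1.50 ends up near -0.78 instead of jerking the lane line across the frame, which is exactly the steadying effect we are after.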
Finally, we get pretty steady lane lines:
Stage 3: Drawing and updating the initial image
In order to render lane lines nicely, I wrote a routine which calculates the coordinates of the vanishing point (the point where the two lane lines intersect). The vanishing point has two purposes in my project so far:
- Limit the lane lines extrapolation to its coordinates.
- Any Hough line above the vanishing point cannot be considered a candidate. This is a soft way to define a region of interest (instead of hardcoding it).
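With both lane lines in slope–intercept form, the vanishing point is just the intersection of the two lines. A sketch (the line coefficients are illustrative):

```python
def vanishing_point(m1, b1, m2, b2):
    """Intersection of y = m1*x + b1 and y = m2*x + b2 (assumes m1 != m2)."""
    x = (b2 - b1) / (m1 - m2)
    return x, m1 * x + b1

# Left line y = -0.7x + 610, right line y = 0.6x - 40 (made-up numbers).
vp = vanishing_point(-0.7, 610.0, 0.6, -40.0)
print(vp)  # roughly (500.0, 260.0)
```

Everything above this y coordinate can then be rejected outright, which replaces the hardcoded ROI.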
I like nice visualizations a lot, so I decided to implement some more drawing routines and used a unified signature for them:
Drawing itself is implemented as a step of transformations over the initial image:
As we see here, I render two snapshots and add them to the dashboard of the third image, which is just a transformed initial image with lane lines on it. Such a structure makes it trivial to recombine the things rendered on the image and helps visualize multiple components — all without any hassle or significant code changes.
By the way, the green polygon between the lane lines which occasionally changes color to red is the lane line stability indicator I described earlier.
Where to go from here?
This project is far from being over. The more I touch it, the more I find new things to work on:
- Making the detector non-linear. As far as I understand, this is the main goal of Project 4: Advanced Lane Lines Detection.
- Instead of the naive way of image binarization (selecting color ranges), implement a CNN which can detect parts of a lane line in different conditions.
- Road detection. It would be great to detect a road itself and to use it as a ROI.
The full code for this project is available on GitHub (direct link to Jupyter Notebook): https://github.com/Kidra521/carnd/blob/master/p1_lane_lines_detection/P1.ipynb
P.S. A fun part
Of course, there should be a fun part in this post! Let’s now see how miserably this linear detector fails when it comes to mountain roads. It starts somewhat satisfactorily, although it’s too slow for such a fast-changing slope:
But in the forest, when the light changes quickly, it gets totally destroyed:
This is great — I’ll have a chance to work on this challenge in Project 4.
Stay tuned for more self-driving car awesomeness in the future!