
Kalman Filter(1) – The Basics

Basics of self-driving car localisation

I was trying to learn the Kalman filter, a way to combine your own guesses with uncertain measurements to produce a better estimate, and found few easy-to-understand introductions. Later on, I came across this course, which builds the idea up from the fundamentals. In this post, I follow the structure of the course and give a brief introduction to the basics of self-driving car localisation, which is also the starting point for the Kalman filter.

Problem Setting

In self-driving car localisation, there are typically two components. The first is movement: when the car applies the throttle, we can estimate (guess) how far it will travel. The second is measurement: the sensors installed on the car detect the environment and measure where the car is. Both our movement estimate and the sensor measurement can be inaccurate, so the question is whether, by combining these two uncertain components, we can still estimate the car’s location, or even improve on either estimate alone.

The answer is yes. Let’s work through a concrete example to see how this can be solved with basic statistics.


Say our car drives in a 1-dimensional world with only 5 cells and 2 colours, green and red. In the beginning, the car is equally likely to be in any of the 5 cells. It carries a sensor Z that detects the colour of the cell it is in, but the sensor is not always correct. When the sensor reads red, we give red cells a higher weight, pHit = 0.6, and green cells a lower weight, pMiss = 0.2 (these are weights rather than probabilities; they become probabilities once normalised to add up to 1). Now the question is: when the sensor reads Z = red, what is the probability of the car being in each cell?

The solution is simple: the sensor reads red, so the car is more likely to be in the second or third cell, which are the red ones. We multiply the prior of each red cell by the higher weight and the prior of each green cell by the lower weight:
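A minimal sketch of this weighting step, using plain Python lists. The cell layout (green, red, red, green, green) is inferred from the text, and the names `world`, `prior`, and `posterior` are illustrative; only `combine_prob`, `pHit`, `pMiss`, and `Z` appear in the text itself.

```python
# Assumed layout: cells 2 and 3 are red, the rest are green.
world = ['green', 'red', 'red', 'green', 'green']

# Uniform prior: the car is equally likely to be in any cell.
prior = [1.0 / len(world)] * len(world)

pHit = 0.6   # weight for cells whose colour matches the reading
pMiss = 0.2  # weight for cells whose colour does not match
Z = 'red'    # sensor reading

# Weight each prior by how well the cell agrees with the reading.
combine_prob = [
    p * (pHit if colour == Z else pMiss)
    for p, colour in zip(prior, world)
]

# Normalise so the result adds up to 1.
total = sum(combine_prob)
posterior = [w / total for w in combine_prob]

print(posterior)
```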

The final probability needs to be normalised in order to add up to 1. Running this, we get the result:

array([0.11111111, 0.33333333, 0.33333333, 0.11111111, 0.11111111])

Clearly, the second and third cells have a higher probability than the rest.

In fact, the statistics behind this is Bayes’ rule:
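Written out for this example, the rule takes roughly the following form. The notation is an assumption based on the X_2 used in the text: X_i stands for "the car is in cell i" and Z for the sensor reading. The weights pHit and pMiss play the role of P(Z | X_i); because they only enter as a ratio, they do not need to be true probabilities.

```latex
P(X_i \mid Z) = \frac{P(Z \mid X_i)\, P(X_i)}{P(Z)},
\qquad
P(Z) = \sum_{k=1}^{5} P(Z \mid X_k)\, P(X_k)
```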

Taking cell 2 (X_2) as an example and combining formulas (2) and (3), the probability works out to:

0.2*0.6 / (0.2*(0.2+0.6+0.6+0.2+0.2)) = 0.12 / 0.36 ≈ 0.333

The denominator is a normaliser that is the same for every cell; it corresponds to sum(combine_prob) in our code above.

Now that we can calculate the posterior probability after sensing the environment, let’s move the car and see how that affects localisation.


In this horizontal 5-cell world, our car can move left and right, with the number of steps defined as U. However, it may move inaccurately and end up in an unexpected cell. The movement probability has the following distribution: a 0.8 chance of moving exactly as planned, a 0.1 chance of moving 1 step further than planned (overshoot), and a 0.1 chance of moving 1 step shorter than planned (undershoot).

Now the question is: given a prior distribution p (the probability of each cell), what is the posterior distribution q after taking a move U?

Let’s look at an example: suppose our car is instructed to move 1 step. What is the probability of it landing in cell 3?

Let’s consider the problem in reverse: after moving 1 step, in what scenarios will the car land in cell 3? It could move correctly from cell 2 with probability 0.8, undershoot from cell 3 (that is, stay put) with probability 0.1, or overshoot from cell 1 with probability 0.1.

The rule behind this is the law of total probability:

The probability of landing in X_j equals the sum, over every possible starting cell X_i, of the probability of starting in X_i multiplied by the probability of moving from X_i to X_j.
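In symbols, this can be sketched as follows, where p is the distribution before the move and q the distribution after it; the conditional notation P(X_j | X_i, U) for "moving from X_i to X_j under command U" is an assumption of this sketch. The second line applies it to the example of landing in cell 3 with U = 1.

```latex
q(X_j) = \sum_{i} p(X_i)\, P(X_j \mid X_i, U)
\\
q(X_3) = 0.8\, p(X_2) + 0.1\, p(X_3) + 0.1\, p(X_1)
```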

Note that the world is cyclic: if U is larger than the total number of cells n, the car wraps around, so the move is equivalent to U % n steps.
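A minimal sketch of the move step under these assumptions: the world is cyclic, and U counts steps to the right. The function name `move` and its parameter names are illustrative rather than taken from the original code.

```python
def move(p, U, p_exact=0.8, p_overshoot=0.1, p_undershoot=0.1):
    """Return the distribution q after an inexact move of U steps.

    p is the prior over cells. The world is treated as cyclic, so
    indices wrap around with the modulo operator.
    """
    n = len(p)
    q = []
    for j in range(n):
        # The car lands in cell j if it started U cells back and moved
        # exactly, started U + 1 cells back and overshot by one, or
        # started U - 1 cells back and undershot by one.
        prob = p_exact * p[(j - U) % n]
        prob += p_overshoot * p[(j - U - 1) % n]
        prob += p_undershoot * p[(j - U + 1) % n]
        q.append(prob)
    return q


# A car known to be in cell 2 (index 1) is told to move 1 step.
print(move([0.0, 1.0, 0.0, 0.0, 0.0], 1))
```

With all the prior mass in cell 2, most of the probability should end up in cell 3, with a small share left in cell 2 (undershoot) and a small share in cell 4 (overshoot). Because of the modulo indexing, `move(p, 6)` gives the same result as `move(p, 1)` in a 5-cell world.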

Combining Sense and Move

Now that we have the 2 most important components, sense and move, our car is ready to go. The process is a loop: the car senses the environment → makes a movement → senses the environment → makes a movement, …

In this example, our car starts with a uniform distribution, takes 2 sequential measurements, red and then green, and moves 1 step after each measurement. We get the final probability:

[0.21157, 0.15157, 0.08105, 0.1684, 0.3873]

So after this series of steps, the car is most likely to be in the last cell, with probability 0.3873.
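The whole sense-and-move loop can be sketched in a few lines. This is a self-contained illustration rather than the original listing: the cell layout (green, red, red, green, green) is inferred from the earlier result, and the function and variable names (`sense`, `move`, `measurements`, `motions`) are assumptions.

```python
world = ['green', 'red', 'red', 'green', 'green']  # assumed layout
pHit, pMiss = 0.6, 0.2
pExact, pOvershoot, pUndershoot = 0.8, 0.1, 0.1


def sense(p, Z):
    """Bayes update: weight each cell by its agreement with reading Z."""
    weighted = [
        prob * (pHit if colour == Z else pMiss)
        for prob, colour in zip(p, world)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def move(p, U):
    """Total-probability update for an inexact move of U steps (cyclic)."""
    n = len(p)
    return [
        pExact * p[(j - U) % n]
        + pOvershoot * p[(j - U - 1) % n]
        + pUndershoot * p[(j - U + 1) % n]
        for j in range(n)
    ]


measurements = ['red', 'green']
motions = [1, 1]

p = [0.2] * 5  # uniform starting distribution
for Z, U in zip(measurements, motions):
    p = sense(p, Z)  # sense the environment first ...
    p = move(p, U)   # ... then make a movement

print([round(x, 5) for x in p])
```

Under these assumptions the printed values should agree with the figures quoted above to about four decimal places, with the last cell the most likely.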


