Kalman filter: Intuition and discrete case derivation

Vivek Yadav
Towards Data Science
Mar 4, 2017


Introduction

In this post, we will go over the derivation of a discrete Kalman filter. We will first set up the equations of a system governed by discrete dynamics, then express the approximate system, compute error covariances, and derive an update rule that minimizes the error covariance. As estimation via Kalman filtering involves successive measurement updates and state propagation steps, it is easiest to understand in the discrete implementation.
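Since the equation images from the original notes do not reproduce here, the standard discrete-time formulation (consistent with most textbook treatments, with A, B, H, Q, and R denoting the usual system, input, measurement, process-noise, and measurement-noise matrices) is:

```latex
% Discrete-time linear system with Gaussian process and measurement noise
x_{k+1} = A x_k + B u_k + w_k, \qquad w_k \sim \mathcal{N}(0, Q)
z_k = H x_k + v_k, \qquad v_k \sim \mathcal{N}(0, R)

% Propagation (apriori estimate, before the measurement)
\hat{x}_k^- = A \hat{x}_{k-1} + B u_{k-1}, \qquad P_k^- = A P_{k-1} A^\top + Q

% Measurement update (posterior estimate)
K_k = P_k^- H^\top \left( H P_k^- H^\top + R \right)^{-1}
\hat{x}_k = \hat{x}_k^- + K_k \left( z_k - H \hat{x}_k^- \right)
P_k = (I - K_k H) P_k^-
```

The rest of the post builds the intuition for why these two steps, propagation and update, take this form.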

This section is part of the advanced control systems course I developed at Stony Brook University. I will keep adding additional notes on Medium over time, but in case you are interested, the complete notes can be found here. Details of the probability concepts involved can be found in an introductory lesson. First we will go over the types of uncertainties a model may have.

From above, depending on the type of uncertainty, the effect on the model may vary. As we know neither the true uncertainty nor its structure, one safe choice is to assume that the underlying errors are zero-centered Gaussian processes. This assumption is not unreasonable, as in most cases we are able to at least approximately model the system using system identification methods. As an example, consider the case where we measure the states and their derivatives and fit a linear model to the data. This gives us an approximate model, and we can assume any deviation from this idealized model follows a Gaussian distribution.
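The system-identification idea above can be sketched numerically (an illustrative toy, not code from the course notes; the "true" dynamics, noise levels, and variable names are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" discrete dynamics: x_{k+1} = A_true @ x_k + noise
A_true = np.array([[1.0, 0.1],
                   [0.0, 1.0]])  # constant-velocity model with dt = 0.1

# Simulate a trajectory and record (x_k, x_{k+1}) pairs
X, X_next = [], []
x = np.array([0.0, 1.0])
for _ in range(200):
    x_new = A_true @ x + rng.normal(scale=0.01, size=2)  # small process noise
    X.append(x)
    X_next.append(x_new)
    x = x_new
X, X_next = np.array(X), np.array(X_next)

# Fit a linear model X_next ≈ X @ A_fit.T by least squares
A_fit = np.linalg.lstsq(X, X_next, rcond=None)[0].T

# The residuals of the fitted model are roughly zero-mean: this is the
# deviation we then model as a zero-centered Gaussian process
residuals = X_next - X @ A_fit.T
print(np.round(A_fit, 2))  # close to A_true
```

The fitted matrix recovers the true dynamics closely, and whatever it cannot explain lands in the residuals, which is exactly the error the Gaussian-noise assumption is meant to cover.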

Kalman filter intuition-I

The animation below shows the intuition behind Kalman filters. The states propagate following the system dynamics. As we do not know the true values of the states, we estimate them based on measurements. As these measurements come at discrete times, there is uncertainty in what the true states may be. However, we know that the states must follow some approximate dynamics. In the animation below, a point moves with constant velocity, but its position, velocity, and acceleration are not precisely known.

As the exact position, velocity, and acceleration are not known, the possible locations of the red points diverge after some time. This leads to error in the estimate of the true state. When a new measurement comes, we can use this information to discard values that are unlikely. Before going further, let's clarify some terminology:

  • State Propagation refers to the process where we allow the system to follow its dynamics and compute approximate state values before a measurement comes.
  • Apriori estimates are the values computed using state propagation, BEFORE taking into account the new measurement.
  • Posterior estimates are the values computed AFTER taking into account the new measurement.
  • Error covariance refers to the covariance of the error between the true states and the estimates.

Using the terminology above, we perform state propagation between measurements. We use state propagation to compute apriori estimates of the probability distribution of the states. Once a new measurement is available, we perform a Bayes' rule update to obtain a better estimate of this probability. Bayes' rule for probabilities is given below. These are taken from my advanced control systems notes on uncertainty.
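The propagate-then-update cycle can be sketched for a scalar state (a minimal illustration with made-up dynamics and noise values, not the notation from the course notes):

```python
def predict(x, P, a=1.0, q=0.1):
    """State propagation: the apriori estimate, before the measurement."""
    x_prior = a * x          # propagate the mean through the dynamics
    P_prior = a * P * a + q  # covariance grows by the process noise q
    return x_prior, P_prior

def update(x_prior, P_prior, z, h=1.0, r=0.5):
    """Bayes'-rule (measurement) update: the posterior estimate."""
    S = h * P_prior * h + r           # innovation covariance
    K = P_prior * h / S               # Kalman gain
    x_post = x_prior + K * (z - h * x_prior)
    P_post = (1 - K * h) * P_prior    # posterior covariance shrinks
    return x_post, P_post

x, P = 0.0, 1.0
x, P = predict(x, P)        # apriori: P grows from 1.0 to 1.1
x, P = update(x, P, z=0.8)  # posterior: P shrinks below 1.1
print(x, P)
```

Propagation inflates the uncertainty (the diverging red points), while each measurement update shrinks it back down, which is the discard-the-unlikely-values step described above.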

Performing this step gives a better estimate of both position and velocity, despite the fact that we have a measurement of position alone.

Kalman filter intuition-II

We will go over another example to better understand how Kalman filters combine measurements of one state with the system dynamics to give better estimates of both the measured and unmeasured states.

Kalman filters perform state estimation in two primary steps. The first step involves propagation of the system dynamics to obtain apriori probabilities of the states; once measurements are obtained, the state variables are updated using Bayes' theorem. This is illustrated by the example below. Consider the case of a car moving along a straight line with a fixed velocity. Further, say we have a measurement of position only, and not velocity. At the start, only the position is known from the sensor, so the region where the robot's states could lie is shown in the figure below, i.e., the robot is at position 0, but the velocity can be any of the values along the vertical line.

Now consider the scenario 1 second after the robot starts moving. After moving for 1 s, the new position is the previous position plus the velocity. Therefore, the probability distribution after moving for 1 second with constant velocity is given by the red points in the figure below.

Assuming the robot moves with a velocity of 1, the next position estimate is around 1. By performing a second measurement at this point, we get the region in blue as the possible values of the position.

Therefore, from the system dynamics we expect the new position to be in the red region, and from the sensors we expect the states to be in the blue region. The overlapping region between these two distributions is small, and we can say that the true states of the particle lie in this region. Bayes' theorem can be applied to estimate the multivariate probability distribution of the states. The corresponding probability distributions of velocity and position are presented along the x- and y-axes. This process of combining system dynamics with state measurements is the underlying principle of Kalman filters. Kalman filters provide good estimation properties and are optimal in the special case when the process and measurement noises follow Gaussian distributions.
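The car example can be sketched numerically (a toy sketch; the noise levels and initial covariances are illustrative assumptions): a constant-velocity state [position, velocity] is propagated for 1 s, and a position-only measurement then tightens the velocity estimate through the correlation that propagation creates.

```python
import numpy as np

dt = 1.0
A = np.array([[1.0, dt],
              [0.0, 1.0]])   # constant-velocity dynamics
H = np.array([[1.0, 0.0]])   # we measure position only
Q = 0.01 * np.eye(2)         # process noise (assumed small)
R = np.array([[0.25]])       # position sensor noise (assumed)

# At the start, position is measured but velocity is unknown (huge variance):
# the vertical line in the figure
x = np.array([0.0, 0.0])
P = np.diag([0.25, 100.0])

# Predict: propagation correlates position and velocity (the tilted red region)
x = A @ x
P = A @ P @ A.T + Q

# Update with a position measurement near 1 (the blue region)
z = np.array([1.0])
S = H @ P @ H.T + R              # innovation covariance
K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
x = x + K @ (z - H @ x)
P = (np.eye(2) - K @ H) @ P

print(x)        # velocity estimate pulled toward ~1, though never measured
print(P[1, 1])  # velocity variance reduced far below the initial 100
```

Because the predicted position depends on the velocity, observing position near 1 is only consistent with velocities near 1, which is exactly the small overlap of the red and blue regions.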

Some more intuitive explanations:

  • S is the sensitivity (innovation covariance) matrix that inversely weighs measurement errors. So if a sensor is too noisy, the corresponding entry in R will be high, and after taking the inverse, measurements from this sensor will be weighted less.
  • If Q is high, i.e., we do not know the true model well and our approximate model has large deviations, the apriori covariance estimates will be large, which makes intuitive sense.
  • For linear systems, the gain term K acts as a weighting factor between the state estimate from apriori predictions and corrections based on measurements.
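The weighting behavior described above can be checked in the scalar case, where (with H = 1) the gain reduces to K = P⁻/(P⁻ + R): a noisier sensor (larger R) yields a smaller gain, so its measurements are trusted less, while a larger process noise Q inflates P⁻ and pushes the gain back up.

```python
def kalman_gain(P_prior, R):
    # Scalar Kalman gain with H = 1: K = P_prior / (P_prior + R)
    return P_prior / (P_prior + R)

P_prior = 1.0
print(kalman_gain(P_prior, R=0.1))   # accurate sensor: gain near 1
print(kalman_gain(P_prior, R=10.0))  # noisy sensor: gain near 0

# Larger Q inflates the apriori covariance, pushing the gain back up
print(kalman_gain(P_prior + 5.0, R=10.0))
```

A gain near 1 means the update trusts the measurement almost completely; a gain near 0 means it mostly keeps the apriori prediction.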

TO ADD

NIS, CONVERGENCE

Apologies

I apologize for copy-pasting screenshots from my notes. I haven't figured out a good way of incorporating equations in Medium.

References:

All the screenshots above are from an Advanced Control Systems course I developed at Stony Brook University. The link to full course is https://mec560sbu.github.io/.
