The Unscented Kalman Filter: Nonlinear State Estimation#

Limitations of the Standard (Linear) Kalman Filter#

So far, we have discussed the standard Kalman Filter algorithm. However, we have not mentioned its limitations. The standard Kalman Filter assumes that the system is both linear and Gaussian. In other words, the uncertainties in the motion and measurement models are assumed to be normally distributed about a mean in order to produce optimal estimates, which allows us to represent the state estimate as a Gaussian with mean and variance. For many systems, the Gaussian assumption is a good one. Intuitively, one can imagine that a sensor’s noise, for example, varies more or less symmetrically about a true mean value, with larger deviations occurring less frequently.

The greater constraint, however, is the assumption that the system is linear. What we mean by this is that the state transition function and measurement function are linear functions, and as a result, when we pass Gaussian distributions through these functions, the output remains Gaussian or proportional to a Gaussian. An arbitrary nonlinear function, on the other hand, will not output another Gaussian or scaled Gaussian, which is a problem since so much of the Kalman Filter math depends on the state estimate being Gaussian. The Unscented Kalman Filter was expressly designed to robustly handle this issue of nonlinearity.

In this project’s \(z\)-axis UKF, the functions are linear, so indeed a standard Kalman Filter would suffice. However, for the second UKF that you will be implementing, there are nonlinearities due to the drone’s orientation in space. To make the transition easier from the first part to the second part of this project, we are asking you to implement a UKF even for a linear system. The UKF estimates will be the same as a KF; the only downsides might be code complexity and computation time. That said, you will be using a Python library called FilterPy (written by Labbe, author of Kalman and Bayesian Filters in Python [Jr18] that handles and hides most of the filtering math anyway.

You might also be wondering what the term “unscented” has to do with a Kalman Filter that applies to nonlinear systems. There is no greater technical meaning to the word; the inventor claims it is an arbitrary choice that resulted from his catching a glimpse of a coworker’s deodorant while trying to come up with a name for his filter [uhl].

Underlying Principle of the UKF#

To handle the nonlinearities, the UKF uses a sampling scheme. An alternative to the UKF known as the Extended Kalman Filter (EKF) uses Jacobians to linearize the nonlinear equations, but the UKF takes a deterministic sampling approach that in many cases results in more accurate estimates and is a simpler algorithm to implement [TellexBrownLupashin18].

The UKF uses a function to compute so-called sigma points, which are the sample points to pass through the state transition and measurement functions. Each sigma point also has corresponding weights for the sample’s mean and covariance. The sigma points are generated so that there are \(2n+1\) of them, where \(n\) is the size of the state vector. Imagine a one-dimensional state vector, for example, which we represent as a single-variable Gaussian. In this instance, \(2(1)+1=3\) sigma points are chosen. One of these points is the mean of the Gaussian, and the two other points are symmetric about the mean on either side. The exact distance of these points from the mean sigma point will vary depending on parameters passed into the sigma point function, but we do not expect you to worry about these parameters. The idea, though, is that these \(2(1)+1=3\) sigma points and their weights are sufficiently representative of the Gaussian distribution.

Next, these points that represent the Gaussian state estimate are passed through a nonlinear function (i.e., the state transition or measurement functions), which can scatter the points arbitrarily. We then want to recover a Gaussian from these scattered points, and we do so by using the unscented transform, which computes a new mean and covariance matrix. To compute the new mean, the unscented transform calculates a weighted sum of each sigma point with its associated sample mean weight.

UKF in the Prediction Step#

The UKF formulates the prior state estimate by specifying a set of sigma points \(\boldsymbol{\mathcal{X}}_{t-\Delta t}\) according to the current state estimate and then propagating these points through the state transition function to yield a new set of sigma points \(\boldsymbol{\mathcal{Y}}_t\), which are passed through the unscented transform to produce the prior state estimate.

UKF in the Update Step#

Below is the algorithm for the Unscented Kalman Filter [TellexBrownLupashin18], [Jr18]. Note that the sigma point weights denoted by \(W_i^{(m)}\) and \(W_i^{(c)}\) can be computed as part of a number of sigma point algorithms. We will use Van der Merwe’s scaled sigma point algorithm to compute the sigma points and weights [MW03], [Jr18]. The sigma points get computed at each prediction, whereas the weights can be computed just once upon filter initialization.


\(\hspace{5mm} \textbf{function}\text{ predict}( \mathbf{x}_{t-\Delta t}, \mathbf{P}_{t-\Delta t}, \mathbf{u}_t, \Delta t )\)
\(\hspace{10mm} \texttt{// Compute 2n+1 sigma points given the most recent state estimate}\)
\(\hspace{10mm} \boldsymbol{\mathcal{X}}_{t-\Delta t} = \text{compute\_sigma\_points}(\mathbf{x}_{t-\Delta t}, \mathbf{P}_{t-\Delta t})\)
\(\hspace{10mm} \texttt{// Propagate each sigma point through the state transition function}\)
\(\hspace{10mm} \text{for }(i = 0; i\leq 2n; i\texttt{++}):\)
\(\hspace{15mm} \boldsymbol{\mathcal{Y}}_{i,t} = g(\boldsymbol{\mathcal{X}}_{i,t-\Delta t}, \mathbf{u}_t, \Delta t)\)
\(\hspace{10mm} \texttt{// Compute the prior mean and covariance by passing the sigma}\)
\(\hspace{10mm} \texttt{// points through the unscented transform (the next two lines)}\)
\(\hspace{10mm} \mathbf{\bar x}_t = \sum_{i=0}^{2n} W_i^{(m)} \boldsymbol{\mathcal{Y}}_{i,t}\)
\(\hspace{10mm} \mathbf{\bar P}_t = \sum_{i=0}^{2n} W_i^{(c)} \left(\boldsymbol{\mathcal{Y}}_{i,t}-\mathbf{\bar x}_t \right)\left(\boldsymbol{\mathcal{Y}}_{i,t}-\mathbf{\bar x}_t \right)^\mathsf{T} + \mathbf{Q}_t\)
\(\hspace{10mm} \text{return } \mathbf{\bar x}_t, \mathbf{\bar P}_t\)

\(\hspace{5mm} \textbf{function}\text{ update}(\mathbf{\bar x}_t, \mathbf{\bar P}_t, \mathbf{z}_t)\)
\(\hspace{10mm} \texttt{// Compute the measurement sigma points}\)
\(\hspace{10mm} \text{for }(i = 0; i\leq 2n; i\texttt{++}):\)
\(\hspace{15mm} \boldsymbol{\mathcal{Z}}_{i,t} = h(\boldsymbol{\mathcal{Y}}_{i,t})\)
\(\hspace{10mm} \texttt{// Compute the mean and covariance of the measurement}\)
\(\hspace{10mm} \texttt{// sigma points by passing them through the unscented}\)
\(\hspace{10mm} \texttt{// transform (the next two lines)}\)
\(\hspace{10mm} \boldsymbol{\mu}_z = \sum_{i=0}^{2n} W_i^{(m)} \boldsymbol{\mathcal{Z}}_{i,t}\)
\(\hspace{10mm} \mathbf{P}_z = \sum_{i=0}^{2n} W_i^{(c)} \left(\boldsymbol{\mathcal{Z}}_{i,t}-\boldsymbol{\mu}_z \right)\left(\boldsymbol{\mathcal{Z}}_{i,t}-\boldsymbol{\mu}_z \right)^\mathsf{T} + \mathbf{R}_t\)
\(\hspace{10mm} \mathbf{y}_t = \mathbf{z}_t - \boldsymbol{\mu}_z\)
\(\hspace{10mm} \texttt{// Compute the cross covariance between state and measurements}\)
\(\hspace{10mm} \mathbf{P}_{xz} = \sum_{i=0}^{2n} W_i^{(c)} \left(\boldsymbol{\mathcal{Y}}_{i,t}-\mathbf{\bar x}_t \right)\left(\boldsymbol{\mathcal{Z}}_{i,t}-\boldsymbol{\mu}_z \right)^\mathsf{T}\)
\(\hspace{10mm} \texttt{// Compute the Kalman gain}\)
\(\hspace{10mm} \mathbf{K}_t = \mathbf{P}_{xz}\mathbf{P}_z^{-1}\)
\(\hspace{10mm} \texttt{// Compute the mean of the posterior state estimate}\)
\(\hspace{10mm} \mathbf{x}_t = \mathbf{\bar x}_t + \mathbf{K}_t\mathbf{y}_t\)
\(\hspace{10mm} \texttt{// Compute the covariance of the posterior state estimate}\)
\(\hspace{10mm} \mathbf{P}_t = \mathbf{\bar P}_t - \mathbf{K}_t\mathbf{P}_z\mathbf{K}_t^\mathsf{T}\)
\(\hspace{10mm} \text{return } \mathbf{x}_t, \mathbf{P}_t\)

\(\hspace{5mm} \textbf{function}\text{ unscented\_kalman\_filter}( \mathbf{x}_{t-\Delta t}, \mathbf{P}_{t-\Delta t} )\)
\(\hspace{10mm} \mathbf{u}_t = \text{get\_control\_input}()\)
\(\hspace{10mm} \Delta t = \text{compute\_time\_step}()\)
\(\hspace{10mm} \mathbf{\bar x}_t, \mathbf{\bar P}_t = \text{predict}( \mathbf{x}_{t-\Delta t}, \mathbf{P}_{t-\Delta t}, \mathbf{u}_t, \Delta t )\)
\(\hspace{10mm} \mathbf{z}_t = \text{get\_sensor\_data}()\)
\(\hspace{10mm} \mathbf{x}_t, \mathbf{P}_t = \text{update}(\mathbf{\bar x}_t, \mathbf{\bar P}_t, \mathbf{z}_t)\)
\(\hspace{10mm} \text{return } \mathbf{x}_t, \mathbf{P}_t\)


First-hand: the unscented transform. Engineering and Technology History Wiki. Contributions from Uhlmann, Jeffrey. Accessed August 31, 2018. URL:


Roger R Labbe Jr. Kalman and bayesian filters in python. Published as a Jupyter Notebook hosted on GitHub at and in PDF form at (accessed August 29, 2018), May 2018.


Rudolph Van Der Merwe and Eric Wan. Sigma-point kalman filters for probabilistic inference in dynamic state-space models. In In Proceedings of the Workshop on Advances in Machine Learning. 2003.


S. Tellex, A. Brown, and S. Lupashin. Estimation for Quadrotors. ArXiv e-prints, August 2018. arXiv:1809.00037.