Kalman Filter
kf.Rd
These functions implement the "standard" Kalman filter and the "square root" Kalman filter (also called "square root covariance filter") for time invariant, linear state space systems without exogenous inputs, see e.g. (Anderson and Moore 2005) .
Usage
kf_cpp(A, C, Q, R, S, y_t, P1, a1)
kf2_cpp(A, C, H_t, y_t, P1_R, a1)
kf(model, y, method = c("kf", "kf2"), P1 = NULL, a1 = NULL)
Arguments
- A
\((s,s)\) dimensional state transition matrix \(A\).
- C
\((m,s)\) dimensional matrix \(C\).
- Q, R, S
The variance, covariance matrices of the "state disturbances" (\(Bu_t\)) and the "measurement disturbances" (\(Du_t\)) as described above. These matrices must be of dimension \((s,s)\), \((m,m)\) and \((s,m)\) respectively.
- y_t
\((m,N)\) transposed data matrix
y_t = t(y)
.- P1
\((s,s)\) dimensional covariance matrix of the error of the initial state estimate, i.e. \(\Pi_{1|0}\). If
NULL
, then the state covariance \(P = APA'+B\Sigma B'\) is used. Note that this scheme assumes that the state space model is stable, i.e. that the state transition matrix \(A\) is stable.- a1
\(s\) dimensional vector, which holds the initial estimate \(a_{1|0}\) for the state at time \(t=1\). If
a1=NULL
, then a zero vector is used.- H_t
\((n,s+m)\) dimensional matrix. This parameter corresponds to the transpose \(H'\) of \(H=(D',B')'\Sigma^{1/2}\).
- P1_R
(right) square root of
P1
, i.e.P1 = t(P1_R) \%*\% P1_R
.- model
stspmod()
object, which represents the state space model.- y
sample, i.e. an \((N,m)\) dimensional matrix, or a "time series" object (i.e.
as.matrix(y)
should return an \((N,m)\)-dimensional numeric matrix). Missing values (NA
,NaN
andInf
) are not supported.- method
Character string. If
method="kf"
thenkf
callskf_cpp
("standard form" of the Kalman filter) and formethod="kf2"
the "square root" form of the Kalman filter is used, i.e.kf2_cpp
is called. Up to numerical errors the outputs should not depend on the chosen method.
Value
List with components
- e
\((N,m)\) dimensional matrix with the standardized one-step ahead prediction errors. The \(t\)-th row of the matrix
e
corresponds to $$e_t = \Sigma_{t|t-1}^{-1/2}\epsilon_{t|t-1}.$$ If the model is correctly specified then these standardized residuals are white noise with a unit covariance matrix. So they may be used for validaton of the model.- a
\((N+1,s)\) dimensional matrix with the estimated states. The \(t\)-th row of the matrix
a
corresponds to \(a_{t|t-1}\). Giveny
anda
, the one step ahead predictions \(y_{t|t-1}\) may be computed withyh = a \%*\% t(C)
.- ll
(scaled) Gaussian log likelihood of the model $$-\frac{1}{2N}\sum_{t=1}^{N}\left(m\log(2\pi) + \log\det\Sigma_{t|t-1} + (y_t - y_{t|t-1})' \Sigma_{t|t-1}^{-1} (y_t - y_{t|t-1}) \right).$$
- P1
\((s,s)\) dimensional covariance matrix of the error of the state prediction \(a_{N+1|N}\), i.e. this matrix corresponds to \(\Pi_{N+1|N}\).
Details
The model considered is $$a_{t+1} = A a_t + Bu_t$$ $$y_t = C a_t + Du_t$$ with \(m\)-dimensional outputs \(y_t\), \(s\)-dimensional states \(a_t\) and \(n\)-dimensional disturbances \(u_t\). The disturbances are white noise with a covariance matrix \(\mathbf{E} u_tu_t'=\Sigma\). Note that the disturbances and the outputs may have different dimensions, however, only "wide" systems with (\(m\leq n\)) are implemented.
The Kalman filter is a recursive scheme to compute the linear, least squares predictions for \(a_{t+1}\) and \(y_{t+1}\) given the observations \(y_t,\ldots,y_1\) up to time \(t\). These predictions are notated with \(a_{t+1|t}\) and \(y_{t+1|t}\), the prediction error for the output \(y_{t+1}\) is \(\epsilon_{t+1|t}=(y_{t+1}-y_{t+1|t})\) and the corresponding variances of the prediction errors are $$\Pi_{t+1|t}=\mathbf{E}(a_{t+1}-a_{t+1|t}) (a_{t+1}-a_{t+1|t})',$$ $$\Sigma_{t+1|t}=\mathbf{E}(\epsilon_{t+1|t} \epsilon_{t+1|t}').$$
The standard form of the Kalman filter is based on the parameter matrices \(A,C\), the variance of "state disturbances" \(Q=\mathbf{E}(Bu_t (Bu_t)')=(B\Sigma B')\), the variance of the "measurement disturbances" \(R=\mathbf{E}(Du_t (Du_t)')=(D\Sigma D')\) and the covariance \(S=\mathbf{E}(Bu_t(Du_t)')=(B\Sigma D')\). Furthermore we need the initial prediction \(a_{1|0}\) and the corresponding error variance \(\Pi_{1|0}\).
For the square root form of the filter we need the "square roots" \(\Pi_{1|0}^{1/2}\) and \(\Sigma^{1/2}\), i.e. matrices such that \(\Pi_{1|0} = \Pi_{1|0}^{1/2} (\Pi_{1|0}^{1/2})'\) and \(\Sigma = \Sigma^{1/2}(\Sigma^{1/2})'\). In addition, we define \(H=(D',B')'\Sigma^{1/2}\).
The routines kf_cpp
, kf2_cpp
are RcppArmadillo implementations of the standard form and
of the square root form of the Kalman filter. The wrapper function kf
takes an stspmod()
object,
which describes the state space model, and then calls the approriate RcppArmadillo
function.
Square root Kalman filter: For the square root
\(\Pi_{1|0}^{1/2}\) the procedure first tries the Cholesky decomposition.
If this fails (since \(\Pi_{1|0}^{1/2}\) is (close to) singular),
then ll_kf
tries to compute a symmetric square root via the eigenvalue decomposition
of \(\Pi_{1|0}^{1/2}\).
Notes
The RcppArmadillo
functions (kf_cpp
and kf2_cpp
) do not check the input parameters,
so these function must be used with some care.
The procedures only accept "wide" state space systems (\(m \leq n\)), since for "tall" systems (\(m > n\)) the variance of the prediction errors (\(\Sigma_{t+1|t}\)) is singular for \(t\) larger than some threshold.
Up to now, there is no support for models with exogenous inputs.
References
Anderson BDO, Moore JB (2005). Optimal filtering. Dover Publications Inc., London. Originally published: Englewood Cliffs, Prentice-Hall 1979.
See also
There exist a number of R packages which implement the Kalman filter, e.g. KFAS. However, most of these packages do not allow for correlations between the "state noise" and the "measurement noise".
If only the likelihood is needed, then one may use ll_kf()
.
Examples
s = 4 # state dimension
m = 2 # number of outputs
n = 3 # number of inputs,
n.obs = 100 # sample size
# generate a (stable) state space model
tmpl = tmpl_stsp_full(m, n, s, sigma_L = "chol")
model = r_model(tmpl, bpoles = 1, sd = 0.5)
# generate a sample
data = sim(model, n.obs = n.obs, a1 = NA)
# compute Q, R, S and P1
sigma_L = model$sigma_L
sigma = tcrossprod(sigma_L)
R = model$sys$D %*% sigma %*% t(model$sys$D)
S = model$sys$B %*% sigma %*% t(model$sys$D)
Q = model$sys$B %*% sigma %*% t(model$sys$B)
P1 = lyapunov(model$sys$A, Q)
# call Kalman filter. Note y_t = t(y)!
out = kf_cpp(model$sys$A, model$sys$C, Q, R, S, t(data$y), P1, double(s))
# use the wrapper function
out_test = kf(model, data$y, method = 'kf')
all.equal(out, out_test)
#> [1] TRUE
# compute H and square root of P1
H = rbind(model$sys$D, model$sys$B) %*% sigma_L
P1_R = chol(P1)
# call square root Kalman filter. Note H_t = t(H) and y_t = t(y)!
out_test = kf2_cpp(model$sys$A, model$sys$C, t(H), t(data$y), P1_R, double(s))
all.equal(out, out_test)
#> [1] TRUE
# use the wrapper function
out_test = kf(model, data$y, method = 'kf2')
all.equal(out, out_test)
#> [1] TRUE
# The one step ahead predictions for y[t] may be computed by
yh = out$a %*% t(model$sys$C)
# and the (non scaled) prediction errors are
uh = data$y - out$a[1:n.obs,] %*% t(model$sys$C)