Kalman Filter
These functions implement the "standard" Kalman filter and the "square root" Kalman filter (also called the "square root covariance filter") for time-invariant, linear state space systems without exogenous inputs, see e.g. (Anderson and Moore 2005).
Usage
kf(model, y, method = c("kf", "kf2"), P1 = NULL, a1 = NULL)
Arguments
- model
stspmod() object, which represents the state space model.
- y
sample, i.e. an \((N,m)\) dimensional matrix, or a "time series" object (i.e. as.matrix(y) should return an \((N,m)\) dimensional numeric matrix). Missing values (NA, NaN and Inf) are not supported.
- method
Character string. If method = "kf" then the standard form of the Kalman filter is used, and for method = "kf2" the square root form is used. Up to numerical errors the outputs should not depend on the chosen method.
- P1
\((s,s)\) dimensional covariance matrix of the error of the initial state estimate, i.e. \(\Pi_{1|0}\). If NULL, then the stationary state covariance \(P\), i.e. the solution of the Lyapunov equation \(P = APA' + B\Sigma B'\), is used. Note that this scheme assumes that the state space model is stable, i.e. that the state transition matrix \(A\) is stable.
- a1
\(s\) dimensional vector, which holds the initial estimate \(a_{1|0}\) for the state at time \(t=1\). If a1 = NULL, then a zero vector is used.
Value
List with components
- e
\((N,m)\) dimensional matrix with the standardized one-step ahead prediction errors. The \(t\)-th row of the matrix e corresponds to $$e_t = \Sigma_{t|t-1}^{-1/2}\epsilon_{t|t-1}.$$ If the model is correctly specified, then these standardized residuals are white noise with a unit covariance matrix, so they may be used for validation of the model (see the illustrative check at the end of the Examples).
- a
\((N+1,s)\) dimensional matrix with the estimated states. The \(t\)-th row of the matrix a corresponds to \(a_{t|t-1}\). Given y and a, the one step ahead predictions \(y_{t|t-1}\) may be computed with yh = a %*% t(C).
- ll
(scaled) Gaussian log likelihood of the model $$-\frac{1}{2N}\sum_{t=1}^{N}\left(m\log(2\pi) + \log\det\Sigma_{t|t-1} + (y_t - y_{t|t-1})' \Sigma_{t|t-1}^{-1} (y_t - y_{t|t-1}) \right).$$
- P1
\((s,s)\) dimensional covariance matrix of the error of the state prediction \(a_{N+1|N}\), i.e. this matrix corresponds to \(\Pi_{N+1|N}\).
Details
The model considered is $$a_{t+1} = A a_t + Bu_t$$ $$y_t = C a_t + Du_t$$ with \(m\)-dimensional outputs \(y_t\), \(s\)-dimensional states \(a_t\) and \(n\)-dimensional disturbances \(u_t\). The disturbances are white noise with covariance matrix \(\mathbf{E} u_tu_t'=\Sigma\). Note that the disturbances and the outputs may have different dimensions; however, only "wide" systems (\(m\leq n\)) are implemented.
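To make these equations concrete, the following minimal sketch simulates such a system in plain R. The matrices A, B, C, D and the sample size are arbitrary illustrative values and not objects produced by the package (which provides sim() for this task, see the Examples).
set.seed(1)
m = 2; s = 3; n = 3; N = 50
A = diag(0.5, s)                  # (stable) state transition matrix
B = matrix(rnorm(s * n), s, n)
C = matrix(rnorm(m * s), m, s)
D = matrix(rnorm(m * n), m, n)
a = numeric(s)                    # initial state a_1 = 0
y = matrix(0, N, m)
for (t in 1:N) {
  u = rnorm(n)                    # disturbances u_t, here with Sigma = I_n
  y[t, ] = C %*% a + D %*% u      # output equation  y_t = C a_t + D u_t
  a = A %*% a + B %*% u           # state transition a_{t+1} = A a_t + B u_t
}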
The Kalman filter is a recursive scheme to compute the linear least squares predictions for \(a_{t+1}\) and \(y_{t+1}\) given the observations \(y_t,\ldots,y_1\) up to time \(t\). These predictions are denoted by \(a_{t+1|t}\) and \(y_{t+1|t}\), the prediction error for the output \(y_{t+1}\) is \(\epsilon_{t+1|t}=(y_{t+1}-y_{t+1|t})\) and the corresponding variances of the prediction errors are $$\Pi_{t+1|t}=\mathbf{E}(a_{t+1}-a_{t+1|t}) (a_{t+1}-a_{t+1|t})',$$ $$\Sigma_{t+1|t}=\mathbf{E}(\epsilon_{t+1|t} \epsilon_{t+1|t}').$$
The standard form of the Kalman filter is based on the parameter matrices \(A,C\), the variance of "state disturbances" \(Q=\mathbf{E}(Bu_t (Bu_t)')=(B\Sigma B')\), the variance of the "measurement disturbances" \(R=\mathbf{E}(Du_t (Du_t)')=(D\Sigma D')\) and the covariance \(S=\mathbf{E}(Bu_t(Du_t)')=(B\Sigma D')\). Furthermore we need the initial prediction \(a_{1|0}\) and the corresponding error variance \(\Pi_{1|0}\).
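Written out, one step of this standard (prediction form) recursion looks roughly as follows. This is only an illustrative sketch of the textbook recursion with the parameters listed above; the helper name kf_step is hypothetical and not part of the package, which uses a C++ implementation internally.
kf_step = function(y_t, a, P, A, C, Q, R, S) {
  eps = y_t - C %*% a                                # prediction error epsilon_{t|t-1}
  Sig = C %*% P %*% t(C) + R                         # Sigma_{t|t-1}
  K = (A %*% P %*% t(C) + S) %*% solve(Sig)          # Kalman gain
  a_new = A %*% a + K %*% eps                        # a_{t+1|t}
  P_new = A %*% P %*% t(A) + Q - K %*% Sig %*% t(K)  # Pi_{t+1|t}
  list(a = a_new, P = P_new, eps = eps, Sig = Sig)
}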
For the square root form of the filter we need the "square roots" \(\Pi_{1|0}^{1/2}\) and \(\Sigma^{1/2}\), i.e. matrices such that \(\Pi_{1|0} = \Pi_{1|0}^{1/2} (\Pi_{1|0}^{1/2})'\) and \(\Sigma = \Sigma^{1/2}(\Sigma^{1/2})'\). In addition, we define \(H=(D',B')'\Sigma^{1/2}\).
The Kalman filter is implemented in C++ via RcppArmadillo for performance. The wrapper function kf takes an stspmod() object,
which describes the state space model, and then calls the appropriate internal implementation.
Square root Kalman filter: For the initial error covariance \(\Pi_{1|0}\) the procedure first tries to compute the square root \(\Pi_{1|0}^{1/2}\) via the Cholesky decomposition. If this fails (because \(\Pi_{1|0}\) is (close to) singular), then a symmetric square root is computed via the eigenvalue decomposition of \(\Pi_{1|0}\), as sketched below.
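A rough R sketch of this fallback scheme (the helper name sqrt_psd is hypothetical and not part of the package):
sqrt_psd = function(P) {
  # try the (lower triangular) Cholesky square root first
  L = tryCatch(t(chol(P)), error = function(e) NULL)
  if (!is.null(L)) return(L)
  # fall back to a symmetric square root via the eigenvalue decomposition,
  # clipping tiny negative eigenvalues to zero
  ed = eigen(P, symmetric = TRUE)
  ed$vectors %*% diag(sqrt(pmax(ed$values, 0)), nrow(P)) %*% t(ed$vectors)
}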
Notes
The internal C++ implementations do not check the input parameters,
so direct use (via .Call()) must be done with care.
The procedures only accept "wide" state space systems (\(m \leq n\)), since for "tall" systems (\(m > n\)) the variance of the prediction errors (\(\Sigma_{t+1|t}\)) is singular for \(t\) larger than some threshold.
Up to now, there is no support for models with exogenous inputs.
References
Anderson BDO, Moore JB (2005). Optimal filtering. Dover Publications Inc., London. Originally published: Englewood Cliffs, Prentice-Hall 1979.
See also
A number of R packages implement the Kalman filter, e.g. KFAS. However, most of these packages do not allow for correlations between the "state noise" and the "measurement noise".
If only the likelihood is needed, then one may use ll_kf().
Examples
s = 4 # state dimension
m = 2 # number of outputs
n = 3 # number of inputs (disturbances)
n.obs = 100 # sample size
# generate a (stable) state space model
tmpl = tmpl_stsp_full(m, n, s, sigma_L = "chol")
model = r_model(tmpl, bpoles = 1, sd = 0.5)
# generate a sample
data = sim(model, n.obs = n.obs, a1 = NA)
# compute Q, R, S and P1
sigma_L = model$sigma_L
sigma = tcrossprod(sigma_L)
R = model$sys$D %*% sigma %*% t(model$sys$D)
S = model$sys$B %*% sigma %*% t(model$sys$D)
Q = model$sys$B %*% sigma %*% t(model$sys$B)
P1 = lyapunov(model$sys$A, Q)
# call Kalman filter using the wrapper function
out = kf(model, data$y, method = 'kf')
# Note: kf_cpp is the internal C++ implementation called by kf()
# compute H and a square root of P1
H = rbind(model$sys$D, model$sys$B) %*% sigma_L
P1_R = chol(P1) # upper triangular factor, t(P1_R) %*% P1_R equals P1
# call square root Kalman filter using wrapper function
out_test = kf(model, data$y, method = 'kf2')
all.equal(out, out_test)
#> [1] TRUE
# The one step ahead predictions for y[t] may be computed by
yh = out$a %*% t(model$sys$C)   # (N+1, m) matrix, t-th row is y[t | t-1]
# and the raw (non-standardized) prediction errors are
uh = data$y - yh[1:n.obs, ]
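# As noted for the component 'e', the standardized residuals should be
# (approximately) white noise with a unit covariance matrix if the model is
# correctly specified. An illustrative check (not part of the original example):
round(crossprod(out$e) / n.obs, 2)  # sample covariance, should be close to the identity
acf(out$e, plot = FALSE)            # should show no substantial autocorrelation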