Maximum Likelihood Estimation
est_ML.Rd
Arguments
- y
sample, i.e. an \((N,m)\)-dimensional matrix, or a "time series" object (i.e.
as.matrix(y) should return an \((N,m)\)-dimensional numeric matrix). Missing
values (NA, NaN and Inf) are not supported.
- tmpl
a model template which describes the model class, see model structures().
Note that only the case of (non-empty, square) state space or ARMA models is
implemented.
- th
Initial parameter estimate.
- which
(character string) determines which "likelihood" is to be used, see also ll().
The option "kf" is only supported for state space models.
- method, hessian, control
are passed on to the optimization routine stats::optim().
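As a quick illustration of the y argument (a sketch, not package code): any
object that as.matrix() coerces to an \((N,m)\)-dimensional numeric matrix is
accepted, e.g. a multivariate "ts" object.

# sketch: a multivariate "ts" object is a valid sample argument,
# since as.matrix() turns it into an (N, m) numeric matrix
y_ts = stats::ts(matrix(rnorm(300), nrow = 100, ncol = 3))
is.matrix(as.matrix(y_ts))  # TRUE, a (100, 3) numeric matrix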
Value
A list with components
- model
The estimated model.
- th
The corresponding vector of deep parameters.
- ll
The log likelihood of the estimated model.
- which
The type of likelihood used.
- counts, convergence, message, hessian
as returned by stats::optim().
Note
The optimization is carried out with the general-purpose routine
stats::optim(). An initial parameter estimate is needed.
The procedure does not enforce constraints such as stability or the
minimum-phase property.
The case of the conditional, concentrated likelihood is somewhat special. In this case the model template must have a particular structure: (1) the noise covariance is parametrized via its left Cholesky factor, and (2) the last \(m(m+1)/2\) components of the parameter vector \(\theta\) parametrize this left Cholesky factor, while the remaining components describe the system. (This implies that there is no overlap/dependency between the "system parameters" and the "noise parameters".) A sketch of this split is given below.
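For illustration, a minimal sketch of how such a parameter vector splits. This
is not package code; the number of system parameters and the column-wise fill
order of the Cholesky factor are assumptions.

# sketch: split a deep parameter vector of a "concentrated" template
# into system parameters and the left Cholesky factor (m = 3)
m = 3
n_sys = 10                             # hypothetical number of system parameters
th = rnorm(n_sys + m * (m + 1) / 2)    # hypothetical deep parameter vector
th_sys = th[1:n_sys]                   # system parameters
th_noise = th[-(1:n_sys)]              # last m(m+1)/2 entries: noise parameters
sigma_L = matrix(0, m, m)
sigma_L[lower.tri(sigma_L, diag = TRUE)] = th_noise  # assumed column-wise fill
Sigma = sigma_L %*% t(sigma_L)         # implied noise covariance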
Examples
# Generate a random state space model in echelon form (m = 3)
tmpl = tmpl_stsp_echelon(nu = c(2,1,1))
model = r_model(template = tmpl, bpoles = 1, bzeroes = 1, sd = 0.25)
diag(model$sigma_L) = 1 # scale the diagonal entries of sigma_L
print(model)
#> state space model [3,3] with s = 4 states
#> s[1] s[2] s[3] s[4] u[1] u[2]
#> s[1] 0.0000000 0.0000000 0.00000000 1.0000000 -0.12772670 -0.27866317
#> s[2] 0.3748673 -0.2992734 0.05335086 -0.2676605 0.09424537 -0.09885959
#> s[3] -0.1792097 0.4426172 0.14887576 -0.1453610 0.26086831 -0.15156408
#> s[4] -0.1153381 0.0291091 -0.33532175 0.1548390 0.01967283 0.12601084
#> x[1] 1.0000000 0.0000000 0.00000000 0.0000000 1.00000000 0.00000000
#> x[2] 0.0000000 1.0000000 0.00000000 0.0000000 0.00000000 1.00000000
#> x[3] 0.0000000 0.0000000 1.00000000 0.0000000 0.00000000 0.00000000
#> u[3]
#> s[1] 0.002798484
#> s[2] 0.162476956
#> s[3] -0.043841936
#> s[4] -0.196106240
#> x[1] 0.000000000
#> x[2] 0.000000000
#> x[3] 1.000000000
#> Left square root of noise covariance Sigma:
#> u[1] u[2] u[3]
#> u[1] 1.0000000 0.00000000 0
#> u[2] -0.4078598 1.00000000 0
#> u[3] -0.2039710 0.00163432 1
# extract the corresponding free/deep parameters
th = extract_theta(model, tmpl)
# generate a sample with 500 observations
y = sim(model, n.obs = 500, n.burn_in = 100)$y
# We are cheating here and use the true model parameters
# as starting values for the optimization routine:
# estimate the model with the "exact log likelihood"
out = est_ML(y, tmpl, th, which = 'kf')
KL_divergence(model, out$model)
#> [1] 0.009797793
# estimate the model with "conditional log likelihood"
out = est_ML(y, tmpl, th, which = 'conditional')
KL_divergence(model, out$model)
#> [1] 0.009813696
# estimate the model with "concentrated, conditional log likelihood"
out = est_ML(y, tmpl, th, which = 'concentrated')
KL_divergence(model, out$model)
#> [1] 0.009818641
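Since method, hessian and control are passed on to stats::optim(), one can also
request the hessian at the optimum. A hedged sketch (assuming the optimized
objective is the negative log likelihood, so that the inverse hessian
approximates the covariance of the deep parameter estimates):

# sketch: approximate standard errors from the returned hessian
out = est_ML(y, tmpl, th, which = 'conditional', hessian = TRUE)
se = sqrt(diag(solve(out$hessian)))  # approximate standard errors of theta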