Maximum Likelihood Estimation
est_ML.Rd
Arguments
- y
sample, i.e. an \((N,m)\)-dimensional matrix, or a "time series" object (i.e.
as.matrix(y) should return an \((N,m)\)-dimensional numeric matrix). Missing
values (NA, NaN and Inf) are not supported.
- tmpl
a model template which describes the model class, see model structures().
Note that only the case of (non-empty, square) state space or ARMA models is
implemented.
- th
Initial parameter estimate.
- which
(character string) determines which "likelihood" is to be used, see also ll().
The option "kf" is only supported for state space models.
- method, hessian, control
are passed on to the optimization routine stats::optim().
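As a quick illustration of the y argument (a sketch, not package code): any
object that as.matrix() coerces to an \((N,m)\)-dimensional numeric matrix is
accepted, e.g. a multivariate "ts" object.

# sketch: a multivariate "ts" object is a valid sample argument,
# since as.matrix() turns it into an (N, m) numeric matrix
y_ts = stats::ts(matrix(rnorm(300), nrow = 100, ncol = 3))
is.matrix(as.matrix(y_ts))  # TRUE, a (100, 3) numeric matrix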
Value
A list with components
- model
The estimated model.
- th
The corresponding vector of deep parameters.
- ll
The log likelihood of the estimated model.
- which
The type of likelihood used.
- counts, convergence, message, hessian
as returned by stats::optim().
Note
The optimization is carried out with the general-purpose routine
stats::optim(). An initial parameter estimate is needed.
The procedure does not enforce constraints such as stability or the
minimum-phase property.
The case of the conditional, concentrated likelihood is somewhat special. In this case the model template must have a particular structure: (1) the noise covariance is parametrized via its left Cholesky factor, and (2) the last \(m(m+1)/2\) components of the parameter vector \(\theta\) parametrize this left Cholesky factor, while the remaining components describe the system. (This implies that there is no overlap/dependency between the "system parameters" and the "noise parameters".) A sketch of this split is given below.
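For illustration, a minimal sketch of how such a parameter vector splits. This
is not package code; the number of system parameters and the column-wise fill
order of the Cholesky factor are assumptions.

# sketch: split a deep parameter vector of a "concentrated" template
# into system parameters and the left Cholesky factor (m = 3)
m = 3
n_sys = 10                             # hypothetical number of system parameters
th = rnorm(n_sys + m * (m + 1) / 2)    # hypothetical deep parameter vector
th_sys = th[1:n_sys]                   # system parameters
th_noise = th[-(1:n_sys)]              # last m(m+1)/2 entries: noise parameters
sigma_L = matrix(0, m, m)
sigma_L[lower.tri(sigma_L, diag = TRUE)] = th_noise  # assumed column-wise fill
Sigma = sigma_L %*% t(sigma_L)         # implied noise covariance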
Examples
# Generate a random state space model in echelon form (m = 3)
tmpl = tmpl_stsp_echelon(nu = c(2,1,1))
model = r_model(template = tmpl, bpoles = 1, bzeroes = 1, sd = 0.25)
diag(model$sigma_L) = 1 # scale the diagonal entries of sigma_L
print(model)
#> state space model [3,3] with s = 4 states
#> s[1] s[2] s[3] s[4] u[1] u[2]
#> s[1] 0.0000000 0.0000000 0.00000000 1.0000000 -0.12772670 -0.27866317
#> s[2] 0.3748673 -0.2992734 0.05335086 -0.2676605 0.09424537 -0.09885959
#> s[3] -0.1792097 0.4426172 0.14887576 -0.1453610 0.26086831 -0.15156408
#> s[4] -0.1153381 0.0291091 -0.33532175 0.1548390 0.01967283 0.12601084
#> x[1] 1.0000000 0.0000000 0.00000000 0.0000000 1.00000000 0.00000000
#> x[2] 0.0000000 1.0000000 0.00000000 0.0000000 0.00000000 1.00000000
#> x[3] 0.0000000 0.0000000 1.00000000 0.0000000 0.00000000 0.00000000
#> u[3]
#> s[1] 0.002798484
#> s[2] 0.162476956
#> s[3] -0.043841936
#> s[4] -0.196106240
#> x[1] 0.000000000
#> x[2] 0.000000000
#> x[3] 1.000000000
#> Left square root of noise covariance Sigma:
#> u[1] u[2] u[3]
#> u[1] 1.0000000 0.00000000 0
#> u[2] -0.4078598 1.00000000 0
#> u[3] -0.2039710 0.00163432 1
# extract the corresponding free/deep parameters
th = extract_theta(model, tmpl)
# generate a sample with 500 observations
y = sim(model, n.obs = 500, n.burn_in = 100)$y
# We are cheating here and use the true model parameters
# as starting values for the optimization routine:
# estimate the model with the "exact log likelihood"
out = est_ML(y, tmpl, th, which = 'kf')
KL_divergence(model, out$model)
#> [1] 0.009797793
# estimate the model with "conditional log likelihood"
out = est_ML(y, tmpl, th, which = 'conditional')
KL_divergence(model, out$model)
#> [1] 0.009813696
# estimate the model with "concentrated, conditional log likelihood"
out = est_ML(y, tmpl, th, which = 'concentrated')
KL_divergence(model, out$model)
#> [1] 0.009818641
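Since method, hessian and control are passed on to stats::optim(), one can also
request the hessian at the optimum. A hedged sketch (assuming the optimized
objective is the negative log likelihood, so that the inverse hessian
approximates the covariance of the deep parameter estimates):

# sketch: approximate standard errors from the returned hessian
out = est_ML(y, tmpl, th, which = 'conditional', hessian = TRUE)
se = sqrt(diag(solve(out$hessian)))  # approximate standard errors of theta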