Skip to contents

Introduction

This vignette is intended as an introduction to the usage of the kDGLM package, which offers routines for Bayesian analysis of Dynamic Generalized Linear Models, including fitting (filtering and smoothing), forecasting, sampling, intervention and automated monitoring, following the theory developed and/or explored in Kalman (1960), West and Harrison (1997) and Alves et al. (2024).

In this document we will focus exclusively in the usage of the package and will only briefly mention the theory behind these models and only with the intention of highlighting the notation. We highly recommend all users to read the theoretical work (dos Santos et al., 2024) in which we based this package.

This document is organized in the following order:

  1. First we introduce the notations and the class of models we will be dealing with;
  2. Next we present the details about the specification of the model structure, offering tools that allow for an easy, fast and (hopefully) intuitive way of defining models;
  3. In the following section we discuss details about how the user can specify the observational model;
  4. Then we present some basic examples of model fitting, also showing the auxiliary functions that help the user to analyse the fitted model. We also show tools for easy model selection;
  5. Lastly, we present a variety of advanced examples, combining the basic features shown in previous sections to create more complex models;

Notation

In this section, we assume the user’s interest lies in analyzing a Time Series {Yt}t=1T\{\vec{Y}_t\}_{t=1}^T, which adheres to the model described by :

Yt|ηt(ηt),g(ηt)=λt=Ftθt,θt=Gtθt1+ωt,ωt𝒩n(ht,Wt),\begin{equation} \begin{aligned} \vec{Y}_t|\vec{\eta}_t &\sim \mathcal{F}\left(\vec{\eta}_t\right),\\ g(\vec{\eta}_t) &=\vec{\lambda}_{t}=F_t'\vec{\theta}_t,\\ \vec{\theta}_t&=G_t\vec{\theta}_{t-1}+\vec{\omega}_t,\\ \vec{\omega}_t &\sim \mathcal{N}_n(\vec{h}_t,W_t), \end{aligned} \end{equation}

The model comprises:

  • Yt=(Y1,t,...,Yr,t)\vec{Y}_t=(Y_{1,t},...,Y_{r,t})', the outcome, is an rr-dimensional vector of observed variable.
  • θt=(θ1,t,...,θn,t)\vec{\theta}_t=(\theta_{1,t},...,\theta_{n,t})', representing the unknown parameters (latent states), is an nn-dimensional vector, consistently dimensioned across observations.
  • λt=(λ1,t,...,λk,t)\vec{\lambda}_t=(\lambda_{1,t},...,\lambda_{k,t})', the linear predictors, is a kk-dimensional vector indicating the linear transformation of the latent states. As per , λt\vec{\lambda}_t is assumed to be (approximately) Normally distributed at all times and directly corresponds to the observational parameters ηt\vec{\eta}_t, through a one-to-one correspondence gg.
  • ηt=(η1,t,...,ηl,t)\vec{\eta}_t=(\eta_{1,t},...,\eta_{l,t})', the observational parameters, is an ll-dimensional vector defining the model’s observational aspects. Typically, l=kl=k, but this may not hold in some special cases, such as in the Multinomial model, where k=l1k=l-1.
  • \mathcal{F}, a distribution from the Exponential Family indexed by ηt\vec{\eta}_t. Pre-determines the values kk and ll, along with the link function gg.
  • gg, the link function, establishes a one-to-one correspondence between λt\vec{\lambda}_t and ηt\vec{\eta}_t.
  • FtF_t, the design matrix, is a user-defined, mostly known, matrix of size k×nk \times n.
  • GtG_t, the evolution matrix, is a user-defined, mostly known, matrix of size n×nn \times n.
  • ht=(h1,t,...,hn,t)\vec{h}_t=(h_{1,t},...,h_{n,t})', the drift, is a known nn-dimensional vector, typically set to 0\vec{0} except for model interventions (refer to subsection ).
  • WtW_t, a known covariance matrix of size n×nn \times n, is specified by the user.

Per , we define 𝒟t\mathcal{D}_t as the cumulative information after observing the first tt data points, with 𝒟0\mathcal{D}_0 denoting pre-observation knowledge of the process {Yt}t=1T\{Y_t\}^T_{t=1}.

The specification of WtW_t follows , section 6.3, where Wt=Var[Gtθt1|𝒟t1](1Dt)Dt+HtW_t=Var[G_t\theta_{t-1}|\mathcal{D}_{t-1}] \odot (1-D_t) \oslash D_t + H_t. Here, DtD_t (the discount matrix) is an n×nn \times n matrix with values between 00 and 11, \odot represents the Hadamard product, and \oslash signifies Hadamard division. HtH_t is another known n×nn \times n matrix specified by the user. This formulation implies that if DtD_t entries are all 11, and HtH_t entries are all 00, the model equates to a Generalized Linear Model.

A prototypical example within the general model framework is the Poisson model augmented with a dynamic level featuring linear growth and a single covariate XX:

Yt|ηt𝒫(ηt),ln(ηt)=λt=μt+βtXt,μt=μt1+βt1+ωμ,t,νt=νt1+ων,t,βt=βt1+ωβ,t,ωμ,t,ων,t,ωβ,t𝒩3(0,Wt),\begin{equation} \begin{aligned} Y_t|\eta_t &\sim \mathcal{P}\left(\eta_t\right),\\ ln(\eta_t) &=\lambda_{t}=\mu_t+\beta_t X_t,\\ \mu_t&=\mu_{t-1}+\beta_{t-1}+\omega_{\mu,t},\\ \nu_t&=\nu_{t-1}+\omega_{\nu,t},\\ \beta_t&=\beta_{t-1}+\omega_{\beta,t},\\ \omega_{\mu,t},\omega_{\nu,t},\omega_{\beta,t} &\sim \mathcal{N}_3(\vec{0},W_t), \end{aligned} \end{equation}

In this model, \mathcal{F} denotes the Poisson distribution; the model dimensions are r=k=l=1r=k=l=1; the state vector θt\theta_t is (μt,νt,βt)(\mu_t,\nu_t,\beta_t)' with dimension n=3n=3; the link function gg is the natural logarithm; and the matrices FtF_t and GtG_t are defined as:

Ft=[10Xt]Gt=[110010001] F_t=\begin{bmatrix} 1 \\ 0 \\ X_t \end{bmatrix} \quad G_t=\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

Consider now a Normal model with unknown mean η1,t\eta_{1,t} and unknown precision η2,t\eta_{2,t}:

Yt|ηt𝒩(η1,t,η2,t1),η1,t=λ1,t=μ1,t+βtXt,ln(η2,t)=λ2,t=μ2,t,μ1,t=μ1,t1+βt1+ωμ1,t,νt=νt1+ων,t,βt=βt1+ωβ,t,μ2,t=ϕt1μ2,t1+ωμ2,t,ϕt=ϕt1+ωϕ,t,ωμ1,t,ων,t,ωμ,t,ωβ,t,ωϕ,t𝒩5(0,Wt),\begin{equation} \begin{aligned} Y_t|\eta_t &\sim \mathcal{N}\left(\eta_{1,t},\eta_{2,t}^{-1}\right),\\ \eta_{1,t} &=\lambda_{1,t}=\mu_{1,t}+\beta_t X_t,\\ ln(\eta_{2,t}) &=\lambda_{2,t}=\mu_{2,t},\\ \mu_{1,t}&=\mu_{1,t-1}+\beta_{t-1}+\omega_{\mu_1,t},\\ \nu_t&=\nu_{t-1}+\omega_{\nu,t},\\ \beta_t&=\beta_{t-1}+\omega_{\beta,t},\\ \mu_{2,t}&=\phi_{t-1}\mu_{2,t-1}+\omega_{\mu_2,t},\\ \phi_{t}&=\phi_{t-1}+\omega_{\phi,t},\\ \omega_{\mu_1,t},\omega_{\nu,t},\omega_{\mu,t},\omega_{\beta,t},\omega_{\phi,t} &\sim \mathcal{N}_5(\vec{0},W_t), \end{aligned} \end{equation}

For this case, \mathcal{F} represents the Normal distribution; the model dimensions are r=1r=1 and k=l=2k=l=2; the state vector θt\theta_t is (μ1,t,νt,βt,μ2,t,ϕt)(\mu_{1,t},\nu_t,\beta_t,\mu_{2,t},\phi_t)' with dimension n=5n=5; the link function gg and matrices FtF_t, GtG_t are:

g([x1x2])=[x1ln(x2)]Ft=[1000Xt00100]Gt=[11000010000010000¨0ϕ000¨001] g\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)= \begin{bmatrix} x_1 \\ \ln(x_2) \end{bmatrix}\quad F_t=\begin{bmatrix} 1 & 0 \\ 0 & 0\\ X_t & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \quad G_t=\begin{bmatrix} 1 & 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0¨& 0 & \phi & 0\\ 0 & 0¨& 0 & 0 & 1 \end{bmatrix}

This configuration introduces l=2l=2 observational parameters, necessitating k=2k=2 linear predictors. The first linear predictor pertains to the location parameter of the Normal distribution and includes a linear growth model and the covariate XtX_t. The second linear predictor, associated with the precision parameter, models log precision as an autoregressive (AR) process. We express this model in terms of an Extended Kalman Filter (Kalman, 1960; West and Harrison, 1997). This formulation aligns with the concept of a traditional Stochastic Volatility model, as highlighted by Alves et al. (2024).

Both the Normal and Poisson models illustrate univariate cases. However, the general model also accommodates multivariate outcomes, such as in the multinomial case. Consider a vector of counts Yt=(Y1,t,Y2,t,Y3,t,Y4,t,Y5,t)\vec{Y}_t=(Y_{1,t},Y_{2,t},Y_{3,t},Y_{4,t},Y_{5,t})', with Yi,TY_{i,T} \in \mathbb{Z} and Nt=i=15Yi,tN_t=\sum_{i=1}^{5}Y_{i,t}. The model is:

Y5,t|Nt,ηtMultinomial(Nt,ηt),ln(ηi,tη5,t)=λ1,t=μi,t,i=1,...,4μi,t=μi,t1+ωμi,t,i=1,...,4ωμ1,t,ωμ2,t,ωμ3,t,ωμ4,t𝒩4(0,Wt),\begin{equation} \begin{aligned} \vec{Y}_{5,t}|N_t,\vec{\eta}_{t} &\sim Multinomial\left(N_t,\vec{\eta}_{t}\right),\\ \ln\left(\frac{\eta_{i,t}}{\eta_{5,t}}\right) &=\lambda_{1,t}=\mu_{i,t},i=1,...,4\\ \mu_{i,t}&=\mu_{i,t-1}+\omega_{\mu_i,t},i=1,...,4\\ \omega_{\mu_1,t},\omega_{\mu_2,t},\omega_{\mu_3,t},\omega_{\mu_4,t} &\sim \mathcal{N}_4(\vec{0},W_t), \end{aligned} \end{equation}

In this multinomial model, \mathcal{F} is the Multinomial distribution; the model dimensions are r=5r=5, l=5l=5 and k=4k=4; the state vector θt\theta_t is (μ1,t,μ2,t,μ3,t,μ4,t)(\mu_{1,t},\mu_{2,t},\mu_{3,t},\mu_{4,t})'; FtF_t and GtG_t are identity matrices of size 4×44\times 4; and the link function gg maps 5\mathbb{R}^5 in 4\mathbb{R}^{4} as:

g([x1x2x3x4x5])=[ln(x1x5)ln(x2x5)ln(x3x5)ln(x4x5)] g\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}\right)= \begin{bmatrix} \ln\left(\frac{x_1}{x_5}\right) \\ \ln\left(\frac{x_2}{x_5}\right) \\ \ln\left(\frac{x_3}{x_5}\right) \\ \ln\left(\frac{x_4}{x_5}\right) \end{bmatrix}

Note that in the Multinomial distribuition, ηi0,i\eta_i\ge 0, \forall i and i=15ηi=1\sum_{i=1}^{5} \eta_i=1. Thus, only k=l1k=l-1 linear predictors are necessary to describe this model.

It’s important to emphasize that while we have chosen to illustrate simple model structures, such as a random walk in the log odds for each outcome, neither the general model framework nor the kDGLM package restricts to these configurations. Analysts have the flexibility to tailor models to their specific contexts, including the incorporation of additional latent states to enhance outcome explanation.

Lastly, this general model framework can be extended to encompass multiple outcome models. For further details, see Handling multiple outcomes.

Given the complexity of manually specifying all model components, the kDGLM package includes a range of auxiliary functions to simplify this process. The subsequent section delves into these tools.

Single outcome models

The kDGLM package supports the joint modelling of multiple time series, each with it own structure and distribution (see Handling multiple outcomes). This flexibility comes with a somewhat complex syntax, designed to allow analysts to meticulously define every aspect of the model. While we have aimed to create an intuitive yet powerful syntax, we recognize that it may feel overwhelming for new users. To address this, the kDGLM package also provides a simplified syntax, similar to the lm and glm functions native to R. This simplified approach supports single-outcome models with any of the supported distributions, while still allowing for complex dynamic structures in all parameters of the observational distribution.

Lets consider the classic airline data, which is comprised of monthly totals of international airline passengers, from 1949 to 1960. We can adjust a simple Time Series model using the following code:

fitted.data <- kdglm(c(AirPassengers) ~ 1, family = Poisson)
plot(fitted.data)

Detail about the plot method can be found in Filtering and smoothing, for now we focus only on the usage of the kdglm function. By the default, the intercept of a model is considered dynamic in a discount factor of 0.950.95. One can specify the details of the intercept using the pol function inside the formula:

fitted.data <- kdglm(c(AirPassengers) ~ pol(D = 0.99), family = Poisson)
plot(fitted.data)

One can also add more complex structures to the model using the functions harhar, regreg, ARAR, TFTF and noisenoise:

fitted.data <- kdglm(c(AirPassengers) ~ pol(D = 0.99, order = 2) + har(period = 12) + noise(R1 = 0.01), family = Poisson)
plot(fitted.data)

For more details, see Structures.

For dynamic regressions, the kdglm package adopts the same convetions as the lm and glm functions:

# Total number of cases
chickenPox$Total <- rowSums(chickenPox[, c(2, 3, 4, 6, 5)])
# Indicator of the introcution of the chicken pox vaccine to the national program of immunization
chickenPox$Vaccine <- chickenPox$date >= as.Date("2013-09-01")

fitted.data <- kdglm(`< 5 year` ~ pol(2, D = 0.95) + har(12, D = 0.975) + Vaccine,
  N = chickenPox$Total,
  family = Multinom,
  data = chickenPox
)
plot(fitted.data, plot.pkg = "base")

For the multinomial case, where we the outcome of the model is a vector, the user may include several formulas, one for each index of the outcome vector:

fitted.data <- kdglm(
  `< 5 year` ~ pol(2, D = 0.95) + har(12, D = 0.975) + Vaccine,
  `10 to 14 years` ~ pol(2, D = 0.95) + har(12, D = 0.975) + Vaccine,
  `15 to 49 years` ~ pol(2, D = 0.95) + har(12, D = 0.975) + Vaccine,
  `50 years or more` ~ pol(2, D = 0.95) + har(12, D = 0.975) + Vaccine,
  N = chickenPox$Total,
  family = Multinom,
  data = chickenPox
)
plot(fitted.data, plot.pkg = "base")

Lastly, some outcomes may require extra arguments for the parameters of the observational model. For instance, the kdglm package allows for dynamic structure for all parameters of a model, so for the Normal family, the user may include dynamic structure for both mean and variance:

fitted.data <- kdglm(corn.log.return ~ 1,
  V = ~1,
  family = Normal,
  data = cornWheat[1:500, ]
)
plot(fitted.data, plot.pkg = "base")

For more details about each outcome see Outcomes.

References

Alves, M. B., Migon, H. S., Marotta, R., and dos Santos, S. V., Junior. (2024). K-parametric dynamic generalized linear models: A sequential approach via information geometry.
dos Santos, S. V., Junior, Alves, M. B., and Migon, H. S. (2024). kDGLM: An r package for bayesian analysis of dynamic generialized linear models.
Kalman, R. E. (1960). A new approach to linear filtering and prediction problems. Transactions of the ASME–Journal of Basic Engineering, 82(Series D), 35–45.
West, M., and Harrison, J. (1997). Bayesian forecasting and dynamic models (springer series in statistics). Hardcover; Springer-Verlag.