Note: This equation sheet will be provided alongside the midterm exam. Outside copies of this or any other aids are not permitted.
Basics
\begin{align*} \mu_X(t) &= \mathbb{E}(X_t) \\ \gamma_X(s,t) &= \mathbb{E}\left[ \left(X_s - \mu_X(s) \right)(X_t - \mu_X(t)) \right] \\ \rho_X(s,t) &= \frac{\gamma_X(s,t)}{\sqrt{\gamma_X(s,s) \gamma_X(t,t)}} \\ \gamma_{X,Y}(s,t) &= \mathbb{E}\left[ \left(X_s - \mu_X(s) \right)(Y_t - \mu_Y(t)) \right] \\ \rho_{X,Y}(s,t) &= \frac{\gamma_{X,Y}(s,t)}{\sqrt{\gamma_X(s,s) \gamma_Y(t,t)}} \end{align*}
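These are the population quantities; as a quick illustration, here is a minimal NumPy sketch of their sample analogues for a single observed series (the function names are illustrative, not part of the sheet):

```python
import numpy as np

def sample_acvf(x, h):
    """Sample autocovariance at lag h (divides by n, the usual biased estimator)."""
    n, h = len(x), abs(h)
    xbar = x.mean()
    return np.sum((x[: n - h] - xbar) * (x[h:] - xbar)) / n

def sample_acf(x, h):
    """Sample autocorrelation at lag h."""
    return sample_acvf(x, h) / sample_acvf(x, 0)
```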
The Markov boundary of X is the union of its parents, its children, and the other parents of its children (its co-parents). Conditioned on its Markov boundary, X is independent of all other variables in the network.
Conditional Gaussian Equations
\begin{align*} \mathcal{N}(\mathbf{x} | \boldsymbol{\mu}, \boldsymbol{\Sigma}) &= \mathcal{N}\left(\begin{bmatrix} \mathbf{x}_a \\ \mathbf{x}_b \end{bmatrix} \middle| \begin{bmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{bmatrix} , \begin{bmatrix} \boldsymbol{\Sigma}_{aa} & \boldsymbol{\Sigma}_{ab} \\ \boldsymbol{\Sigma}_{ba} & \boldsymbol{\Sigma}_{bb} \end{bmatrix}\right)\\ p(\mathbf{x}_a | \mathbf{x}_b) &= \mathcal{N} \left( \mathbf{x}_a | \boldsymbol{\mu}_{a|b}, \boldsymbol{\Sigma}_{a|b} \right) \\ \boldsymbol{\mu}_{a|b} &= \boldsymbol{\mu}_a + \boldsymbol{\Sigma}_{ab}\boldsymbol{\Sigma}_{bb}^{-1}(\mathbf{x}_b - \boldsymbol{\mu}_b) \\ \boldsymbol{\Sigma}_{a|b} &= \boldsymbol{\Sigma}_{aa} - \boldsymbol{\Sigma}_{ab} \boldsymbol{\Sigma}_{bb}^{-1} \boldsymbol{\Sigma}_{ba} \end{align*}
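As an illustration, a minimal NumPy sketch of the conditional mean and covariance above (the function name and index-array interface are choices of this sketch):

```python
import numpy as np

def conditional_gaussian(mu, Sigma, a, b, x_b):
    """Return mu_{a|b} and Sigma_{a|b}; a and b are index lists
    selecting the two blocks of mu and Sigma."""
    gain = Sigma[np.ix_(a, b)] @ np.linalg.inv(Sigma[np.ix_(b, b)])
    mu_cond = mu[a] + gain @ (x_b - mu[b])
    Sigma_cond = Sigma[np.ix_(a, a)] - gain @ Sigma[np.ix_(b, a)]
    return mu_cond, Sigma_cond
```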
The solutions of a homogeneous linear difference equation of order p with constant coefficients:
u_n - \alpha_1 u_{n-1} - \ldots - \alpha_{p} u_{n-p} = 0
are given by:
u_n = z_1^{-n} P_1(n) + z_2^{-n} P_2(n) + \ldots + z_r^{-n} P_r(n),
where z_1, \ldots, z_r are the distinct roots of the associated polynomial \alpha(z) = 1 - \alpha_1 z - \ldots - \alpha_p z^p, r is the number of distinct roots, and P_j(n) is a polynomial in n of degree m_j - 1, where m_j is the multiplicity of the root z_j.
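For example, the order-2 equation:
u_n - u_{n-1} + \tfrac{1}{4} u_{n-2} = 0
has associated polynomial \alpha(z) = 1 - z + \tfrac{1}{4} z^2 = \left(1 - \tfrac{z}{2}\right)^2, whose single root z_1 = 2 has multiplicity m_1 = 2, so the general solution is:
u_n = 2^{-n} \left( c_1 + c_2 n \right).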
ARMA
An autoregressive model of order p – AR(p) – is a random process with the form:
X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \ldots + \phi_p X_{t-p} + W_t
where W_t is drawn from \mathcal{N}(0, \sigma_W^2) and \phi_1, \ldots, \phi_p are constants.
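As an illustration, a minimal NumPy sketch that samples an AR(p) path (the function name and burn-in scheme are choices of this sketch, not part of the sheet):

```python
import numpy as np

def simulate_ar(phi, sigma_w, n, burn=100):
    """Sample n points from an AR(p); phi = [phi_1, ..., phi_p].
    A burn-in period is discarded so the zero initial state washes out."""
    p = len(phi)
    w = np.random.normal(0.0, sigma_w, size=n + burn)
    x = np.zeros(n + burn)
    for t in range(p, n + burn):
        x[t] = sum(phi[j] * x[t - 1 - j] for j in range(p)) + w[t]
    return x[burn:]
```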
A moving average model of order q – MA(q) – is a random process with the form:
X_t = W_t + \theta_1 W_{t-1} + \theta_2 W_{t-2} + \ldots + \theta_q W_{t-q}
where W_t is drawn from \mathcal{N}(0, \sigma_W^2) and \theta_1, \ldots, \theta_q are constants.
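Similarly, a sketch that samples an MA(q) path by convolving the noise with the coefficients (again with illustrative names):

```python
import numpy as np

def simulate_ma(theta, sigma_w, n):
    """Sample n points from an MA(q); theta = [theta_1, ..., theta_q]."""
    q = len(theta)
    w = np.random.normal(0.0, sigma_w, size=n + q)
    coeffs = np.r_[1.0, theta]                 # theta_0 = 1
    # Entry t of the full convolution is sum_j coeffs[j] * w[t-j];
    # entries q, ..., q+n-1 use a complete window of q+1 noise terms.
    return np.convolve(w, coeffs)[q : q + n]
```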
The autoregressive operator, P(B), for an AR(p) model is defined as:
P(B) = 1 - \phi_1 B - \phi_2 B^2 - \ldots - \phi_p B^p,
such that:
P(B) X_t = W_t.
The moving average operator for an MA(q) process is defined as:
\Theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \ldots + \theta_q B^q,
such that:
X_t = \Theta(B) W_t.
An autoregressive moving average process of order (p, q) – ARMA(p,q) – is a random process with the form:
P(B)X_t = \Theta(B)W_t.
where W_t is drawn from \mathcal{N}(0, \sigma_W^2), \phi_1, \ldots, \phi_p and \theta_1, \ldots, \theta_q are constants, and both \theta_q \neq 0 and \phi_p \neq 0. The autoregressive and moving average polynomials, P(z) and \Theta(z), must not share any roots.
An ARMA(p,q) process is invertible if the time series can be written as:
\pi(B) X_t = \sum_{j=0}^\infty \pi_j X_{t-j} = W_t
with the infinite sum \sum_{j=0}^\infty |\pi_j| < \infty. We can determine \pi(B) as:
\pi(B) = \frac{P(B)}{\Theta(B)}.
The conditions for invertibility hold so long as the roots of \Theta(z) lie outside the unit circle.
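This root condition is easy to check numerically; a small sketch with hypothetical MA(2) coefficients (np.roots expects the highest power first):

```python
import numpy as np

theta = np.array([0.4, 0.2])               # hypothetical theta_1, theta_2
# Theta(z) = 1 + 0.4 z + 0.2 z^2
roots = np.roots(np.r_[theta[::-1], 1.0])
invertible = np.all(np.abs(roots) > 1)     # roots are -1 +/- 2i, so True
```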
An ARMA(p,q) process is causal if the time series can be written as:
X_t = \psi(B) W_t = \sum_{j=0}^\infty \psi_j W_{t-j}
with the infinite sum \sum_{j=0}^\infty |\psi_j| < \infty. We can determine \psi(B) as:
\psi(B) = \frac{\Theta(B)}{P(B)}.
The conditions for causality hold so long as the roots of P(z) lie outside the unit circle.
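As an illustration, the \psi_j can be computed by matching powers of B in P(B)\psi(B) = \Theta(B); a sketch with hypothetical ARMA(1,1) coefficients:

```python
import numpy as np

def psi_weights(phi, theta, m):
    """First m coefficients of psi(B) = Theta(B) / P(B), from the
    recursion psi_j = theta_j + sum_k phi_k psi_{j-k}, with psi_0 = 1."""
    p, q = len(phi), len(theta)
    psi = np.zeros(m)
    psi[0] = 1.0
    for j in range(1, m):
        psi[j] = theta[j - 1] if j <= q else 0.0
        for k in range(1, min(j, p) + 1):
            psi[j] += phi[k - 1] * psi[j - k]
    return psi

phi, theta = np.array([0.5]), np.array([0.4])  # hypothetical coefficients
ar_roots = np.roots(np.r_[-phi[::-1], 1.0])    # P(z) = 1 - 0.5 z, root z = 2
causal = np.all(np.abs(ar_roots) > 1)          # True
print(psi_weights(phi, theta, 5))              # [1.0, 0.9, 0.45, 0.225, 0.1125]
```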
The autocovariance function of an MA(q) process is:
\gamma(h) = \begin{cases} \sigma_W^2 \sum_{j=0}^{q-|h|} \theta_j \theta_{j+|h|} & |h| \leq q \\ 0 & |h|>q \end{cases},
with \theta_0 = 1.
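A direct NumPy transcription of this formula (illustrative function name):

```python
import numpy as np

def ma_acvf(theta, sigma_w, h):
    """Autocovariance of an MA(q) at lag h, with theta_0 = 1."""
    th = np.r_[1.0, theta]                 # theta_0, ..., theta_q
    q, h = len(theta), abs(h)
    if h > q:
        return 0.0
    return sigma_w**2 * np.sum(th[: q - h + 1] * th[h:])
```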
Linear Dynamical Systems
The Kalman filtering and smoothing equations are given by:
\begin{align*} \boldsymbol{\mu}_{t|t-1} &= \mathbf{A} \boldsymbol{\mu}_{t-1|t-1} \\ \boldsymbol{\Sigma}_{t|t-1} &= \mathbf{Q} + \mathbf{A} \boldsymbol{\Sigma}_{t-1|t-1}\mathbf{A}^T \\ \boldsymbol{\mu}_{t|t} &= \boldsymbol{\mu}_{t|t-1} + \mathbf{K}_t (\mathbf{x}_t - \mathbf{C}\boldsymbol{\mu}_{t|t-1}) \\ \boldsymbol{\Sigma}_{t|t} &= \boldsymbol{\Sigma}_{t|t-1} - \mathbf{K}_t \mathbf{C}\boldsymbol{\Sigma}_{t|t-1} \\ \mathbf{K}_t &= \boldsymbol{\Sigma}_{t|t-1} \mathbf{C}^T(\mathbf{C} \boldsymbol{\Sigma}_{t|t-1}\mathbf{C}^T + \mathbf{R})^{-1} \\ \boldsymbol{\mu}_{t|T} &= \boldsymbol{\mu}_{t|t} + \mathbf{F}_t(\boldsymbol{\mu}_{t+1|T} - \boldsymbol{\mu}_{t+1|t}) \\ \boldsymbol{\Sigma}_{t|T} &= \mathbf{F}_t (\boldsymbol{\Sigma}_{t+1|T} - \boldsymbol{\Sigma}_{t+1|t})\mathbf{F}_t^T + \boldsymbol{\Sigma}_{t|t} \\ \mathbf{F}_t &= \boldsymbol{\Sigma}_{t|t} \mathbf{A}^T \boldsymbol{\Sigma}_{t+1|t}^{-1} \end{align*}
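A minimal NumPy sketch of the forward (filtering) pass, assuming an initial filtered state (\boldsymbol{\mu}_{0|0}, \boldsymbol{\Sigma}_{0|0}) is given (the function name and interface are choices of this sketch):

```python
import numpy as np

def kalman_filter(xs, A, C, Q, R, mu0, Sigma0):
    """Run the filtering recursions above over observations xs (shape (T, d_obs)).
    Returns the filtered means mu_{t|t} and covariances Sigma_{t|t}."""
    T, d = len(xs), len(mu0)
    mus, Sigmas = np.zeros((T, d)), np.zeros((T, d, d))
    mu, Sigma = mu0, Sigma0
    for t in range(T):
        mu_pred = A @ mu                          # mu_{t|t-1}
        Sigma_pred = Q + A @ Sigma @ A.T          # Sigma_{t|t-1}
        K = Sigma_pred @ C.T @ np.linalg.inv(C @ Sigma_pred @ C.T + R)
        mu = mu_pred + K @ (xs[t] - C @ mu_pred)  # mu_{t|t}
        Sigma = Sigma_pred - K @ C @ Sigma_pred   # Sigma_{t|t}
        mus[t], Sigmas[t] = mu, Sigma
    return mus, Sigmas
```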
Prediction is given by:
\begin{align*} \boldsymbol{\mu}_{T+k|T} &=\mathbf{A} \boldsymbol{\mu}_{T+k-1|T} \\ \boldsymbol{\Sigma}_{T+k|T} &=\mathbf{A} \boldsymbol{\Sigma}_{T+k-1|T} \mathbf{A}^T + \mathbf{Q} \end{align*}
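And the corresponding k-step-ahead forecast, iterating the two prediction equations from the last filtered state (again an illustrative sketch):

```python
import numpy as np

def kalman_predict(A, Q, mu_T, Sigma_T, k):
    """k-step-ahead predictive mean and covariance from (mu_{T|T}, Sigma_{T|T})."""
    mu, Sigma = mu_T, Sigma_T
    for _ in range(k):
        mu = A @ mu
        Sigma = A @ Sigma @ A.T + Q
    return mu, Sigma
```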