L05d Sequence Modeling

Materials are adapted from Dixon, Matthew F., Igor Halperin, and Paul Bilokon. Machine Learning in Finance. Springer International Publishing, 2020. This handout is for teaching only. DO NOT DISTRIBUTE.

Chapter Objectives
By the end of this chapter, the reader should expect to accomplish the following:

Autoregressive Modeling

Preliminaries

Before we can build a model to predict Y_t, we recall some basic definitions and terminology, starting in a continuous-time setting and thereafter working solely in discrete time.

Stochastic Process
A stochastic process is a sequence of random variables, indexed by continuous time: \left\{Y_{t}\right\}_{t=-\infty}^{\infty}.

Time Series
A time series is a sequence of observations of a stochastic process at discrete times over a specific interval: \left\{y_{t}\right\}_{t=1}^{n}.

Autocovariance
The j-th autocovariance of a time series is
\gamma_{j t}:=\mathbb{E}\left[\left(y_{t}-\mu_{t}\right)\left(y_{t-j}-\mu_{t-j}\right)\right]
where \mu_{t}:=\mathbb{E}\left[y_{t}\right].

Covariance (Weak) Stationarity
A time series is weakly (or wide-sense) covariance stationary if its mean and autocovariances of all orders are constant in time:
\begin{aligned} \mu_{t} &=\mu, & \forall t \\ \gamma_{j t} &=\gamma_{j}, & \forall t . \end{aligned}

Autocorrelation
The j-th autocorrelation, \tau_{j}, is simply the j-th autocovariance divided by the variance:
\tau_{j}=\frac{\gamma_{j}}{\gamma_{0}}.

White Noise
White noise, \epsilon_{t}, is an i.i.d. error process which satisfies all three of the following conditions:
a. \mathbb{E}\left[\epsilon_{t}\right]=0, \forall t;
b. \mathbb{V}\left[\epsilon_{t}\right]=\sigma^{2}, \forall t; and
c. \epsilon_{t} and \epsilon_{s} are independent for t \neq s, \forall t, s.
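The sample analogues of these definitions are straightforward to compute. Below is a minimal numpy sketch (the function names are ours, not from the text) that estimates the j-th autocovariance and autocorrelation and checks that simulated white noise has autocorrelations near zero:

```python
import numpy as np

def autocovariance(y, j):
    # Sample j-th autocovariance: average of (y_t - ybar)(y_{t-j} - ybar).
    y = np.asarray(y, dtype=float)
    ybar = y.mean()
    n = len(y)
    return np.sum((y[j:] - ybar) * (y[: n - j] - ybar)) / n

def autocorrelation(y, j):
    # tau_j = gamma_j / gamma_0.
    return autocovariance(y, j) / autocovariance(y, 0)

# White noise should have near-zero autocorrelation at every lag j >= 1.
rng = np.random.default_rng(0)
eps = rng.normal(0.0, 1.0, size=5000)
print([round(autocorrelation(eps, j), 3) for j in range(1, 4)])
```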


Autoregressive Processes

AR(p) Process
The p-th order autoregressive process of a variable Y_t depends only on the previous values of the variable plus a white noise disturbance term:
y_{t}=\mu+\sum_{i=1}^{p} \phi_{i} y_{t-i}+\epsilon_{t},
where \epsilon_{t} is independent of \left\{y_{t-i}\right\}_{i=1}^{p}. We refer to \mu as the drift term and to p as the order of the model.
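As an illustration, here is a minimal sketch, assuming statsmodels is available, that simulates an AR(2) process with hypothetical parameters and recovers them with AutoReg:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg  # assumes statsmodels is installed

rng = np.random.default_rng(42)
mu, phi1, phi2 = 0.5, 0.6, -0.3  # hypothetical drift and AR(2) coefficients
n, burn = 2000, 200
eps = rng.normal(size=n + burn)
y = np.zeros(n + burn)
for t in range(2, n + burn):
    y[t] = mu + phi1 * y[t - 1] + phi2 * y[t - 2] + eps[t]
y = y[burn:]  # drop the burn-in so the series starts near stationarity

res = AutoReg(y, lags=2, trend="c").fit()  # estimate mu, phi_1, phi_2
print(res.params)  # should be close to [0.5, 0.6, -0.3]
```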


Stability

Stability concerns whether past disturbances exhibit an increasing or a declining impact on the current value of y_t as the lag increases.


Stationarity

A sufficient condition for the autocorrelation function of an AR(p) model to converge to zero as the lag increases is stationarity. Writing the characteristic polynomial in terms of its roots \lambda_1, \ldots, \lambda_p,

\Phi(z)=\left(1-\frac{z}{\lambda_{1}}\right) \cdot\left(1-\frac{z}{\lambda_{2}}\right) \cdot \ldots \cdot\left(1-\frac{z}{\lambda_{p}}\right)=0,

the process is covariance stationary when every root lies outside the unit circle, |\lambda_i| > 1.

Stationarity of Random Walk
We can show that the following random walk (zero-mean AR(1) process) is not strictly stationary:
y_{t}=y_{t-1}+\epsilon_{t}.
Written in compact form, this gives
\Phi(L)\left[y_{t}\right]=\epsilon_{t}, \quad \Phi(L)=1-L,
and the characteristic polynomial, \Phi(z)=1-z=0, has the real root z=1. Hence the root lies on the unit circle and the model is a special case of nonstationarity: substituting recursively gives y_t = y_0 + \sum_{i=1}^{t} \epsilon_i, so \mathbb{V}\left[y_t \mid y_0\right] = t\sigma^2 grows with t and the distribution of y_t depends on t.
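This root condition is easy to check numerically. Here is a minimal numpy sketch (the function name is ours) that computes the roots of \Phi(z) for a given coefficient vector (\phi_1, \ldots, \phi_p):

```python
import numpy as np

def ar_char_roots(phi):
    # Roots of Phi(z) = 1 - phi_1 z - ... - phi_p z^p; the AR(p) model is
    # covariance stationary iff every root has modulus strictly greater than 1.
    coeffs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))  # ascending powers
    return np.polynomial.Polynomial(coeffs).roots()

print(np.abs(ar_char_roots([0.6, -0.3])))  # moduli > 1 -> stationary AR(2)
print(np.abs(ar_char_roots([1.0])))        # random walk: root at z = 1 (unit root)
```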


Partial Autocorrelations

Partial Autocorrelation
A partial autocorrelation at lag h \geq 2 is a conditional autocorrelation between a variable, y_{t}, and its h-th lag, y_{t-h}, under the assumption that the values of the intermediate lags, y_{t-1}, \ldots, y_{t-h+1}, are controlled for:
\tilde{\tau}_{h}:=\tilde{\tau}_{t, t-h}:=\frac{\tilde{\gamma}_{h}}{\sqrt{\tilde{\gamma}_{t, h}} \sqrt{\tilde{\gamma}_{t-h, h}}},
where
\tilde{\gamma}_{h}:=\tilde{\gamma}_{t, t-h}:=\mathbb{E}\left[\left(y_{t}-P\left(y_{t} \mid y_{t-1}, \ldots, y_{t-h+1}\right)\right)\left(y_{t-h}-P\left(y_{t-h} \mid y_{t-1}, \ldots, y_{t-h+1}\right)\right)\right]
is the lag-h partial autocovariance, P(W \mid Z) is an orthogonal projection of W onto the set Z, and
\tilde{\gamma}_{t, h}:=\mathbb{E}\left[\left(y_{t}-P\left(y_{t} \mid y_{t-1}, \ldots, y_{t-h+1}\right)\right)^{2}\right].
The partial autocorrelation function \tilde{\tau}_{h}: \mathbb{N} \rightarrow[-1,1] is the map h \mapsto \tilde{\tau}_{h}. The plot of \tilde{\tau}_{h} against h is referred to as the partial correlogram.
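In practice the correlogram and partial correlogram are computed with library routines. A minimal sketch, assuming statsmodels, on the same hypothetical AR(2) process simulated earlier; for an AR(p) process the partial correlogram cuts off after lag p while the correlogram decays gradually, which is what makes it useful for identifying p:

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf  # assumes statsmodels is installed

# Simulate the same hypothetical AR(2) process as in the earlier sketch.
rng = np.random.default_rng(42)
y = np.zeros(2200)
eps = rng.normal(size=2200)
for t in range(2, 2200):
    y[t] = 0.5 + 0.6 * y[t - 1] - 0.3 * y[t - 2] + eps[t]
y = y[200:]

print(np.round(acf(y, nlags=5), 3))   # correlogram: decays gradually
print(np.round(pacf(y, nlags=5), 3))  # partial correlogram: cuts off after lag 2
```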


Maximum Likelihood Estimation

Heteroscedasticity

Moving Average Processes

MA(q) Process
The q-th order moving average process is a linear combination of the white noise process \left\{\epsilon_{t-i}\right\}_{i=0}^{q}, \forall t:
y_{t}=\mu+\sum_{i=1}^{q} \theta_{i} \epsilon_{t-i}+\epsilon_{t}.
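A minimal sketch, assuming statsmodels, that simulates an MA(2) process with hypothetical parameters and recovers them by maximum likelihood:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA  # assumes statsmodels is installed

rng = np.random.default_rng(7)
mu, theta1, theta2 = 0.2, 0.7, 0.25  # hypothetical MA(2) parameters
n = 2000
eps = rng.normal(size=n + 2)
# y_t = mu + eps_t + theta_1 * eps_{t-1} + theta_2 * eps_{t-2}
y = mu + eps[2:] + theta1 * eps[1:-1] + theta2 * eps[:-2]

res = ARIMA(y, order=(0, 0, 2)).fit()  # ARMA with p=0, d=0, q=2
print(res.params)  # close to [0.2, 0.7, 0.25, 1.0 (sigma^2)]
```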

GARCH

In a GARCH(p, q) model, the conditional variance of the disturbance \epsilon_t given the information set \Omega_{t-1} depends on both past squared disturbances and past conditional variances:
\sigma_{t}^{2}:=\mathbb{E}\left[\epsilon_{t}^{2} \mid \Omega_{t-1}\right]=\alpha_{0}+\sum_{i=1}^{q} \alpha_{i} \epsilon_{t-i}^{2}+\sum_{i=1}^{p} \beta_{i} \sigma_{t-i}^{2}.
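A minimal numpy sketch of this recursion for a GARCH(1,1) model with hypothetical parameters (chosen so that \alpha_1 + \beta_1 < 1, which keeps the unconditional variance \alpha_0 / (1 - \alpha_1 - \beta_1) finite):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha0, alpha1, beta1 = 0.1, 0.1, 0.85  # hypothetical GARCH(1,1) parameters
n = 2000
eps = np.empty(n)
sigma2 = np.empty(n)
sigma2[0] = alpha0 / (1.0 - alpha1 - beta1)  # start at the unconditional variance
eps[0] = np.sqrt(sigma2[0]) * rng.normal()
for t in range(1, n):
    # sigma_t^2 = alpha0 + alpha1 * eps_{t-1}^2 + beta1 * sigma_{t-1}^2
    sigma2[t] = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * rng.normal()  # eps_t = sigma_t z_t, z_t ~ N(0, 1)
```

In practice the parameters are estimated by maximum likelihood rather than set by hand, e.g., with the arch package's arch_model.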

Exponential Smoothing
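As a reference point for this section, a minimal sketch of simple exponential smoothing (the standard recursion; the function name is ours, and this smoothing \alpha is unrelated to the GARCH coefficients above):

```python
import numpy as np

def exponential_smoothing(y, alpha):
    # s_t = alpha * y_t + (1 - alpha) * s_{t-1}: an exponentially weighted
    # average that discounts older observations geometrically (0 < alpha <= 1).
    s = np.empty(len(y))
    s[0] = y[0]
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1 - alpha) * s[t - 1]
    return s
```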

Fitting Time Series Models: The Box-Jenkins Approach

The three basic steps of the Box-Jenkins modeling approach:

Stationarity

Transformation to Ensure Stationarity

Identification

Partial Correlogram

Information Criterion

Model Diagnostics

A short summary of some of the most useful diagnostic tests for time series modeling in finance:

| Name | Description |
| --- | --- |
| Chi-squared test | Used to determine whether the confusion matrix of a classifier is statistically significant, or merely white noise |
| t-test | Used to determine whether the outputs of two separate regression models are statistically different on i.i.d. data |
| Diebold-Mariano test | Used to determine whether the outputs of two separate time series models are statistically different |
| ARCH test | Engle's ARCH test is constructed on the property that if the residuals are heteroscedastic, the squared residuals are autocorrelated. The Ljung-Box test is then applied to the squared residuals |
| Portmanteau test | A general test for whether the error in a time series model is autocorrelated. Example tests include the Ljung-Box and the Box-Pierce tests |
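A minimal sketch of two of these diagnostics, assuming statsmodels; in practice resid would be the residuals of a fitted model (e.g., res.resid), here replaced by simulated white noise:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox, het_arch

rng = np.random.default_rng(3)
resid = rng.normal(size=1000)  # stand-in for fitted-model residuals

# Portmanteau (Ljung-Box) test: H0 = no autocorrelation up to the given lag.
print(acorr_ljungbox(resid, lags=[10]))

# Engle's ARCH test: applies the Ljung-Box logic to the squared residuals;
# H0 = no ARCH effects (homoscedastic residuals).
lm_stat, lm_pval, f_stat, f_pval = het_arch(resid, nlags=10)
print(lm_stat, lm_pval)
```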

Time Series Cross-Validation
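Unlike standard k-fold cross-validation, time series cross-validation must respect temporal ordering: each validation block follows its training window, so the model is never trained on the future. A minimal sketch of such walk-forward splits, assuming scikit-learn is available:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit  # assumes scikit-learn

# Each fold trains on an expanding window of past observations and
# validates on the block that immediately follows it.
y = np.arange(12)
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(y):
    print("train:", train_idx, "test:", test_idx)
```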



Summary

We have covered the following objectives: