This exercise is part of the ECLR page.
Autoregressive processes are a basic building block of much of time-series econometrics. In this walkthrough we shall learn how to simulate such processes. This will also help build a good basic understanding of their properties.
As you will see, there is a very useful function that helps with simulating these processes, but we shall also use this opportunity to learn how to simulate processes yourself. This will be important for non-standard processes.
A general autoregressive process is represented by the following model
\[\begin{equation} v_t = \alpha + \beta t + \gamma_1 v_{t-1} + \gamma_2 v_{t-2} + ... + \gamma_p v_{t-p} + w_t \end{equation}\]
where \(w_t\) is a white noise process satisfying \(E[w_t] = 0\), \(Var[w_t] = \sigma^2\) and \(Cov[w_t,w_{t-k}] = 0\) for all \(k > 0\).
The above nests a number of well-known specific processes:
Process name for \(v_t\) | Parameter restrictions |
---|---|
White noise | \(\alpha = \beta = \gamma_i = 0\) for all \(i\) |
Deterministic trend model | \(\gamma_i = 0\) for all \(i\) |
Random walk | \(\gamma_1 = 1\); \(\alpha = \beta = \gamma_i = 0\) for all \(i>1\) |
Random walk with drift | \(\gamma_1 = 1\); \(\beta = \gamma_i = 0\) for all \(i>1\) |
AR(1) model | \(\beta = \gamma_i = 0\) for all \(i>1\) |
AR(p) model | \(\beta = 0\) |
AR(p) model with deterministic trend | \(\beta \ne 0\) |
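To see how the table translates into code, here is a minimal sketch (the parameter values and object names are illustrative choices of ours, not taken from the walkthrough) that simulates a random walk with drift by imposing \(\gamma_1 = 1\) and \(\beta = \gamma_i = 0\) for \(i > 1\) in the general recursion:

```r
# Random walk with drift: v_t = alpha + v_{t-1} + w_t
# (gamma_1 = 1, all other coefficients 0; alpha is the drift)
set.seed(123)        # makes the simulated path reproducible
T <- 200
alpha <- 0.5         # drift term (illustrative value)
v <- numeric(T + 1)  # v[1] holds the starting value v_0 = 0
w <- rnorm(T + 1)    # white noise innovations, w_t ~ N(0,1)
for (t in 2:(T + 1)) {
  v[t] <- alpha + v[t - 1] + w[t]
}
plot(v, type = "l")  # on average the path trends upwards at rate alpha
```

The other rows of the table are obtained in the same way by switching the relevant coefficients on or off.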
We will now learn how to simulate such processes.
An AR(1) process
\[\begin{equation} v_t = \alpha + \gamma_1 v_{t-1} + w_t \end{equation}\]
requires a number of pieces of information before we can simulate it. We need parameter values (\(\alpha\) and \(\gamma_1\)). For now we shall set these to \(\alpha=0\) and \(\gamma_1=0.8\). We also need random terms \(w_t\). In order to draw the latter we need to know what distribution to draw them from. For now we shall assume \(w_t \sim N(0,1)\), although one could choose different distributions as well. Finally, you need to decide how many observations to simulate, i.e. the value of \(T\), with \(t=1,...,T\).
In order to simulate the first observation \(v_1\) we need to know \(v_0\), the so-called starting value. This is often set to 0. However, as you will soon see, this may not always be an appropriate choice.
# Simulation parameters
T = 1000
alpha = 0
gamma = 0.8
v0 = 0
# Draw random terms, we draw T+1 to make life easier below
w = rnorm(T+1, 0, 1) # draws T+1 values from the standard normal distribution
# store the simulated series here, T+1 elements as the first is v0
v = matrix(v0,T+1,1)
The last line created a container in which we save the simulated series. For now all elements take the value \(v_0\), but soon we will replace the values in cells 2 to \(T+1\) with simulated values following the above AR(1) process.
for (t in seq(2,T+1)){
v[t] = alpha + gamma * v[t-1] + w[t]
}
You can see that inside the loop we basically implemented the AR(1) process.
Why does the loop start with t=2? Because the first element of v holds the starting value \(v_0\); the first simulated observation, \(v_1\), is therefore stored in v[2].
Now we can plot this series.
plot(v, type = "l") # type = "l" produces a line graph
This is your first simulated autoregressive series.
Experiment with the values the parameters take.
As \(\gamma_1\) gets closer and closer to 0 (from the initial \(\gamma_1 = 0.8\)), the process looks more and more like white noise.
As \(\gamma_1\) gets closer and closer to 1 (from the initial \(\gamma_1 = 0.8\)), the process becomes more and more persistent and starts to resemble a random walk.
Compare two simulated paths, one with \(\gamma_1 = 0.97\) and another with \(\gamma_1 = 1\).
When \(|\gamma_1|<1\) the process is stationary and always returns to its unconditional mean, here 0. When \(\gamma_1 = 1\) we have a random walk process which has no tendency to return to its unconditional mean.
With \(\gamma_1 = 1.1\) the process is explosive: \(|\gamma_1|>1\) implies nonstationarity and simulated paths diverge very quickly.
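The stationarity claim can be backed up numerically. The following sketch (our own addition, not part of the original walkthrough) compares the sample variance of a long simulated AR(1) path with the theoretical unconditional variance \(\sigma^2/(1-\gamma_1^2)\), which for \(\gamma_1 = 0.8\) and \(\sigma^2 = 1\) is \(1/(1-0.64) \approx 2.78\):

```r
# Simulate a long stationary AR(1) path and compare its sample variance
# with the theoretical unconditional variance sigma^2 / (1 - gamma^2)
set.seed(42)
T <- 100000
gamma <- 0.8
v <- numeric(T + 1)            # v[1] = v_0 = 0
w <- rnorm(T + 1)              # innovations with sigma^2 = 1
for (t in 2:(T + 1)) v[t] <- gamma * v[t - 1] + w[t]
var(v)                         # close to the theoretical value
1 / (1 - gamma^2)              # theoretical value, about 2.78
```

The longer the simulated path, the closer the two numbers will typically be.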
How would you have to change the code to simulate this AR(2) process:
\(v_t = 10 + 0.5 v_{t-1} + 0.4 v_{t-2} + w_t\)?
This requires a change in some parameters but also the addition of a second lag inside the loop.
# Simulation parameters
T = 1000
alpha = 10
gamma1 = 0.5
gamma2 = 0.4
v0 = 0
# Draw random terms, we draw T+2 to make life easier below
w = rnorm(T+2, 0, 1)
# store the simulated series here, T+2 elements as the first two are v0
v = matrix(v0,T+2,1)
for (t in seq(3,T+2)){
v[t] = alpha + gamma1 * v[t-1] + gamma2 * v[t-2] + w[t]
}
Your series should look something like this.
plot(v, type = "l")
Is the choice of starting value \(v_0=0\) still appropriate?
You can see that the process starts at 0 but then moves up to fluctuate around the value of 100. The reason for that is that for a stationary AR(2) process the unconditional mean of the process is \(E[v_t]=\frac{\alpha}{1-\gamma_1-\gamma_2}\), which, with our parameter values, is 100. A better starting value would be v0 = alpha/(1-gamma1-gamma2). Alternatively you could still start with v0 = 0 but simulate, say, 1200 observations and then discard the first 200 observations.
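The burn-in idea just described can be sketched as follows (the object names, such as burn, are our own choices):

```r
# Start at v0 = 0, simulate 1200 observations and discard the first 200
# so that the retained part of the series has "forgotten" the starting value
set.seed(1)
Tkeep <- 1000                   # observations we want to keep
burn  <- 200                    # burn-in observations to discard
Ttotal <- Tkeep + burn
alpha <- 10; gamma1 <- 0.5; gamma2 <- 0.4
v <- numeric(Ttotal + 2)        # first two elements are starting values (0)
w <- rnorm(Ttotal + 2)
for (t in 3:(Ttotal + 2)) {
  v[t] <- alpha + gamma1 * v[t - 1] + gamma2 * v[t - 2] + w[t]
}
v <- v[-(1:(burn + 2))]         # drop starting values and burn-in draws
mean(v)                         # close to alpha/(1-gamma1-gamma2) = 100
```

After the burn-in is removed, the retained series fluctuates around its unconditional mean from the first observation onwards.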
Previously we used a value of \(\alpha = 0\), which meant that the unconditional mean was also equal to 0.
Also note that the loop now starts with t=3 as we need two starting values.
arima.sim
The arima.sim function (part of the stats package) makes simulating such models very straightforward, without you having to write any loops.
v.AR1 <- arima.sim(n = 1000, list(ar = c(0.8),ma = c()), sd = sqrt(1))
plot(v.AR1, type = "l")
Note that list(ar = c(0.8), ma = c()) sets an AR(1) model parameter of \(\gamma_1=0.8\). There is provision for also allowing for moving average processes, but we leave that empty (ma = c()) as we are not using MA components here.
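As a cross-check (our own addition), the output of arima.sim should have the same autocorrelation structure as the hand-written loop: for an AR(1) process the autocorrelation at lag \(k\) is \(\gamma_1^k\). The following sketch compares the sample autocorrelations of a long simulated series with these theoretical values:

```r
# Compare empirical autocorrelations of an arima.sim AR(1) series
# with the theoretical values gamma_1^k
set.seed(7)
v.AR1 <- arima.sim(n = 10000, list(ar = c(0.8)), sd = 1)
emp  <- acf(v.AR1, lag.max = 5, plot = FALSE)$acf[2:6]  # lags 1 to 5
theo <- 0.8^(1:5)
round(cbind(empirical = emp, theoretical = theo), 3)
```

The two columns should agree closely, with the agreement improving as n grows.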
The AR(2) process we simulated earlier is now simulated as follows:
v.AR2 <- arima.sim(n = 1000, mean = 10, list(ar = c(0.5,0.4),ma = c()), sd = sqrt(1))
plot(v.AR2, type = "l")
Here the input mean = 10 sets \(\alpha = 10\) and ar = c(0.5,0.4) sets \(\gamma_1 = 0.5\) and \(\gamma_2 = 0.4\). (Strictly speaking, mean = 10 is passed on to the random number generator rnorm, shifting the innovations by 10, which is equivalent to an intercept of \(\alpha = 10\).)
Also note that the function does automatically take care of the starting value problem we encountered when we simulated the series "by hand".
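As a quick check of this (our own addition), the sample mean of a long simulated series should be close to \(\alpha/(1-\gamma_1-\gamma_2) = 10/(1-0.5-0.4) = 100\), without any visible start-up transient:

```r
# The mean = 10 input shifts the innovations, so the simulated AR(2)
# series should fluctuate around alpha/(1 - gamma1 - gamma2) = 100
set.seed(99)
v.AR2 <- arima.sim(n = 10000, mean = 10, list(ar = c(0.5, 0.4)), sd = 1)
mean(v.AR2)   # close to 100
```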
Check the function's help page (?arima.sim) to see what type of processes can be simulated.