# Vector autoregression analysis

## Introduction

The vector autoregression analysis estimates the linear dependencies among a number of time series. The analysis can produce fitted values and forecasts for those series.

In addition to estimating a given system, you can also automatically test different models and let the analysis pick the best one based on an information criterion.

There is a report that contains the estimated parameters of the system as well as a number of statistics that can be used to assess the system's validity and stability.

The estimation is made using all common valid observations for the model series in the selected estimation sample.

## Estimation model

A vector autoregression can be thought of as a system of linear regressions, but the emphasis is on using lagged values of the dependent variables to model a set of variables. There is an equation for each variable that explains its evolution based on its own lags and the lags of other variables in the model.

The dependent variables are called *endogenous* variables. There may also be *exogenous* variables. Such variables are only explanatory and are not modeled in the system.

A model may be denoted as being of order *p*, called VAR(p), containing *K* endogenous variables. If there are 2 variables in a VAR(1) model, the system of equations can be written as:

${y}_{t}=v+{A}_{1}{y}_{t-1}+{u}_{t}$

The expression can be written in expanded form as:

$\left(\begin{array}{c}{y}_{1\text{,}t}\\ {y}_{2\text{,}t}\end{array}\right)=\left(\begin{array}{c}{v}_{1}\\ {v}_{2}\end{array}\right)+\left(\begin{array}{cc}{a}_{11}& {a}_{12}\\ {a}_{21}& {a}_{22}\end{array}\right)\left(\begin{array}{c}{y}_{1\text{,}t-1}\\ {y}_{2\text{,}t-1}\end{array}\right)+\left(\begin{array}{c}{u}_{1\text{,}t}\\ {u}_{2\text{,}t}\end{array}\right)$

The equations can thus be explicitly written as:

${y}_{1\text{,}t}={v}_{1}+{a}_{11}{y}_{1\text{,}t-1}+{a}_{12}{y}_{2\text{,}t-1}+{u}_{1\text{,}t}$

${y}_{2\text{,}t}={v}_{2}+{a}_{21}{y}_{1\text{,}t-1}+{a}_{22}{y}_{2\text{,}t-1}+{u}_{2\text{,}t}$

The present value of *y* depends on an *intercept v*, the lagged values of itself and the other variable, and an error term *u*. Each error term is assumed to be uncorrelated with all lags of itself and with the lags of the other error terms.
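As an illustration of the system above, a two-variable VAR(1) can be simulated and recovered with ordinary least squares. This is a minimal sketch with made-up coefficients, not the Macrobond implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stable VAR(1): y_t = v + A @ y_{t-1} + u_t
v = np.array([0.1, -0.2])
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])          # both eigenvalues inside the unit circle
T = 5000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = v + A @ y[t - 1] + rng.normal(scale=0.1, size=2)

# OLS equation by equation: regress y_t on [1, y_{t-1}]
X = np.column_stack([np.ones(T - 1), y[:-1]])
coefs, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
v_hat, A_hat = coefs[0], coefs[1:].T   # recovered intercepts and lag matrix

print(np.round(A_hat, 2))
```

With this sample size the estimates land close to the coefficients used to generate the data.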

An arbitrary number of successive forecasts can be calculated; you must specify an end date for the forecast calculation.
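The forecast recursion can be sketched for an already estimated VAR(1) with hypothetical coefficients: each forecast is produced by feeding the previous one back into the system, with the error terms set to their expected value of zero.

```python
import numpy as np

# Hypothetical estimated VAR(1) coefficients (illustration only)
v = np.array([0.1, -0.2])
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
y_hat = np.array([1.0, 0.5])     # last observed values of the two series

# Each step feeds the previous forecast back into the system;
# the errors enter at their expected value, zero.
forecasts = []
for _ in range(12):
    y_hat = v + A @ y_hat
    forecasts.append(y_hat)

# A stable system converges toward its unconditional mean (I - A)^-1 v
mean = np.linalg.solve(np.eye(2) - A, v)
print(np.round(forecasts[-1], 3))
```

Because this example system is stable, successive forecasts approach the unconditional mean.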

When a system contains exogenous variables, assume that these are included in the vector *x* together with their lags, possibly including lag 0 (contemporaneous variables), so that *x* contains *s* elements. The system of equations for a model called VARX(p, s) can then be written as:

${y}_{t}=v+{A}_{1}{y}_{t-1}+\dots +{A}_{p}{y}_{t-p}+B{x}_{t}+{u}_{t}$

When there are exogenous variables, forecasts can only be calculated as long as there is data available for all the exogenous variables. You might want to add forecasts to these variables before they are passed on to the VAR analysis.

For a *symmetric system*, where each equation contains the same explanatory variables and lags, OLS (ordinary least squares) is used as the estimation method. For *asymmetric systems*, GLS (generalized least squares) is used, which requires an iterative procedure. This is more computationally intensive, and the system might not converge fast enough to find a solution for large systems.

In order to examine a VAR system, an Impulse Response (IR) can be calculated between two given variables. While an econometrician may assume a system with several variables describing some economic relationship, it can still be interesting to isolate two of them and look at their particular dynamics, in one particular direction. IR calculates the response of one variable to an impulse in another, some periods later in time. IRs have also been called *dynamic multipliers*, because the simplest way to compute them is to multiply the reduced-form VAR matrix by itself *i* times for a horizon *i*. The effects of the past values of all coefficients in the system are used in the calculation, but we only look at the accumulated effect that one variable has on another. One way of understanding this is that by substituting the errors with a one where we investigate the impulse, and zeros everywhere else, the econometrician can trace this *unit shock* to the given variable at each time period. IR calculation only makes sense between endogenous variables.

The Macrobond application presently allows only unit residual IR. A popular approach to IR has been Cholesky orthogonalization, where the model is first transformed by multiplying it with the Cholesky factor of the residual covariance matrix, so that responses to orthogonal impulses are obtained. This respects the idea of isolated effects, but it also requires the residuals to have finite variance (Hannsgen, 2010). Instead, the Macrobond user can investigate the residuals prior to interpreting the IR results. For simplicity, the IR is represented as a time series, with values starting at the forecast date, even though in reality the values are both unitless and dateless.

(Support for calculating IR was added in version 1.17)
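The dynamic-multiplier view of IR described above can be sketched for a hypothetical VAR(1): the response *i* periods after the impulse is read off the *i*-th power of the coefficient matrix.

```python
import numpy as np

# Hypothetical reduced-form VAR(1) coefficient matrix (illustration only)
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])

horizon = 10
responses = []
Ai = np.eye(2)                   # A^0: the impact period
for _ in range(horizon + 1):
    # response of variable 1 to a unit shock in variable 2, i periods later
    responses.append(Ai[0, 1])
    Ai = Ai @ A

print([round(r, 4) for r in responses[:4]])   # → [0.0, 0.1, 0.08, 0.051]
```

For a stable system the responses die out as the horizon grows, which is the expected behavior of a unit shock.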

###### VECM

The VAR analysis also allows for modelling of cointegrated variables. Under this assumption, the variables in differenced form are explained by vectors in level form, in addition to the usual VAR terms. The rationale for this is that some variables co-move in the long run by the force of some linear process, while having other dynamics in the short run. This is often illustrated by the “drunk and his dog” example: the drunk is walking home with great difficulty and often gets lost, but eventually makes it home. The dog runs around, but only so far away from its owner, and eventually they both make it home. So in the long run the two always move together, even though the walk is quite random and unrelated in the short run. Economic interpretations are plentiful and allow for many more relations at the same time: long and short run interest rates, consumption and price level, consumption and investment, etc.
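The drunk-and-dog intuition can be sketched numerically. This is a toy example with synthetic data, not the Macrobond estimator: a random walk *x* and a series *y* tied to it are cointegrated, and the change in *y* loads negatively on the lagged disequilibrium.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy cointegrated pair: x is a random walk (the drunk),
# y follows x with stationary noise (the dog).
T = 4000
x = np.cumsum(rng.normal(size=T))
y = x + rng.normal(scale=0.5, size=T)   # y - x is stationary

# Error-correction regression: the change in y responds to the
# lagged disequilibrium (y - x), pulling y back toward x.
dy = np.diff(y)
ecm = (y - x)[:-1]
alpha = np.polyfit(ecm, dy, 1)[0]       # loading on the error-correction term
print(round(alpha, 2))
```

The estimated loading is clearly negative: when the dog is ahead of the drunk, its next step tends to close the gap.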

The whole system is specified in *Granger representation* as:

$\Delta {y}_{t}=\Pi {y}_{t-1}+{\Gamma}_{1}\Delta {y}_{t-1}+\dots +{\Gamma}_{p-1}\Delta {y}_{t-p+1}+{u}_{t}$

where $\Pi =\alpha {\beta}^{\text{T}}$ is the low rank matrix in which the *cointegrating relations β* are loaded onto each equation by *α*. The common dimension *r* of these matrices is the cointegrating rank of the system. The “VAR part” contains the short run variables, and its coefficients ${\Gamma}_{\text{i}}$ are known as the short run coefficients. This allows the model to capture some non-linearity that the regular VAR would miss.

The vectors $\Pi {y}_{t-1}$ are the *error correcting vectors*. They are not errors in the sense of residuals; rather, they refer to the long run effects compensating for what is not captured in the short run. If the rank is zero, so that $\Pi =0$, the model theoretically reduces to a VAR in change form, i.e. all variables differenced once. If $\Pi $ has full rank, the system reduces to a VAR in levels form. (Note that neither of these cases tells us anything about the stability of the dynamic system that the model represents.) Since these are not proper VEC models, the Macrobond application does not allow them and throws an exception if you try to estimate such a model (this includes the cases of automatic rank identification). Getting the VEC model right can hence be cumbersome, and it is not necessarily superior to a regular VAR.

The rank statistics are determined by MHM (MacKinnon-Haug-Michelis, 1999) critical values. Each rank level of the matrix $\Pi $ is tested. There are two options:

- The trace test at level *r* tests the null hypothesis ${\text{H}}_{0}\text{:}\text{rank}=\text{r}$ against ${\text{H}}_{1}\text{:}\text{rank}=\text{k}$, where *k* is the number of endogenous variables.
- The maximum eigenvalue test at level *r* tests the null hypothesis ${\text{H}}_{0}\text{:}\text{rank}=\text{r}$ against ${\text{H}}_{1}\text{:rank}=\text{r}+1$.

With the Macrobond application two approaches to VECM can be used: Johansen and Ahn-Rensel-Saikkonen. Johansen's (1986) approach is solved in a maximum-likelihood scheme. It starts by identifying the cointegrating vectors, which are then subtracted from the original dependent variables. This reduces the system to a regular VAR, which is solved using OLS. It is the best known and perhaps most widely used VEC model. Since the rank reduction of the matrix $\Pi $ is done by means of eigendecomposition, this model may be viewed as employing noise reduction to clean up $\Pi $.

Ahn-Rensel-Saikkonen is based on least squares regression. It starts by estimating the whole system so that $\Pi $ is full rank. Afterwards the proper rank reduction is made, decomposing $\Pi $ to a lower rank, but without changing the short run coefficients ${\Gamma}_{\text{i}}$. It is suggested in Brüggemann & Lütkepohl (2004) that this approach is more robust in small samples than Johansen's.

(Support for calculating VECM was added in version 1.18)

## Settings

#### Estimation sample range

Specify the limits of the estimation sample range. The default range will be the largest range where there is data for all the series.

#### Output residuals

When this option is selected a time series containing the residuals will be calculated.

#### Output the endogenous series

Select this option in order to include the endogenous series in the output.

#### Output the exogenous series

Select this option in order to include the exogenous series in the output.

#### Calculate impulse response

Select this option in order to calculate the impulse response of the specified length. Select the equation in which the impulse should be applied and which variable to observe.

(Available in version 1.17 and later.)

#### Confidence band

Confidence bands for the forecasts of each equation are computed using the VAR estimator covariance matrix. Since the VAR is ideally a stable linear dynamic system, the forecasted values are generated dynamically, which means that they converge toward some mean (zero if normalized). The error bands must therefore also converge to constant values, an upper and a lower bound respectively. Because little is known about the small-sample properties of the Feasible Generalized Least Squares (FGLS) estimator used in the VAR, only asymptotic errors are computed. This makes the estimated error terms less reliable for estimations from short time series. It can be shown that the estimator variance of the FGLS is lower than, or at least equal to, that of standard OLS.

(Added in version 1.16)
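The convergence of the bands can be sketched for a hypothetical stable VAR(1): the *h*-step forecast error covariance is the sum of the terms $A^i \Sigma_u (A^i)^{\text{T}}$ for $i = 0, \dots, h-1$, which levels off as *h* grows. The coefficients below are made up for illustration.

```python
import numpy as np

# Hypothetical stable VAR(1) and residual covariance (illustration only)
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
sigma_u = 0.04 * np.eye(2)

def forecast_cov(h):
    """h-step forecast error covariance: sum of A^i Sigma_u (A^i)^T."""
    cov = np.zeros((2, 2))
    Ai = np.eye(2)
    for _ in range(h):
        cov += Ai @ sigma_u @ Ai.T
        Ai = Ai @ A
    return cov

# 95% bands widen with the horizon but level off at constant bounds
band_5 = 1.96 * np.sqrt(np.diag(forecast_cov(5)))
band_50 = 1.96 * np.sqrt(np.diag(forecast_cov(50)))
band_60 = 1.96 * np.sqrt(np.diag(forecast_cov(60)))
print(np.round(band_50, 4), np.round(band_60, 4))
```

By step 50 the bands have effectively converged: adding ten more steps changes them by a negligible amount.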

#### Autocorrelation test lags

Select this option in order to include a Portmanteau autocorrelation test in the report. Specify the number of lags to include. The number of lags should be larger than the highest number of lags of the endogenous or exogenous variables.
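A sketch of a multivariate Portmanteau statistic with a Ljung-Box-style small-sample adjustment, computed on synthetic white-noise residuals. This illustrates the idea and is not necessarily the exact statistic used in the report:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic white-noise "residuals" standing in for a fitted system's
T, K, h = 1000, 2, 8
U = rng.normal(size=(T, K))
C0inv = np.linalg.inv(U.T @ U / T)        # inverse lag-0 covariance

# Sum the scaled lag-i autocovariance terms over h lags; under the
# null of no residual autocorrelation the statistic is approximately
# chi-squared distributed, so large values signal autocorrelation.
Q = 0.0
for i in range(1, h + 1):
    Ci = U[i:].T @ U[:-i] / T             # lag-i autocovariance
    Q += np.trace(Ci.T @ C0inv @ Ci @ C0inv) / (T - i)
Q *= T * T
print(round(Q, 1))
```

For white-noise residuals the statistic stays near the mean of the reference chi-squared distribution.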

#### Max endogenous lags

Specify the maximum number of lags to include for the endogenous variables. You can further refine which lags to include in the model on the “Lag settings for endogenous variables in the equations” tab.

#### Max exogenous lags

Specify the maximum number of lags to include for the exogenous variables. You can further refine which lags to include in the model on the “Lag settings for exogenous variables in the equations” tab.

#### Find best model based on max endogenous lags for information criteria

Select this option in order to let the system automatically test which combination of symmetric lags is optimal based on the selected information criterion.

You can select the minimum and maximum number of lags of the endogenous variables to test and also the minimum and maximum of different lags (regressors) to include in each round of tests.

Select the setting “Require stable process” in order to disqualify any model where the roots of the characteristic equation indicate that the model is not stable.
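The search over symmetric lag orders can be sketched as follows. This is a simplified illustration with synthetic data; the Macrobond implementation may differ. Each candidate VAR(p) is fitted by OLS and the lag order with the smallest information criterion, here AIC, is kept.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data generated by a VAR(2); candidate lag orders p = 1..4
T, K = 1500, 2
A1 = np.array([[0.4, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.1, 0.2]])
y = np.zeros((T, K))
for t in range(2, T):
    y[t] = A1 @ y[t - 1] + A2 @ y[t - 2] + rng.normal(scale=0.1, size=K)

def aic(p):
    """Fit a symmetric VAR(p) by OLS and return its AIC."""
    n = T - p
    X = np.column_stack([np.ones(n)] +
                        [y[p - i - 1:T - i - 1] for i in range(p)])
    B, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    resid = y[p:] - X @ B
    sigma = resid.T @ resid / n
    n_params = p * K * K + K              # lag coefficients + intercepts
    return np.log(np.linalg.det(sigma)) + 2 * n_params / n

best_p = min(range(1, 5), key=aic)
print(best_p)
```

Dropping the second lag noticeably inflates the residual covariance, so the criterion does not select VAR(1) here.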

#### Type

Select if a series should be included in the model as an endogenous variable, an exogenous variable, or not at all.

#### Diff

Use the first-order differences of the endogenous variable in the model. The output will still be in the form of levels even if this option is selected.

This option is not available for VECM.

(Available in version 1.17 and later.)

#### Intercept

Select if the intercept should be included in the model for endogenous variables.

This option is not available per variable for VECM.

#### Restrict to CE

In VECM, both trend and intercept can be restricted to the cointegrating relations. This means that they are treated as deterministic variables within $\Pi $, in level form. They occur either in level or in change form, never both.

The variables for intercept and trend are added by settings in the VECM configuration box.

#### Equation name

Optionally specify the name of the equation to be used in the report.

#### Variable name

Optionally specify the name of the variable to be used in the report.

#### VECM

Enable VECM.

#### Configuration

Select the method: Johansen or Ahn-Rensel-Saikkonen.

Include intercept: adds an intercept variable.

Include linear trend: adds a trend variable.

#### Automatic cointegration test

Select whether to automatically find best cointegration rank or to enter it manually.

The settings for automatic rank selection are described in the section Estimation model.

## Bibliography

Hannsgen, G. *Infinite-variance, Alpha-stable Shocks in Monetary SVAR.* Levy Economics Institute of Bard College, 2010

Lütkepohl, H. *New Introduction to Multiple Time Series Analysis.* Berlin: Springer, 2005

Lütkepohl, H. & Saikkonen, P. *Maximum Eigenvalue Versus Trace Tests for the Cointegration Rank of a VAR Process.* Berlin: Humboldt University, 2000

Johansen, S. *Likelihood-Based Inference in Cointegrated Vector Autoregressive Models.* Oxford: Oxford University Press, 1995

MacKinnon, J. G., Haug, A. A. & Michelis, L. *Numerical Distribution Functions of Likelihood Ratio Tests for Cointegration,* 1999