Tuesday, June 21, 2011

CHAPTER 22: TIME SERIES ECONOMETRICS: FORECASTING

2 Methods of Forecasting
1.      Autoregressive integrated moving average (ARIMA), popularly known as the Box-Jenkins methodology
To forecast the values of a time series, the basic Box-Jenkins strategy is as follows:
a)      First examine the series for stationarity.
b)      If the time series is not stationary, difference it one or more times to achieve stationarity.
c)      The ACF and PACF of the stationary time series are then computed to find out if the series is purely autoregressive or purely of the moving average type or a mixture of the two.
d)      The tentative model is then estimated.
e)      The residuals from this tentative model are examined to find out if they are white noise.
f)       The model finally selected can be used for forecasting.
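The strategy above can be sketched in plain NumPy (the simulated series, seed, and `acf` helper are illustrative, not data from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# A nonstationary series: a random walk (step a would find a unit root).
e = rng.normal(size=500)
y = np.cumsum(e)

# Step (b): difference once to achieve stationarity.
dy = np.diff(y)

# Step (c): sample autocorrelations of the stationary (differenced) series.
def acf(x, k):
    x = x - x.mean()
    return np.sum(x[k:] * x[:len(x) - k]) / np.sum(x * x)

# The differenced series is white noise here, so its ACF dies out at once;
# the pattern of the ACF/PACF would guide the tentative AR/MA/ARMA choice.
rho1 = acf(dy, 1)
```

In practice steps (d)–(f) would be done with a fitted ARIMA model (e.g. via statsmodels) and a white-noise check on its residuals.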
2.      Vector autoregression (VAR)
The VAR approach to forecasting considers several time series at a time. The distinguishing features of VAR are as follows:
a)      It is a truly simultaneous system in that all variables are regarded as endogenous.
b)      In VAR modeling, the value of a variable is expressed as the linear function of the past, or lagged, values of that variable and all other variables included in the model.
c)      If each equation contains the same number of lagged variables in the system, it can be estimated by OLS without resorting to any systems methods.
d)      The simplicity of VAR modeling may be its drawback.
e)      If there are several lags in each equation, it is not always easy to interpret each coefficient, especially if the signs of the coefficients alternate.
f)       There is considerable debate and controversy about the superiority of the various forecasting methods.
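Feature (c) above — equation-by-equation OLS — can be illustrated with a simulated bivariate VAR(1); the coefficient matrix and sample size are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# True bivariate VAR(1): y_t = A y_{t-1} + e_t (A chosen to be stable).
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
T = 4000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A @ y[t - 1] + rng.normal(scale=0.5, size=2)

# Each equation regresses one variable on the lagged values of ALL
# variables, so plain OLS equation by equation suffices.
X = y[:-1]                              # same lagged regressors in each equation
A_hat = np.vstack([np.linalg.lstsq(X, y[1:, i], rcond=None)[0]
                   for i in range(2)])  # row i estimates row i of A
```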
✓  Approaches to Economic Forecasting
1.      Exponential smoothing methods
2.      Single-equation regression models
3.      Simultaneous-equation regression models
4.      Autoregressive integrated moving average models (ARIMA)
5.      Vector autoregression
✓  Measuring Volatility in Financial Time Series: The ARCH and GARCH Models
Volatility clustering – periods in which a series exhibits wide swings for an extended time followed by periods of comparative tranquility.
·         Autoregressive conditional heteroscedasticity (ARCH)
·         Generalized autoregressive conditional heteroscedasticity (GARCH)
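Volatility clustering shows up as serial correlation in *squared* returns even when the returns themselves are uncorrelated. A minimal ARCH(1) simulation (parameters chosen only to make the effect visible):

```python
import numpy as np

rng = np.random.default_rng(2)

# ARCH(1): r_t = sqrt(h_t) * z_t,  h_t = a0 + a1 * r_{t-1}^2
a0, a1 = 0.2, 0.5
T = 3000
r = np.zeros(T)
h = np.zeros(T)
h[0] = a0 / (1 - a1)                 # unconditional variance
for t in range(1, T):
    h[t] = a0 + a1 * r[t - 1] ** 2   # today's variance depends on yesterday's shock
    r[t] = np.sqrt(h[t]) * rng.normal()

def acf1(x):
    x = x - x.mean()
    return np.sum(x[1:] * x[:-1]) / np.sum(x * x)

# Returns are serially uncorrelated, squared returns are not: clustering.
```

A GARCH(1,1) adds the lagged conditional variance itself (h_{t-1}) to the recursion.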




CHAPTER 21: TIME SERIES ECONOMETRICS: SOME BASIC CONCEPTS

✓  Key Concepts
1.      Stochastic processes
2.      Stationary processes
3.      Purely random processes
4.      Nonstationary processes
5.      Integrated variables
6.      Random walk models
7.      Cointegration
8.      Deterministic and stochastic trends
9.      Unit root tests
✓  Stochastic Processes
A random or stochastic process is a collection of random variables ordered in time.
·         Stationary Stochastic Processes
A stochastic process is said to be stationary if its mean and variance are constant over time and the value of the covariance between the two time periods depends only on the distance or gap or lag between the two time periods and not the actual time at which the covariance is computed.
·         Nonstationary Stochastic Processes
2 Types of Random Walks
1.      Random walk without drift (no constant/intercept term)
2.      Random walk with drift (constant term is present)
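The two types can be simulated side by side (the drift value 0.5 and the seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
e = rng.normal(size=T)

walk = np.cumsum(e)              # random walk without drift: Y_t = Y_{t-1} + u_t
walk_drift = np.cumsum(0.5 + e)  # random walk with drift:    Y_t = 0.5 + Y_{t-1} + u_t
```

The driftless walk wanders around zero with ever-growing variance, while the drift version trends upward at roughly 0.5 per period on top of the wandering.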
✓  Trend Stationary (TS) and Difference Stationary (DS) Stochastic Processes
A TS time series has a deterministic trend, whereas a DS time series has a variable, or stochastic, trend.
✓  Integrated Stochastic Processes
Properties of Integrated Series:
1.      If Xt ~ I(0) and Yt ~ I(1), then Zt = (Xt + Yt) ~ I(1); that is, a linear combination or sum of stationary and nonstationary time series is nonstationary.
2.      If Xt ~ I(d), then Zt = (a + bXt) ~ I(d), where a and b are constants.
3.      If Xt ~ I(d1) and Yt ~ I(d2), then Zt = (aXt + bYt) ~ I(d2), where d1 < d2.
4.      If Xt ~ I(d) and Yt ~ I(d), then Zt = (aXt + bYt) ~ I(d*); d* is generally equal to d, but in some cases d* < d (the cointegrated case).
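Properties 1 can be checked numerically: adding an I(0) series to an I(1) series leaves a stochastic trend that one differencing removes (series and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=1000)               # X_t ~ I(0): white noise
y = np.cumsum(rng.normal(size=1000))    # Y_t ~ I(1): random walk

z = x + y        # property 1: the sum is I(1), dominated by the trend in y
dz = np.diff(z)  # one difference removes the stochastic trend: dz is stationary
```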
✓  The Phenomenon of Spurious Regression
Regressing one time series variable on one or more other time series variables often gives nonsensical or spurious results.
✓  Tests of Stationarity
1.      Graphical Analysis
2.      Autocorrelation Function (ACF) and Correlogram

ρk = γk / γ0 = (covariance at lag k) / (variance)
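The sample version of this ratio is straightforward to compute (a minimal sketch; the function name is ours):

```python
import numpy as np

def rho(x, k):
    """Sample autocorrelation at lag k: covariance at lag k over variance."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    gamma_k = np.sum((x[k:] - xbar) * (x[:len(x) - k] - xbar)) / len(x)
    gamma_0 = np.sum((x - xbar) ** 2) / len(x)
    return gamma_k / gamma_0
```

Plotting rho(x, k) against k gives the correlogram; for a nonstationary (e.g. trending) series the correlogram starts near 1 and decays very slowly.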
✓  Cointegration: Regression of a Unit Root Time Series on Another Unit Root Time Series
Cointegration means that despite being individually nonstationary, a linear combination of two or more time series can be stationary.
·         Testing for Cointegration
-          Engle-Granger (EG) or Augmented Engle-Granger (AEG) Test
-          Cointegrating Regression Durbin-Watson (CRDW) Test
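The first step of the EG test can be sketched with a simulated cointegrated pair sharing one stochastic trend (all series and the seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2000
trend = np.cumsum(rng.normal(size=T))   # shared stochastic trend

x = trend + rng.normal(size=T)          # both series are I(1)...
y = 2.0 * trend + rng.normal(size=T)    # ...but driven by the same trend

# EG step 1: OLS of y on x. Under cointegration the residuals are
# stationary even though x and y individually are not.
X = np.column_stack([np.ones(T), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
```

Step 2 of the EG/AEG test would apply a unit root (Dickey–Fuller) test to `resid`; here we only note that the residuals stay bounded while y wanders.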
·         Cointegration and Error Correction Mechanism (ECM)
Error Correction Mechanism (ECM) – developed by Engle and Granger, it is a means of reconciling the short-run behavior of an economic variable with its long-run behavior.

CHAPTER 20: SIMULTANEOUS-EQUATION METHODS

✓  Approaches to Estimation
·         Single-equation methods/limited information methods
·         Systems methods/full information methods
Single-equation methods:
1.      Ordinary Least Squares (OLS)
2.      Indirect Least Squares (ILS)
3.      Two-Stage Least Squares (2SLS)
✓  Recursive Models and Ordinary Least Squares
Zero contemporaneous correlation – same-period disturbances in different equations are uncorrelated.
✓  Estimation of a Just Identified Equation: The Method of Indirect Least Squares (ILS)
Step 1: We first obtain the reduced-form equations.
Step 2: We apply OLS to the reduced-form equations individually.
Step 3: We obtain estimates of the original structural coefficients from the estimated reduced-form coefficients obtained in Step 2.
✓  Estimation of an Overidentified Equation: The Method of Two-Stage Least Squares (2SLS)
Stage 1: To get rid of the likely correlation between Y1 and u2, regress first Y1 on all the predetermined variables in the whole system, not just that equation.
Stage 2: The overidentified money supply equation can now be written as:

Y2t = β20 + β21Ŷ1t + u*t, where Ŷ1t is the fitted value of Y1t from Stage 1.
Features of 2SLS:
1.      It can be applied to an individual equation in the system without directly taking into account any other equation(s) in the system.
2.      Unlike ILS, which provides multiple estimates of parameters in the overidentified equations, 2SLS provides only one estimate per parameter.
3.      It is easy to apply because all one needs to know is the total number of exogenous or predetermined variables in the system without knowing any other variables in the system.
4.      Although specially designed to handle overidentified equations, the method can also be applied to exactly identified equations.
5.      If the R2 values in the reduced-form regressions are very high, the classical OLS estimates and 2SLS estimates will be very close.
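The two stages can be sketched with NumPy on simulated data (the variable names, coefficients, and seed are illustrative, not Gujarati's money supply example):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000
x1, x2 = rng.normal(size=n), rng.normal(size=n)  # predetermined variables
u = rng.normal(size=n)                           # structural disturbance
y1 = x1 + x2 + u + rng.normal(size=n)            # endogenous regressor, correlated with u
y2 = 1.0 + 0.5 * y1 + u                          # structural equation of interest

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Stage 1: regress y1 on ALL predetermined variables in the system.
Z = np.column_stack([np.ones(n), x1, x2])
y1_hat = Z @ ols(Z, y1)

# Stage 2: replace y1 by its fitted value, then apply OLS.
b_2sls = ols(np.column_stack([np.ones(n), y1_hat]), y2)

# For comparison: naive OLS on the structural equation is biased upward
# because y1 is correlated with u.
b_ols = ols(np.column_stack([np.ones(n), y1]), y2)
```

With these parameters the naive OLS slope is pulled toward 0.75 while 2SLS recovers the true 0.5.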







CHAPTER 19: THE IDENTIFICATION PROBLEM

✓  Notations and Definitions
2 Types of Variables in a Simultaneous-Equation Model
Endogenous – those determined within the model
Predetermined – those determined outside the model
2 Categories of Predetermined Variables
Exogenous
Lagged Endogenous
A reduced-form equation is one that expresses an endogenous variable solely in terms of the predetermined variables and the stochastic disturbances.
✓  The Identification Problem
-          Whether numerical estimates of the parameters of a structural equation can be obtained from the estimated reduced-form coefficients.
-          May be either exactly identified or overidentified.
Exactly identified if unique numerical values of the structural parameters can be obtained.
Overidentified if more than one numerical value can be obtained for some of the parameters of the structural equations.
✓  Rules for Identification
The so-called order and rank conditions of identification lighten the task by providing a systematic routine.
✓  The Order Condition of Identifiability
Definition 19.1 In a model of M simultaneous equations, in order for an equation to be identified, it must exclude at least M – 1 variables appearing in the model. If it excludes exactly M – 1 variables, the equation is just identified. If it excludes more than M – 1 variables, it is overidentified.
Definition 19.2 In a model of M simultaneous equations, in order for an equation to be identified, the number of predetermined variables excluded from the equation must not be less than the number of endogenous variables included in that equation less 1, that is,

K – k ≥ m – 1
If K – k = m – 1, the equation is just identified, but if K – k > m – 1, it is overidentified.
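The counting rule of Definition 19.2 is mechanical enough to encode directly (a sketch; the function name is ours):

```python
def order_condition(K, k, m):
    """Order condition for identifiability.

    K: predetermined variables in the whole model
    k: predetermined variables included in the equation
    m: endogenous variables included in the equation
    """
    excluded = K - k
    if excluded < m - 1:
        return "underidentified"
    if excluded == m - 1:
        return "just identified"
    return "overidentified"
```

For example, an equation including 2 endogenous variables (m = 2) in a model with K = 3 predetermined variables of which it uses k = 1 excludes two, so it is overidentified.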
✓  The Rank Condition of Identifiability
In a model containing M endogenous variables, an equation is identified if and only if at least one nonzero determinant of order (M – 1)(M – 1) can be constructed from the coefficients of the variables excluded from that particular equation but included in the other equations of the model. The rank condition tells us whether the equation is identified; the order condition tells us whether it is just identified or overidentified.
✓  Tests of Simultaneity
Hausman Specification Test.
Step 1: Regress Pt on It and Rt to obtain the residuals vt.
Step 2: Regress Qt on Pt and vt and perform a t test on the coefficient of vt. If it is significant, do not reject the hypothesis of simultaneity; otherwise, reject it.
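The two steps can be simulated to see the test fire when simultaneity is present (variable names echo the P, Q, I, R above; the data-generating process and seed are our own illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
I, R = rng.normal(size=n), rng.normal(size=n)  # predetermined variables
u = rng.normal(size=n)
P = 1.0 + I + R + u + rng.normal(size=n)       # P correlated with u: simultaneity
Q = 2.0 + 0.5 * P + u                          # quantity equation

def ols_fit(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, resid, se

# Step 1: regress P on I and R, keep the residuals v.
Z = np.column_stack([np.ones(n), I, R])
_, v, _ = ols_fit(Z, P)

# Step 2: regress Q on P and v; a significant t on v signals simultaneity.
X = np.column_stack([np.ones(n), P, v])
b, _, se = ols_fit(X, Q)
t_v = b[2] / se[2]
```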

CHAPTER 18: SIMULTANEOUS-EQUATION MODELS

✓  The Nature of Simultaneous-Equation Models
1.      In contrast to single-equation models, in simultaneous-equation models more than one dependent, or endogenous, variable is involved, necessitating as many equations as the number of endogenous variables.
2.      A unique feature of simultaneous-equation models is that the endogenous variable in one equation may appear as an explanatory variable in another equation of the system.
3.      As a consequence, such an endogenous explanatory variable becomes stochastic and is usually correlated with the disturbance term of the equation in which it appears as an explanatory variable.


CHAPTER 17: DYNAMIC ECONOMETRIC MODELS: AUTOREGRESSIVE AND DISTRIBUTED-LAG MODELS

2 Types of Lagged Models
Distributed-Lag Model – the regression model includes not only the current but also the lagged values of the explanatory variables.
Autoregressive Model – the model includes one or more lagged values of the dependent variable among its explanatory variables.
✓  The Reasons for Lags
1.      Psychological Reasons
2.      Technological Reasons
3.      Institutional Reasons
✓  Estimation of Distributed-Lag Models
·         Ad Hoc Estimation of Distributed-Lag Models
·         A Priori Restrictions on the β’s
✓  The Koyck Approach to Distributed-Lag Models
Features of the Koyck Transformation:
1.      We started with a distributed-lag model but ended up with an autoregressive model because Yt – 1 appears as one of the explanatory variables.
2.      The appearance of Yt – 1 is likely to create some statistical problems.
3.      In the original model, the disturbance term was ut, whereas in the transformed model it is vt = (ut – λut – 1).
4.      The presence of lagged Y violates one of the assumptions underlying the Durbin-Watson d Test.

·         The Median Lag

Koyck Model: Median Lag = –log 2 / log λ
·         The Mean Lag

Mean Lag = Σk kβk / Σk βk, with the sums running from k = 0 to ∞
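For the Koyck model with βk = β0·λ^k, the median lag works out to –log 2 / log λ and the mean lag to λ/(1 – λ), which is easy to verify numerically (function names are ours):

```python
import math

def koyck_median_lag(lam):
    """Time required for half of the total effect on Y to be felt."""
    return -math.log(2) / math.log(lam)

def koyck_mean_lag(lam):
    """Lag-weighted average of the beta_k = beta0 * lam**k weights:
    sum(k * lam**k) / sum(lam**k) = lam / (1 - lam)."""
    return lam / (1.0 - lam)
```

For λ = 0.5 both lags equal 1 period; a larger λ means a slower speed of adjustment.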
✓  Rationalization of the Koyck Model: The Adaptive Expectations Model
0 ≤ γ ≤ 1 = coefficient of expectation
X*t – X*t – 1 = γ(Xt – X*t – 1) = the adaptive expectations, progressive expectations, or error-learning hypothesis.
✓  Another Rationalization of the Koyck Model: The Stock Adjustment, or Partial Adjustment, Model
0 ≤ δ ≤ 1 = coefficient of adjustment
Yt – Yt – 1 = δ(Y*t – Yt – 1) = the partial adjustment, or stock adjustment, hypothesis.
✓  Estimation of Autoregressive Models
If an explanatory variable in a regression model is correlated with the stochastic disturbance term, the OLS estimators are not only biased but also not even consistent, that is, even if the sample size is increased indefinitely, the estimators do not approximate their true population values. Therefore, estimation of the Koyck and adaptive expectation models by the usual OLS procedure may yield seriously misleading results.
✓  Detecting Autocorrelation in Autoregressive Models: Durbin h Test
Features of h Statistic:
1.      It does not matter how many X variables or how many lagged values of Y are included in the regression model.
2.      The test is not applicable if [n·var(α̂2)] exceeds 1.
3.      Since the test is a large-sample test, its application in small samples is not strictly justified.
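The h statistic itself is a one-liner, h = ρ̂·sqrt(n / (1 – n·var(α̂2))), including the inapplicability condition from feature 2 (a sketch; the function name is ours):

```python
import math

def durbin_h(rho_hat, n, var_alpha2):
    """Durbin h statistic for autocorrelation with a lagged dependent variable.

    rho_hat:    first-order residual autocorrelation estimate
    n:          sample size
    var_alpha2: estimated variance of the coefficient on the lagged Y
    """
    if n * var_alpha2 >= 1:
        raise ValueError("h test not applicable: n * var(alpha2) >= 1")
    return rho_hat * math.sqrt(n / (1.0 - n * var_alpha2))
```

In large samples h is compared against the standard normal critical values.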






CHAPTER 16: PANEL DATA REGRESSION MODELS

✓  Why Panel Data?
Advantages of Panel Data
1.      They increase the sample size considerably.
2.      By studying repeated cross-section observations, panel data are better suited to study the dynamics of change.
3.      Panel data enable us to study more complicated behavioral models.
Balanced Panel – panel data in which each cross-sectional unit has the same number of time series observations.
Unbalanced Panel – panel data in which the number of observations differs among panel members.
✓  Estimation of Panel Data Regression Models: The Fixed Effects Approach
1.      All coefficients constant across time and individuals.
2.      Slope coefficients constant but the intercept varies across individuals: The fixed effects or least-squares dummy variable (LSDV) regression model.
3.      Slope coefficients constant but the intercept varies over individuals as well as time.
4.      All coefficients vary across individuals.
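Case 2 — the LSDV model with individual-specific intercepts and a common slope — can be sketched with dummy variables in NumPy (panel dimensions, intercepts, and the slope of 2 are made up):

```python
import numpy as np

rng = np.random.default_rng(8)
N, T = 4, 50                                # 4 individuals, 50 periods each
alphas = np.array([1.0, 2.0, 3.0, 4.0])     # individual fixed effects
x = rng.normal(size=(N, T))
y = alphas[:, None] + 2.0 * x + 0.1 * rng.normal(size=(N, T))

# LSDV: one dummy column per individual plus the common regressor.
D = np.kron(np.eye(N), np.ones((T, 1)))     # dummy matrix, shape (N*T, N)
X = np.column_stack([D, x.ravel()])
b = np.linalg.lstsq(X, y.ravel(), rcond=None)[0]
# b[:N] estimate the N intercepts, b[N] the common slope.
```

The same slope estimate can be obtained by demeaning within each individual, which is why LSDV runs into the degrees-of-freedom and time-invariant-regressor problems cautioned about below.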
✓  A Caution in the Use of the Fixed Effects, or LSDV, Model
1.      If you introduce too many dummy variables, you will run up against the degrees of freedom problem.
2.      With so many variables in the model, there is always the possibility of multicollinearity, which might make precise estimation of one or more parameters difficult.
3.      Suppose in the FEM we also include variables such as sex, color, and ethnicity, which are time invariant because an individual’s sex, color, or ethnicity does not change over time.
4.      We have to think carefully about the error term, uit.
✓  Estimation of Panel Data Regression Models: The Random Effects Approach
✓  Panel Data Regressions: Some Concluding Comments
1.      Hypothesis testing with panel data.
2.      Heteroscedasticity and autocorrelation in the error components model (ECM).
3.      Unbalanced panel data.
4.      Dynamic panel data models in which the lagged values of the regressand (Yit) appear as explanatory variables.
5.      Simultaneous equations involving panel data.
6.      Qualitative dependent variables and panel data.