What you want to know about Time Series modeling


OK, so I've seen a lot of grief about the time series stuff that can pop up on exams. I ran into it myself recently on that time series shiznit in the Schweser Volume I morning exam (#3). When I went back over the answers, I couldn't believe I actually got so flummoxed that I missed the question about testing the residuals...

___________________________

First, understand what data you're working with. Are you working with data that is best predicted by other data represented as independent variables? This is cross sectional data. Or are you working with data that is best predicted by its own past values? This is time series.

If you're using time series data, plot the data and check for covariance stationarity (a good thing). What is this? Basically, it means that the data sticks around a constant mean, with a constant, finite variance (and a constant covariance with its own past values). If the data just skyrockets off the chart, it exhibits a trend and is not covariance stationary.
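A rough way to back up the eyeball test in numbers, assuming you're happy with a crude check rather than a formal test: split the series in half and compare the mean and variance of each half. The function name `split_half_summary` and both data sets are made up for illustration.

```python
from statistics import mean, pvariance

def split_half_summary(series):
    """Compare mean and variance of the first and second halves of a series.

    This is only an eyeball aid, not a formal stationarity test.
    """
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return {
        "mean_first": mean(first),
        "mean_second": mean(second),
        "var_first": pvariance(first),
        "var_second": pvariance(second),
    }

# Hypothetical data: one series hugging its mean, one trending upward.
stable = [5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6, 4]
trending = list(range(1, 13))

stable_summary = split_half_summary(stable)
trending_summary = split_half_summary(trending)
```

If the two halves have very different means (as the trending series does), that is a hint the data is not covariance stationary.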

If it exhibits a trend, is it a linear trend? Or a loglinear trend? A linear trend is data that increases by a constant amount each period. A loglinear trend is data that increases exponentially, that is, at a constant % rate. Also, look for seasonality and regime changes when plotting the data. What's that? Sales data will be seasonal, with spikes in the 4th quarter. Interest rate data will experience a regime change, as the US Fed pursues different policies under different Fed Chairmen (think Alan Greenspan versus Paul Volcker).
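Here is a minimal sketch of the two trend models, assuming plain one-variable OLS against a time index t = 1, 2, 3... The helper names (`ols_line`, `linear_trend`, `loglinear_trend`) and the data are hypothetical. The linear trend regresses y on t; the loglinear trend regresses ln(y) on t.

```python
import math

def ols_line(x, y):
    """One-variable OLS: return (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def linear_trend(y):
    """Fit y_t = b0 + b1 * t."""
    t = list(range(1, len(y) + 1))
    return ols_line(t, y)

def loglinear_trend(y):
    """Fit ln(y_t) = b0 + b1 * t (requires positive data)."""
    t = list(range(1, len(y) + 1))
    return ols_line(t, [math.log(v) for v in y])

# Hypothetical data: one series adds 5 units a period, one grows 5% a period.
steady = [100 + 5 * t for t in range(1, 9)]
compound = [100 * 1.05 ** t for t in range(1, 9)]

lin_b0, lin_b1 = linear_trend(steady)        # slope is the constant amount
log_b0, log_b1 = loglinear_trend(compound)   # slope is the log growth rate
growth_rate = math.exp(log_b1) - 1           # convert back to a % rate
```

The slope of the linear model is the constant amount added each period; the slope of the loglinear model converts to the constant % growth rate via exp(b1) - 1.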

If the data is not covariance stationary, first difference it! This is EASY. If you're working with monthly sales data, and you're forecasting May's sales, take the differences April minus March, March minus February, February minus January, and so on, then model those differences instead of the raw levels. Remember, first differenced data will have n-1 observations.
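First differencing in code, as a quick sketch. The sales numbers are made up (January through April), and `first_difference` is just an illustrative name.

```python
def first_difference(series):
    """Return the period-over-period changes: x[t] - x[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Hypothetical monthly sales, January through April.
sales = [120, 125, 123, 130]

# Feb - Jan, Mar - Feb, Apr - Mar
diffs = first_difference(sales)
```

Four observations go in, three differences come out, which is the n-1 point.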

If the model you are using is an AR model, watch out for a random walk. What is this? It means that the current period value is just the prior period value plus a random error, so b0 = 0 and b1 = 1. Think of currency movement here guys. A random walk is NOT COVARIANCE STATIONARY, and has no mean reverting level, because b0 / (1 - b1) means dividing by zero when b1 = 1. First difference this data to make it covariance stationary. Do the same for random walk with drift data (on this, b0 is not 0 as in a random walk without drift, but b1 is still 1).
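A small sketch of the two ideas here: the mean reverting level of an AR(1), and what a random walk looks like when you build one. The function names and the choice of standard normal errors are my assumptions for illustration.

```python
import random

def mean_reverting_level(b0, b1):
    """Return b0 / (1 - b1), or None when b1 == 1 (no mean reverting level)."""
    if b1 == 1:
        return None
    return b0 / (1 - b1)

def simulate_random_walk(n, drift=0.0, start=0.0, seed=0):
    """Build x[t] = drift + x[t-1] + error, with standard normal errors (assumed)."""
    rng = random.Random(seed)
    x = [start]
    for _ in range(n - 1):
        x.append(drift + x[-1] + rng.gauss(0, 1))
    return x

level_ar1 = mean_reverting_level(1.0, 0.5)   # ordinary AR(1): level exists
level_walk = mean_reverting_level(0.0, 1.0)  # random walk: no level

walk = simulate_random_walk(200)
# First differencing a random walk leaves just the random errors.
steps = [b - a for a, b in zip(walk, walk[1:])]
```

Setting `drift` to something other than zero gives the random walk with drift; the first difference then sticks around the drift instead of around zero.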

------------------------------
As a quick aside, one set of time series data can be used to explain another set of time series data. In this event, check each series for a unit root (b1 = 1; think of this as "divide by zero" in b0 / (1 - b1)). If you are told that one set of data has a unit root, but the other does not, you can't proceed. But if both have a unit root, and the Dickey-Fuller test run the Engle-Granger way determines that the two sets of time series data are COINTEGRATED, then go ahead and proceed with the model.
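The decision rule boils down to a few lines. This is only a sketch of the logic, with a made-up function name; it takes the unit root and cointegration test results as given rather than running the tests. The first branch (neither series has a unit root) is the standard textbook case where the regression is fine as is.

```python
def regression_is_usable(dep_has_unit_root, indep_has_unit_root, cointegrated=False):
    """Decide whether regressing one time series on another can be trusted."""
    if not dep_has_unit_root and not indep_has_unit_root:
        return True   # neither has a unit root: proceed
    if dep_has_unit_root != indep_has_unit_root:
        return False  # only one has a unit root: can't proceed
    return cointegrated  # both have a unit root: proceed only if cointegrated

only_one = regression_is_usable(True, False)
both_cointegrated = regression_is_usable(True, True, cointegrated=True)
both_not_cointegrated = regression_is_usable(True, True, cointegrated=False)
```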
------------------------------

Back to the AR model.

Continue to plot the data and confirm covariance stationarity (aka stationary). Now test for serial correlation. This can come in positive and negative forms. Careful: the Durbin Watson statistic is the tool for trend models, but it is not valid for an AR model. For an AR model, you test the autocorrelations of the residuals with a t-test. The standard error for this is (1 / square root of #observations), so t = autocorrelation / (1 / square root of #observations). What is the autocorrelation of the residuals? Just another fancy way of taking each observation's residual (actual Y - predicted Y) and checking how correlated it is with the residuals from earlier periods (aka serial correlation).

If none of the residual autocorrelations is statistically significant, serial correlation is not a problem.
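Here's a sketch of the residual autocorrelation t-test, using the 1 / square root of #observations standard error. The function names are mine, and the residuals are a made-up extreme case (alternating signs) so the negative serial correlation is obvious.

```python
import math

def autocorrelation(resid, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    denom = sum(d * d for d in dev)
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return num / denom

def autocorr_t_stats(resid, max_lag):
    """t-stat for each lag: autocorrelation / (1 / sqrt(#observations))."""
    se = 1 / math.sqrt(len(resid))
    return {k: autocorrelation(resid, k) / se for k in range(1, max_lag + 1)}

# Hypothetical residuals that flip sign every period (negative serial correlation).
resid = [1.0, -1.0] * 8

t_stats = autocorr_t_stats(resid, 2)
```

Compare each t-stat to the critical value (roughly 2 in absolute value for a decent sample size). Anything bigger means serial correlation is a problem at that lag.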

Let's say serial correlation is a problem in your trend model. Well, we use an AR model in that case. An AR model is built on the data being correlated with its own past, because the independent variables are lags of the dependent variable...of course the data is correlated with itself! DUH! What still has to be clean is the residuals of the AR model.
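An AR(1) is nothing more than a regression of the series on its own prior value. A minimal sketch, assuming plain OLS and a hypothetical helper called `fit_ar1`; the series is noise-free and built from x(t) = 2 + 0.5 * x(t-1) so you can see the coefficients come back out.

```python
def fit_ar1(x):
    """OLS of x[t] on x[t-1]. Returns (b0, b1, residuals)."""
    lagged = x[:-1]
    current = x[1:]
    n = len(current)
    mean_l = sum(lagged) / n
    mean_c = sum(current) / n
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(lagged, current))
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    b1 = sxy / sxx
    b0 = mean_c - b1 * mean_l
    resid = [c - (b0 + b1 * l) for l, c in zip(lagged, current)]
    return b0, b1, resid

# Hypothetical noise-free series generated by x[t] = 2 + 0.5 * x[t-1].
x = [10.0]
for _ in range(9):
    x.append(2 + 0.5 * x[-1])

b0, b1, resid = fit_ar1(x)
```

The residuals returned here are what you would test for serial correlation. Real data will have noise, so the fit won't be exact like this one.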

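Remember that an AR(1) with a seasonal lag can look like an AR(2) because it has two lag terms, but it's not! With quarterly data, for example, the seasonal version is x(t) = b0 + b1*x(t-1) + b2*x(t-4), while an AR(2) uses x(t-1) and x(t-2). Don't get tripped up on that.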

BUUUT, if serial correlation in the AR model's residuals is a problem, just insert another independent variable (another lag of the dependent) and you get an AR(2). If serial correlation is still a problem, repeat and you get an AR(3). Do so until the residual autocorrelations are no longer significant.

Now, test for ARCH. Take the original residuals and square them. Regress each squared residual upon its prior period squared residual. If the slope coefficient is statistically different from zero, you've got an ARCH process folks. That means the variance of the errors changes from period to period, so the regular standard errors can't be trusted. Use generalized least squares method to correct for ARCH.
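The ARCH(1) test as a sketch: square the residuals, regress each on the prior period's squared residual, and look at the slope's t-stat. The name `arch1_test`, the coefficient names a0 and a1, and the residuals (a calm stretch followed by a wild one) are all made up for illustration.

```python
import math

def arch1_test(resid):
    """Regress e[t]^2 on e[t-1]^2. Returns (a0, a1, t_stat_of_a1)."""
    sq = [e * e for e in resid]
    x = sq[:-1]   # prior period squared residual
    y = sq[1:]    # current period squared residual
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    # Standard error of the slope from the regression's own residuals.
    sse = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
    se_a1 = math.sqrt(sse / (n - 2) / sxx)
    return a0, a1, a1 / se_a1

# Hypothetical residuals: small errors, then a burst of large ones.
resid = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1, 2.0, -2.5, 2.2, -1.8, 2.4, -2.1]

a0, a1, t_stat = arch1_test(resid)
```

A slope t-stat beyond the critical value says big errors tend to follow big errors, which is ARCH.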

FINALLY, perform an out of sample forecast performance evaluation and compare models using the RMSE. What is the RMSE? It measures forecast accuracy (your theory versus the real world) and is simply the square root of the average squared error. In sample errors are the residuals from the data used to fit the model; out of sample errors come from forecasting data the model never saw, and that's the comparison that counts. Just take all the forecast errors (aka "e"), square them, and get their arithmetic average. Now, take the square root. That's your RMSE. You want a LOW RMSE...think of it like your Accruals ratios in FSA. You want the lowest number possible. Why? You're dealing with ERRORS friend!
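RMSE in a few lines, as a sketch. The actual values and the two sets of forecasts are invented; imagine they are out of sample forecasts from two competing models.

```python
import math

def rmse(actual, forecast):
    """Square root of the average squared forecast error."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical out of sample data and forecasts from two models.
actual = [10, 12, 11, 13]
model_a = [11, 11, 12, 12]   # every forecast misses by 1
model_b = [12, 10, 13, 11]   # every forecast misses by 2

rmse_a = rmse(actual, model_a)
rmse_b = rmse(actual, model_b)
```

Model A has the lower RMSE, so it is the one you'd pick.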

That's that...