Time Series Regression: The Return

BUS 323 Forecasting and Risk Analysis

Selecting predictors

  • Often many potential predictors
  • For causal inference: theory
  • For improved fit: a measure of accuracy
    • We’ll look at a few.

Adjusted \(R^{2}\)

  • Adding regressors never decreases \(R^{2}\), even when they are irrelevant.
    • Use adjusted \(R^{2}\) instead.

      \[ \bar{R}^{2} = 1 - (1 - R^{2}) \frac{T - 1}{T - k - 1} \]

    • \(T\): # of observations

    • \(k\): # of predictors

  • Maximizing \(\bar{R}^{2}\) is equivalent to minimizing the standard error of the regression, \(\hat{\sigma}_{e}\).
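A minimal sketch of the adjusted \(R^{2}\) formula in base R, assuming simulated data (the variable names, seed, and coefficients are illustrative and not from the course data). It computes \(\bar{R}^{2}\) by hand and compares it with the value reported by summary().

```r
# Simulated data: x2 is deliberately irrelevant (illustrative assumption).
set.seed(323)
n_obs <- 50                       # T in the formula (T is avoided as a name: it means TRUE in R)
x1 <- rnorm(n_obs)
x2 <- rnorm(n_obs)
y  <- 1 + 2 * x1 + rnorm(n_obs)

fit <- lm(y ~ x1 + x2)
k   <- length(coef(fit)) - 1      # number of predictors, excluding the intercept
r2  <- summary(fit)$r.squared

# Adjusted R^2 from the formula above
adj_r2 <- 1 - (1 - r2) * (n_obs - 1) / (n_obs - k - 1)

# Should agree with the value lm() reports
all.equal(adj_r2, summary(fit)$adj.r.squared)
```

Because \((T-1)/(T-k-1) > 1\) whenever \(k > 0\), the adjusted value is always below the unadjusted one.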

Cross-validation

  • Leave-one-out cross-validation
    • Remove observation \(t\)
    • Estimate the model based on remaining data
    • Compute the error for the omitted observation: \(e_{t}^{*} = y_{t} - \hat{y}_{t}\)
    • Repeat for \(t=1,...,T\)
    • Compute MSE from \(e_{1}^{*},...,e_{T}^{*}\).
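The steps above can be sketched in base R. This is a sketch on simulated data (names, seed, and coefficients are assumptions), not the course's implementation. It also checks the loop against the standard shortcut for linear regression, where the leave-one-out error equals the ordinary residual divided by one minus the leverage.

```r
# Simulated data (illustrative assumption)
set.seed(323)
n_obs <- 40
dat <- data.frame(x = rnorm(n_obs))
dat$y <- 0.5 + 1.5 * dat$x + rnorm(n_obs)

# Explicit leave-one-out loop
loo_err <- numeric(n_obs)
for (i in seq_len(n_obs)) {
  fit_i <- lm(y ~ x, data = dat[-i, ])                          # estimate without observation i
  loo_err[i] <- dat$y[i] - predict(fit_i, newdata = dat[i, ])   # error e_i^*
}
cv_mse <- mean(loo_err^2)

# Shortcut for linear regression: residual / (1 - leverage), one fit only
fit_all  <- lm(y ~ x, data = dat)
cv_short <- mean((residuals(fit_all) / (1 - hatvalues(fit_all)))^2)

all.equal(cv_mse, cv_short)
```

The shortcut is why the CV statistic is cheap to report for regression models: the model is estimated once rather than \(T\) times.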

Akaike’s Information Criterion

\[ AIC = T \log\left(\frac{SSE}{T}\right) + 2(k+2) \]

  • \(k+2\): number of parameters to be estimated
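A minimal sketch of the AIC formula in base R, assuming simulated data (names, seed, and coefficients are illustrative). It also shows how the result relates to stats::AIC(), which keeps the constant part of the Gaussian log-likelihood and therefore reports a different number for the same model.

```r
# Simulated data (illustrative assumption)
set.seed(323)
n_obs <- 60
x1 <- rnorm(n_obs)
x2 <- rnorm(n_obs)
y  <- 1 + 2 * x1 - x2 + rnorm(n_obs)

fit <- lm(y ~ x1 + x2)
k   <- 2                                  # number of predictors
sse <- sum(residuals(fit)^2)              # sum of squared errors

# AIC as defined above
aic_slide <- n_obs * log(sse / n_obs) + 2 * (k + 2)

# stats::AIC() differs only by the constant n * (1 + log(2 * pi))
all.equal(AIC(fit), aic_slide + n_obs * (1 + log(2 * pi)))
```

Since the constant depends only on the number of observations, both versions rank models identically when they are fitted to the same data.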

Corrected Akaike’s Information Criterion

  • For small \(T\), AIC tends to select too many predictors.
  • Instead, use corrected AIC:

    \[ AIC_{c} = AIC + \frac{2(k+2)(k+3)}{T-k-3} \]
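To see why the correction matters mainly for small \(T\), the sketch below evaluates the term that AICc adds to AIC for a short and a long series. The helper name aicc_penalty and the chosen values of \(T\) and \(k\) are hypothetical, picked only for illustration.

```r
# Extra penalty that AICc adds to AIC (hypothetical helper name)
aicc_penalty <- function(n_obs, k) {
  2 * (k + 2) * (k + 3) / (n_obs - k - 3)
}

small <- aicc_penalty(n_obs = 20,  k = 4)   # 84 / 13
large <- aicc_penalty(n_obs = 200, k = 4)   # 84 / 193

c(small = small, large = large)
```

With four predictors, the correction is roughly 6.5 for 20 observations but under 0.5 for 200, so AIC and AICc select nearly the same model once \(T\) is large.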

Implementation

  • Recall: Regression forecast for US consumption:
library(fpp3)
# Regress consumption growth on four predictors
fit_consMR <- us_change |>
  model(tslm = TSLM(Consumption ~ Income + Production +
                                    Unemployment + Savings))
# Print coefficient estimates and fit statistics
report(fit_consMR)

Implementation

  • Use glance() to obtain the measures discussed earlier:
glance(fit_consMR) |>
  select(adj_r_squared, CV, AIC, AICc)
# A tibble: 1 × 4
  adj_r_squared    CV   AIC  AICc
          <dbl> <dbl> <dbl> <dbl>
1         0.763 0.104 -457. -456.

First differencing

  • First difference:

    \[ y_{t}' = y_{t} - y_{t-1} \]

Random walk model

  • White noise first difference: \(y_{t}' = \epsilon_{t}\)
  • \(y_{t}\) modeled as a random walk:

    \[ y_{t} = y_{t-1} + \epsilon_{t} \]
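A minimal sketch of the link between the two equations, assuming simulated white noise (seed and series length are arbitrary). Cumulating white noise builds a random walk, and first differencing recovers the noise.

```r
# Simulated white noise (illustrative assumption)
set.seed(323)
eps <- rnorm(250)

# Random walk: y_t = y_{t-1} + eps_t, starting from y_0 = 0
y <- cumsum(eps)

# First difference: y_t - y_{t-1}; one observation is lost
dy <- diff(y)

# The differences are the original shocks eps_2, ..., eps_T
all.equal(dy, eps[-1])
```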

Example

  • Random walk models used for non-stationary data.
  • e.g. Financial data:
google_2015 <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) == 2015)

Example: plot

Example: first difference plot

Example: ACF plots

  • Check ACF for Google stock price and its first difference:
google_2015 |> ACF(Close) |>
  autoplot() + labs(subtitle = "Google closing stock price")

Example: ACF plots

google_2015 |> ACF(difference(Close)) |>
  autoplot() + labs(subtitle = "Google closing stock price first difference")
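The same comparison can be sketched without the stock data. The example below is an assumption-laden stand-in: it uses a simulated random walk rather than the Google series and base R's acf() rather than ACF(). The level series should show autocorrelations that decay slowly from near 1, while the differenced series should show autocorrelations near zero.

```r
# Simulated random walk (illustrative stand-in for a stock price)
set.seed(323)
eps <- rnorm(500)
y   <- cumsum(eps)

# Sample autocorrelations at lags 1 to 10 (lag 0 dropped)
acf_level <- acf(y,       lag.max = 10, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(y), lag.max = 10, plot = FALSE)$acf[-1]

round(rbind(level = acf_level, difference = acf_diff), 2)
```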

Random walk models

  • Typical features:
    • Long periods of apparent trends
    • Sudden changes in trend direction
  • Forecasts from a random walk model equal the last observation: the naive method.
  • If the differences have a non-zero mean: \(y_{t} = c + y_{t-1} + \epsilon_{t}\), where \(c\) is the average change between consecutive observations.
    • Drift method!
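A minimal sketch of the drift estimate in base R, assuming a simulated random walk with drift (the true drift of 0.3, the seed, and the forecast horizon are illustrative). Because the differences telescope, their average equals the slope of the line joining the first and last observations.

```r
# Simulated random walk with drift (illustrative assumption)
set.seed(323)
y <- cumsum(0.3 + rnorm(200))
n_obs <- length(y)

# c is estimated by the average of the first differences
c_hat <- mean(diff(y))

# Same value as the slope between the first and last observation
all.equal(c_hat, (y[n_obs] - y[1]) / (n_obs - 1))

# Drift forecasts: last value plus h times the average change
h <- 1:5
drift_fc <- y[n_obs] + h * c_hat
```

In fpp3, the corresponding model specification is RW(y ~ drift()).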

Second differencing

\[ \begin{aligned} y_{t}'' & = y_{t}' - y_{t-1}' \\ & = (y_{t} - y_{t-1}) - (y_{t-1} - y_{t-2}) \\ & = y_{t} - 2y_{t-1} + y_{t-2} \end{aligned} \]

  • “changes in changes”
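A small numeric check of the identity above, using base R's diff() on a made-up series (the values are arbitrary).

```r
# Made-up series (illustrative assumption)
y <- c(3, 5, 10, 18, 29)

# Second difference via diff()
d2 <- diff(y, differences = 2)

# Second difference from the expanded formula y_t - 2 y_{t-1} + y_{t-2}
manual <- y[3:5] - 2 * y[2:4] + y[1:3]

all.equal(d2, manual)
```

The first differences are 2, 5, 8, 11, so the "changes in changes" are constant at 3, and two observations are lost.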

Seasonal differencing

\[ y_{t}' = y_{t} - y_{t-m} \]

  • “lag-\(m\) differences”
  • If the seasonally differenced data are white noise, the model \(y_{t} = y_{t-m} + \epsilon_{t}\) yields seasonal naive forecasts.
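A minimal sketch of lag-\(m\) differencing with base R's diff(), assuming a made-up quarterly series with \(m = 4\) (the values are arbitrary).

```r
# Two years of made-up quarterly data (illustrative assumption)
y <- c(10, 20, 30, 40,    # year 1: Q1..Q4
       12, 23, 31, 44)    # year 2: Q1..Q4
m <- 4                    # seasonal period

# Each value minus the value from the same quarter one year earlier
seasonal_diff <- diff(y, lag = m)
seasonal_diff             # 2 3 1 4
```

The first \(m\) observations are lost because they have no value one season earlier to compare against.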

Implementation

  • To take first differences, use difference(); it pads the start of the series with NA, so the length is unchanged.
  • To take lag-\(m\) (seasonal) differences, use difference(object, lag = m)