Complete (S)ARIMA and ARCH chapters
Parent 92099f5094, commit 7e86deb1be. 1 changed file, main.tex, with 118 additions and 1 deletion.
@@ -942,16 +942,132 @@ $$\hat{\beta} =(X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y$$

With $Var(\hat{\beta}) = (X^T \Sigma^{-1} X)^{-1} \sigma^2$

\subsubsection{R example}

Package \verb|nlme| provides the function \verb|gls()|. It only works if the correlation structure of the errors is supplied; this has to be determined from the residuals of an OLS regression first.

\begin{lstlisting}[language=R]
> library(nlme)
> corStruct <- corARMA(form=~time, p=2)
> fit.gls <- gls(temp~time+season, data=dat, correlation=corStruct)
\end{lstlisting}

The output contains the regression coefficients and their standard errors, as well as the AR coefficients plus some further information about the model (log-likelihood, AIC, ...).

\subsection{Missing input variables}

\begin{itemize}

\item Correlated errors in (time series) regression problems are often caused by the absence of crucial input variables.

\item In such cases, it is much better to identify these missing variables and include them in the regression model.

\item However, in practice this is not always possible, because the crucial variables may simply not be available.

\item \textbf{Note:} Time series regression methods for correlated errors such as GLS can be seen as a sort of emergency kit for the case where the missing variables cannot be added. If you can do without them, even better!

\end{itemize}

\section{ARIMA and SARIMA}

\textbf{Why?} \\
Many time series in practice show trends and/or seasonality. While we can decompose them and describe the stationary part, it might be attractive to model them directly. \\
\vspace{.2cm}

\textbf{Advantages} \\
Forecasting is convenient, and AIC-based decisions about the presence of trend/seasonality become feasible. \\
\vspace{.2cm}

\textbf{Disadvantages} \\
The decomposition is less transparent, and forecasting has a bit of a black-box flavor. \\

\subsection{ARIMA(p,d,q)-models}

ARIMA models are aimed at describing series that have a trend which can be removed by differencing, and where the differences can be described with an ARMA($p,q$)-model. \\
\vspace{.2cm}

\textbf{Definition}\\
If
$$Y_t = (1-B)^d X_t \sim ARMA(p,q)$$
then
$$X_t \sim ARIMA(p,d,q)$$
For $d = 1$ this simply means $Y_t = X_t - X_{t-1}$. In most practical cases, using $d = 1$ will be enough! \\
\vspace{.2cm}

\textbf{Notation}\\
$$\Phi(B)(1-B)^d X_t = \Theta(B) E_t$$
\vspace{.2cm}

\textbf{Stationarity}\\
ARIMA processes are non-stationary whenever $d > 0$; they can be rewritten as non-stationary ARMA($p+d$,$q$) processes (see below).

\subsubsection{Fitting ARIMA in R}

\begin{enumerate}

\item Choose the appropriate order of differencing, usually $d = 1$ or (in rare cases) $d = 2$, such that the result is a stationary series.

\item Analyze ACF and PACF of the differenced series. If the stylized facts of an ARMA process are present, decide on the orders $p$ and $q$.

\item Fit the model using the \verb|arima()| procedure. This can be done on the original series by setting $d$ accordingly, or on the differences, by setting $d = 0$ and argument \verb|include.mean=FALSE|.

\item Analyze the residuals; these must look like White Noise. If several competing models are appropriate, use AIC to decide on the winner.

\end{enumerate}

\textbf{Example}\footnote{Full example in script pages 117ff} \\
Plausible models for the logged oil prices after inspection of ACF/PACF of the differenced series (which seems stationary): ARIMA(1,1,1) or ARIMA(2,1,1).

\begin{lstlisting}[language=R]
> arima(lop, order=c(1,1,1))
Coefficients:
          ar1     ma1
      -0.2987  0.5700
s.e.   0.2009  0.1723
sigma^2 = 0.006642: ll = 261.11, aic = -518.22
\end{lstlisting}
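
As a minimal sketch of the AIC-based choice between the two candidate orders (assuming \verb|lop| is the logged oil price series from this example), the fits can be compared directly; the lower AIC value wins:

\begin{lstlisting}[language=R]
> fit.111 <- arima(lop, order=c(1,1,1))
> fit.211 <- arima(lop, order=c(2,1,1))
> AIC(fit.111); AIC(fit.211)  # smaller value = preferred model
\end{lstlisting}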
\subsubsection{Rewriting ARIMA as Non-Stationary ARMA}

Any ARIMA($p,d,q$) model can be rewritten in the form of a non-stationary ARMA($p+d$,$q$) process. This provides some deeper insight, especially for the task of forecasting.
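
As an illustration (a sketch for $p = d = q = 1$, writing the AR polynomial as $\Phi(B) = 1 - \phi_1 B$), expanding the differencing operator gives
$$\Phi(B)(1-B)X_t = \Theta(B)E_t \quad\Leftrightarrow\quad \big(1-(1+\phi_1)B+\phi_1 B^2\big)X_t = \Theta(B)E_t,$$
i.e. an ARMA(2,1) whose AR polynomial has a unit root at $B = 1$ and which is therefore non-stationary.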
\subsection{SARIMA(p,d,q)(P,D,Q)$^S$}

We have learned that it is also possible to use differencing to obtain a stationary series from one that features both a trend and a seasonal effect.

\begin{enumerate}

\item Removing the seasonal effect by differencing at lag 12 \\ \begin{center}$Y_t = X_t - X_{t-12} = (1-B^{12})X_t$ \end{center}

\item Usually, further differencing at lag 1 is required to obtain a series that has constant global mean and is stationary (see the R sketch after this list) \\ \begin{center} $Z_t = Y_t - Y_{t-1} = (1-B)Y_t = (1-B)(1-B^{12})X_t = X_t - X_{t-1} - X_{t-12} + X_{t-13}$ \end{center}

\end{enumerate}
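
A minimal R sketch of these two differencing steps, using the built-in \verb|AirPassengers| series (monthly data, period 12) purely as an illustrative stand-in:

\begin{lstlisting}[language=R]
> x <- log(AirPassengers)  # monthly series with period 12
> y <- diff(x, lag=12)     # step 1: seasonal differencing
> z <- diff(y)             # step 2: additional differencing at lag 1
> plot(z)                  # visual check for stationarity
\end{lstlisting}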

The stationary series $Z_t$ is then modelled with some special kind of ARMA($p,q$) model. \\
\vspace{.2cm}

\textbf{Definition} \\
A series $X_t$ follows a SARIMA($p,d,q$)($P,D,Q$)$^S$-process if the following equation holds:
$$\Phi(B)\Phi_S(B^S) Z_t = \Theta(B) \Theta_S(B^S) E_t$$
Here, the series $Z_t$ originates from $X_t$ after appropriate seasonal and trend differencing: $Z_t = (1-B)^d (1-B^S)^D X_t$ \\
\vspace{.2cm}

In most practical cases, using differencing order $d = D = 1$ will be sufficient. The choice of $p,q,P,Q$ happens via ACF/PACF or via AIC-based decisions.

\subsubsection{Fitting SARIMA}

\begin{enumerate}

\item Perform seasonal differencing of the data. The lag $S$ is determined by the period. Order $D = 1$ is mostly enough.

\item Decide if additional differencing at lag 1 is required for stationarity. If not, then $d = 0$. If yes, then try $d = 1$.

\item Analyze ACF/PACF of $Z_t$ to determine $p,q$ for the short-term dependency and $P,Q$ for the dependency at multiples of the period.

\item Fit the model using \verb|arima()| by setting \verb|order=c(p,d,q)| and \verb|seasonal=c(P,D,Q)| according to your choices (see the sketch after this list).

\item Check the accuracy of the model by residual analysis. The residuals must look like White Noise and approximately Gaussian.

\end{enumerate}
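
A minimal sketch of step 4, again using \verb|AirPassengers| as an illustrative stand-in and assuming the classical airline orders SARIMA(0,1,1)(0,1,1)$^{12}$:

\begin{lstlisting}[language=R]
> fit <- arima(log(AirPassengers), order=c(0,1,1),
+              seasonal=list(order=c(0,1,1), period=12))
> tsdiag(fit)  # residual plot, residual ACF, Ljung-Box p-values
\end{lstlisting}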
\section{ARCH/GARCH-models}

The basic assumption for ARCH/GARCH models is as follows:
$$X_t = \mu_t + E_t$$
where $E_t = \sigma_t W_t$ and $W_t$ is white noise. \\
Here, both the conditional mean and the conditional variance are non-trivial,
$$\mu_t = E[X_t | X_{t-1},X_{t-2},\dots], \quad \sigma_t^2 = Var[X_t | X_{t-1},X_{t-2},\dots],$$
and can be modelled using a mixture of ARMA and GARCH. \\
\vspace{.2cm}

For simplicity, we here assume that both the conditional and the global mean are zero, $\mu = \mu_t = 0$, and consider pure ARCH processes only, where:
$$X_t = \sigma_t W_t \; \mathrm{with} \; \sigma_t = f(X_{t-1}^2,X_{t-2}^2,\dots,X_{t-p}^2)$$

\subsection{ARCH(p)-model}

A time series $X_t$ is \textit{autoregressive conditional heteroskedastic} of order $p$, abbreviated ARCH($p$), if:
$$X_t = \sigma_t W_t$$
with $\sigma_t = \sqrt{\alpha_0 + \sum_{i=1}^p \alpha_i X_{t-i}^2}$

It is obvious that an ARCH($p$) process shows volatility, as the conditional variance depends on the past squared observations:
$$Var(X_t \,|\, X_{t-1},X_{t-2},\dots) = \sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2 + \dots + \alpha_p X_{t-p}^2$$

We can determine the order of an ARCH($p$) process by analyzing ACF and PACF of the squared time series data, as illustrated below. We then again search for an exponential decay in the ACF and a cut-off in the PACF.
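
An illustrative sketch, assuming \verb|lret.smi| holds the log returns used in the fitting example below:

\begin{lstlisting}[language=R]
> par(mfrow=c(1,2))
> acf(lret.smi^2)   # expect an exponential decay
> pacf(lret.smi^2)  # expect a cut-off at lag p
\end{lstlisting}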
\subsubsection{Fitting an ARCH(2)-model}

The simplest option for fitting an ARCH($p$) in R is to use function \verb|garch()| from \verb|library(tseries)|. Be careful, because the \verb|order=c(q,p)| argument differs from most of the literature.

\begin{lstlisting}[language=R]
> fit <- garch(lret.smi, order = c(0,2))
> fit
Call: garch(x = lret.smi, order = c(0, 2))

Coefficient(s):
       a0         a1         a2
6.568e-05  1.309e-01  1.074e-01
\end{lstlisting}

We recommend running a residual analysis afterwards, for instance along the following lines.
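
A possible minimal check (the leading residuals returned by \verb|garch()| are \verb|NA| and are dropped here):

\begin{lstlisting}[language=R]
> res <- na.omit(residuals(fit))
> par(mfrow=c(1,3))
> plot(res, type="l")       # should look like White Noise
> acf(res^2)                # no remaining volatility structure
> qqnorm(res); qqline(res)  # check approximate normality
\end{lstlisting}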
\section{General concepts}

\subsection{AIC}

The \textit{Akaike-information-criterion} is useful for determining the order of an $ARMA(p,q)$ model. The formula is as follows (\textbf{lower is better}):
$$AIC = -2 \log (L) + 2(p+q+k+1)$$
where
\begin{itemize}
@@ -960,6 +1076,7 @@ where

\end{itemize}

For small samples $n$, often a corrected version is used:
$$AICc = AIC + \frac{2(p + q + k + 1)(p + q + k + 2)}{n - p - q - k - 2}$$
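
A small sketch of how this correction could be computed from an \verb|arima()| fit (reusing the \verb|lop| series from the ARIMA example above, and assuming that $p+q+k+1$ equals the number of estimated coefficients plus one for the innovation variance):

\begin{lstlisting}[language=R]
> fit <- arima(lop, order=c(1,1,1))
> n <- length(lop)
> m <- length(coef(fit)) + 1        # p+q+k+1
> AIC(fit) + 2*m*(m+1)/(n - m - 1)  # AICc as defined above
\end{lstlisting}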

\scriptsize

\newpage