Begin ts regression
parent
fe179958fb
commit
92099f5094
1 changed files with 67 additions and 0 deletions
67
main.tex
@ -882,6 +882,73 @@ In R, finding the AIC-minimizing $ARMA(p,q)$-model is convenient with the use of
\vspace{.2cm}
Using \verb|auto.arima()| should always be complemented by a visual inspection of the time series to assess stationarity and by a look at the ACF/PACF plots for a second opinion on suitable models. Finally, model diagnostics with the usual residual plots decide whether the model is useful in practice.
\section{Time series regression}
We speak of time series regression if response and predictors are time series, i.e. if they were observed in a sequence.
\subsection{Model}
In principle, it is perfectly fine to apply the usual OLS setup:
$$Y_t = \beta_0 + \beta_1 x_{t1} + \dots + \beta_p x_{tp} + E_t$$
Be careful: this assumes that the errors $E_t$ are uncorrelated (often not the case)! \\
\vspace{.2cm}
With correlated errors, the estimates $\hat{\beta}_j$ are still unbiased, but more efficient estimators than OLS exist. The standard errors are wrong, often underestimated, causing spurious significance. $\rightarrow$ GLS!
\begin{itemize}
\item The series $Y_t, x_{t1} ,\dots, x_{tp}$ can be stationary or non-stationary.
\item It is crucial that there is no feedback from the response $Y_t$ to the predictor variables $x_{t1},\dots, x_{tp}$, i.e. we require an input/output system.
\item $E_t$ must be stationary and independent of $x_{t1},\dots, x_{tp}$, but need not be white noise, i.e. it may show some serial correlation.
\end{itemize}
\subsubsection{Finding correlated errors}
\begin{enumerate}
\item Start by fitting an OLS regression and analyze residuals
\item Continue with a time series plot of OLS residuals
\item Also analyze ACF and PACF of OLS residuals
\end{enumerate}
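The three steps above can be sketched in R as follows. This is only an illustration on simulated data: the series \verb|x|, \verb|y| and the AR(1) coefficient $0.7$ are assumptions made for the example, not part of the lecture data.

\begin{lstlisting}[language=R]
set.seed(1)
n <- 200
x <- rnorm(n)                                   # hypothetical predictor
# AR(1) errors with coefficient 0.7 (assumed for illustration)
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
y <- 1 + 2 * x + e                              # hypothetical response

fit.lm <- lm(y ~ x)                             # step 1: OLS fit
res <- ts(resid(fit.lm))                        # residuals as a time series
plot(res, main = "OLS residuals")               # step 2: time series plot
acf(res)                                        # step 3: ACF ...
pacf(res)                                       # ... and PACF of the residuals
\end{lstlisting}

With errors simulated like this, one would expect a slowly decaying ACF and a PACF dominated by lag 1, which hints at an AR(1) structure in the errors.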
\subsubsection{Durbin-Watson test}
The Durbin-Watson approach is a test for autocorrelated errors in regression modeling based on the test statistic:
$$D = \frac{\sum_{t=2}^N (r_t - r_{t-1})^2}{\sum_{t=1}^N r_t^2} \approx 2(1-\hat{\rho}_1) \in [0,4]$$
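A short sketch of where the approximation comes from, assuming that $N$ is large enough for the boundary terms $r_1^2$ and $r_N^2$ to be negligible and using $\hat{\rho}_1 \approx \sum_{t=2}^N r_t r_{t-1} / \sum_{t=1}^N r_t^2$. Expanding the square in the numerator gives
$$\sum_{t=2}^N (r_t - r_{t-1})^2 = \sum_{t=2}^N r_t^2 + \sum_{t=2}^N r_{t-1}^2 - 2 \sum_{t=2}^N r_t r_{t-1} \approx 2 \sum_{t=1}^N r_t^2 - 2 \sum_{t=2}^N r_t r_{t-1}$$
and dividing by $\sum_{t=1}^N r_t^2$ yields $D \approx 2 - 2\hat{\rho}_1 = 2(1-\hat{\rho}_1)$. Hence $D \approx 2$ corresponds to no lag-1 correlation, values near $0$ to strong positive and values near $4$ to strong negative lag-1 correlation.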
\begin{itemize}
\item This is implemented in R: \verb|dwtest()| in \verb|library(lmtest)|. A p-value for the null of no autocorrelation is computed.
\item This test does not detect all autocorrelation structures. If the null is not rejected, the residuals may still be autocorrelated.
\item Never forget to check ACF/PACF of the residuals! (Test has only limited power)
\end{itemize}
Example:
\begin{lstlisting}[language=R]
> library(lmtest)
> dwtest(fit.lm)
data: fit.lm
DW = 0.5785, p-value < 2.2e-16
alt. hypothesis: true autocorrelation is greater than 0
\end{lstlisting}
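The test statistic can also be computed by hand from the OLS residuals, which makes the relation to the lag-1 autocorrelation visible. The following is a minimal sketch on simulated data; the series and the AR(1) coefficient $0.7$ are assumptions for illustration only.

\begin{lstlisting}[language=R]
set.seed(1)
n <- 200
x <- rnorm(n)                                   # hypothetical predictor
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
y <- 1 + 2 * x + e                              # hypothetical response
r <- resid(lm(y ~ x))                           # OLS residuals

# Durbin-Watson statistic: sum of squared differences over sum of squares
D <- sum(diff(r)^2) / sum(r^2)
# estimated lag-1 autocorrelation of the residuals
rho1 <- acf(r, plot = FALSE)$acf[2]
c(D = D, approx = 2 * (1 - rho1))
\end{lstlisting}

Both numbers should be close to each other; they differ only by the boundary terms that the approximation neglects.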
\subsubsection{Cochrane-Orcutt method}
This is a simple, iterative approach for correctly dealing with time series regression. We consider the pollutant example:
$$Y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + E_t$$
with
$$E_t = \alpha E_{t-1} + U_t$$
and $U_t \sim N(0, \sigma_U^2)$ i.i.d. \\
\vspace{.2cm}
The fundamental trick is using the transformation\footnote{See script for more details}:
$$Y_t' = Y_t - \alpha Y_{t-1}$$
This will lead to a regression problem with i.i.d. errors:
$$Y_t' = \beta_0' + \beta_1 x'_{t1} + \beta_2 x'_{t2} + U_t$$
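A brief sketch of why the transformation works (not necessarily the derivation given in the script): subtracting $\alpha$ times the model equation at time $t-1$ from the model equation at time $t$ gives
$$Y_t - \alpha Y_{t-1} = \beta_0 (1-\alpha) + \beta_1 (x_{t1} - \alpha x_{t-1,1}) + \beta_2 (x_{t2} - \alpha x_{t-1,2}) + (E_t - \alpha E_{t-1})$$
With $\beta_0' = \beta_0 (1-\alpha)$, $x'_{tj} = x_{tj} - \alpha x_{t-1,j}$ and $E_t - \alpha E_{t-1} = U_t$, this is a regression with i.i.d. errors $U_t$. The slopes $\beta_1, \beta_2$ are unchanged; only the intercept is rescaled.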
The idea is to run an OLS regression first, determine the transformation (i.e. estimate $\alpha$) from the residuals, and finally obtain corrected estimates from the transformed regression.
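One pass of this procedure can be sketched in R as follows. This is a minimal illustration on simulated data, not the pollutant example: the predictors \verb|x1|, \verb|x2|, the coefficients and $\alpha = 0.6$ are assumptions, and estimating $\alpha$ by regressing the residuals on their lagged values is one common choice.

\begin{lstlisting}[language=R]
set.seed(1)
n <- 200
x1 <- rnorm(n); x2 <- rnorm(n)                  # hypothetical predictors
e <- as.numeric(arima.sim(model = list(ar = 0.6), n = n))
y <- 1 + 2 * x1 - x2 + e                        # hypothetical response

fit.ols <- lm(y ~ x1 + x2)                      # 1) plain OLS fit
r <- resid(fit.ols)
# 2) estimate alpha: regress r_t on r_{t-1} without intercept
alpha <- coef(lm(r[-1] ~ 0 + r[-n]))[[1]]
# 3) transform response and predictors
y.t  <- y[-1]  - alpha * y[-n]
x1.t <- x1[-1] - alpha * x1[-n]
x2.t <- x2[-1] - alpha * x2[-n]
# 4) OLS on the transformed data
fit.co <- lm(y.t ~ x1.t + x2.t)
# back-transform the intercept: beta0 = beta0' / (1 - alpha)
beta0 <- coef(fit.co)[[1]] / (1 - alpha)
\end{lstlisting}

In the iterative version, steps 2 to 4 would be repeated with residuals recomputed from the updated coefficients until the estimate of $\alpha$ stabilizes.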
\subsection{Generalized least squares (GLS)}
OLS regression assumes a diagonal error covariance matrix, but there is a generalization to $Var(E) = \sigma^2 \Sigma$. \\
For using the GLS approach, i.e. for correcting the dependent errors, we need an estimate of the error covariance matrix $\Sigma = SS^T$. \\
We can then obtain the (simultaneous) estimates:
$$\hat{\beta} =(X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y$$
With $Var(\hat{\beta}) = (X^T \Sigma^{-1} X)^{-1} \sigma^2$
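
A sketch of where these formulas come from, assuming $S$ is invertible: multiplying the model $y = X\beta + E$ from the left by $S^{-1}$ gives the transformed quantities
$$y^* = S^{-1} y, \quad X^* = S^{-1} X, \quad E^* = S^{-1} E$$
with uncorrelated errors, since
$$Var(E^*) = S^{-1} \, \sigma^2 \Sigma \, S^{-T} = \sigma^2 S^{-1} S S^T S^{-T} = \sigma^2 I$$
Ordinary least squares on the transformed problem then yields
$$\hat{\beta} = (X^{*T} X^*)^{-1} X^{*T} y^* = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y$$
because $X^{*T} X^* = X^T S^{-T} S^{-1} X = X^T \Sigma^{-1} X$.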
\subsubsection{R example}
\begin{lstlisting}[language=R]
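> library(nlme)
> # AR(2) correlation structure for the errors, ordered by time
> corStruct <- corARMA(form = ~time, p = 2)
> # GLS fit: regression coefficients and AR(2) parameters are estimated jointly
> fit.gls <- gls(temp ~ time + season, data = dat, correlation = corStruct)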
\end{lstlisting}
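To try the call above without the original data, a self-contained sketch on simulated data may help. The data frame \verb|dat| below is made up for illustration (monthly values with a linear trend, a seasonal effect and AR(2) errors with assumed coefficients $0.5$ and $0.2$); it is not the data set used in the lecture.

\begin{lstlisting}[language=R]
library(nlme)
set.seed(1)
n <- 120
# hypothetical monthly data: integer time index and a 12-level season factor
dat <- data.frame(time = 1:n,
                  season = factor(rep(1:12, length.out = n)))
err <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = n))
dat$temp <- 10 + 0.02 * dat$time + 3 * sin(2 * pi * dat$time / 12) + err

# GLS with AR(2) errors versus plain OLS
fit.gls <- gls(temp ~ time + season, data = dat,
               correlation = corARMA(form = ~time, p = 2))
fit.ols <- lm(temp ~ time + season, data = dat)

# compare the standard errors of both fits
cbind(OLS = summary(fit.ols)$coefficients[, "Std. Error"],
      GLS = summary(fit.gls)$tTable[, "Std.Error"])
\end{lstlisting}

The comparison of standard errors illustrates the point made in the model section: with serially correlated errors, the OLS standard errors cannot be trusted, whereas GLS accounts for the error structure.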
\section{General concepts}
\subsection{AIC}
The \textit{Akaike-information-criterion} is useful for determining the order of an $ARMA(p,q)$ model. The formula is as follows: