Compare commits
No commits in common. "1a87ddc913dc84fa6a2529d94aacf54f0167631d" and "7731b7dbdd1be4e5f5fbd25ec655921c34767010" have entirely different histories.
1a87ddc913...7731b7dbdd
4 changed files with 4 additions and 344 deletions
Jenkinsfile (vendored): 2 changes
@@ -12,6 +12,6 @@ node {
     }
   }

   stage('Publish PDF') {
-    sh 'scp -i /root/.ssh/id_rsa ats-zf.pdf thisfro@192.168.178.45:/opt/containers/apache2/html/download/latex-previews/ats-zf.pdf'
+    sh 'scp -i /root/.ssh/id_rsa ats-zf.pdf jannis@192.168.178.45:/var/www/html/download/latex-previews/ats-zf.pdf'
   }
 }
Binary file not shown (before: 54 KiB).
BIN img/pacf.png
Binary file not shown (before: 18 KiB).
main.tex: 346 changes
@@ -180,7 +180,7 @@ mathematically formulated by strict stationarity.
 \end{tabular}

 \subsubsection{Weak} \label{weak-stationarity}
-It is impossible to «prove» the theoretical concept of stationarity from data. We can only search for evidence in favor of or against it. \\
+It is impossible to "prove" the theoretical concept of stationarity from data. We can only search for evidence in favor of or against it. \\
 \vspace{0.1cm}
 However, with strict stationarity, even finding evidence is too difficult. We thus resort to the concept of weak stationarity.
@@ -522,346 +522,6 @@ var.ts <- 1/n^2*acf(b,lag=0,type="cov")$acf[1]*(n+2*sum(((n-1):(n-10))*acf(b,10)
mean(b) + c(-1.96,1.96)*sqrt(var.ts)
\end{lstlisting}

\subsection{Partial autocorrelation (PACF)}
The $k$-th partial autocorrelation $\pi_k$ is defined as the correlation between $X_{t+k}$ and $X_t$, given all the values in between.
$$\pi_k = Cor(X_{t+k},X_t | X_{t+1} = x_{t+1},...,X_{t+k-1} = x_{t+k-1})$$
\begin{itemize}
\item Given a time series $X_t$, the partial autocorrelation of lag $k$ is the autocorrelation between $X_t$ and $X_{t+k}$ with the linear dependence of $X_{t+1}$ through to $X_{t+k-1}$ removed.
\item One can draw an analogy to regression. The ACF measures the „simple“ dependence between $X_t$ and $X_{t+k}$, whereas the PACF measures that dependence in a „multiple“ fashion.\footnote{See e.g. \href{https://n.ethz.ch/~jannisp/download/Mathematik-IV-Statistik/zf-statistik.pdf}{\textit{Mathematik IV}}}
\end{itemize}
$$\pi_1 = \rho_1$$
$$\pi_2 = \frac{\rho_2 - \rho_1^2}{1-\rho_1^2}$$
For $AR(1)$ models, we have $\pi_2 = 0$, because $\rho_2 = \rho_1^2$, i.e. there is no conditional relation between $X_t$ and $X_{t+2}$ given $X_{t+1}$.

\begin{lstlisting}[language=R]
pacf(wave, ylim=c(-1,1))
\end{lstlisting}
\begin{figure}[H]
\centering
\includegraphics[width=.25\textwidth]{pacf.png}
\caption{PACF for wave tank}
\label{fig:pacf}
\end{figure}

\section{Basics of modelling}
\subsection{White noise}
\begin{quote}
A time series $(W_1, W_2,..., W_n)$ is a \textbf{White Noise} series if the random variables $W_1, W_2,...$ are i.i.d. with mean zero.
\end{quote}
This implies that all $W_t$ have the same variance $\sigma_W^2$ and
$$Cov(W_i,W_j) = 0 \, \forall \, i \neq j$$
Thus, there is no autocorrelation either: $\rho_k = 0 \, \forall \, k \neq 0$. \\
\vspace{.2cm}
If, in addition, the variables also follow a Gaussian distribution, i.e. $W_t \sim N(0, \sigma_W^2)$, the series is called \textbf{Gaussian White Noise}. The term White Noise is due to the analogy to white light (all wavelengths are equally represented).
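A quick way to get a feeling for this is to simulate Gaussian White Noise and inspect its ACF; a minimal sketch in base R:
\begin{lstlisting}[language=R]
## simulate 200 observations of Gaussian White Noise
set.seed(21)
W <- rnorm(200, mean=0, sd=1)
plot(ts(W))   # no visible structure
acf(W)        # sample autocorrelations near zero for all k != 0
\end{lstlisting}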
\subsection{Autoregressive models (AR)}
In an $AR(p)$ process, the random variable $X_t$ depends on an autoregressive linear combination of the preceding $X_{t-1},..., X_{t-p}$, plus a „completely independent“ term called innovation $E_t$.
$$X_t = \alpha_1 X_{t-1} + ... + \alpha_p X_{t-p} + E_t$$
Here, $p$ is called the order of the AR model. Hence, we abbreviate by $AR(p)$. An alternative notation is with the backshift operator $B$:
$$(1-\alpha_1 B - \alpha_2 B^2 - ... - \alpha_p B^p) X_t = E_t \Leftrightarrow \Phi(B)X_t = E_t$$
Here, $\Phi(B)$ is called the characteristic polynomial of the $AR(p)$. It determines most of the relevant properties of the process.
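For intuition, such a process can be simulated with \verb|arima.sim()|; a minimal sketch for an $AR(2)$ with assumed coefficients $\alpha_1 = 0.5$, $\alpha_2 = -0.3$:
\begin{lstlisting}[language=R]
## simulate an AR(2) with alpha_1=0.5, alpha_2=-0.3 (assumed values)
set.seed(21)
x <- arima.sim(n=200, model=list(ar=c(0.5, -0.3)))
plot(x)
\end{lstlisting}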
\subsubsection{AR(1)-Model}\label{ar-1}
$$X_t = \alpha_1 X_{t-1} + E_t$$
where $E_t$ is i.i.d. with $E[E_t] = 0$ and $Var(E_t) = \sigma_E^2$. We also require that $E_t$ is independent of $X_s$, $s<t$. \\
\vspace{.2cm}
Under these conditions, $E_t$ is a causal White Noise process, or an innovation. Be aware that this is stronger than the i.i.d. requirement: not every i.i.d. process is an innovation, and that property is absolutely central to $AR(p)$-modelling.
\subsubsection{AR(p)-Models and Stationarity}
$AR(p)$-models must only be fitted to stationary time series. Any potential trends and/or seasonal effects need to be removed first. We will also make sure that the processes are stationary. \\
\vspace{.2cm}
\textbf{Conditions}
Any stationary $AR(p)$-process meets
\begin{itemize}
\item $E[X_t] = \mu = 0$
\item all roots of $1-\alpha_1 z - \alpha_2 z^2 - ... - \alpha_p z^p = 0$ exceed one in absolute value (verify with \verb|polyroot()| in R; see the sketch below)
\end{itemize}
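A minimal sketch of the root check, using the assumed coefficients $\alpha_1 = 0.5$, $\alpha_2 = -0.3$ from above (note that \verb|polyroot()| takes the coefficients of $1-\alpha_1 z - \alpha_2 z^2$ in increasing order of powers):
\begin{lstlisting}[language=R]
## roots of 1 - 0.5z + 0.3z^2; all must exceed 1 in absolute value
abs(polyroot(c(1, -0.5, 0.3)))
## [1] 1.825742 1.825742  -> stationary
\end{lstlisting}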
\subsection{Yule-Walker equations}
We observe that there exists a linear equation system built up from the $AR(p)$-coefficients and the ACF-coefficients of up to lag $p$. \\
\vspace{.2cm}
We can use these equations for fitting an $AR(p)$-model (sketched below):
\begin{enumerate}
\item Estimate the ACF from a time series
\item Plug the estimates into the Yule-Walker equations
\item The solutions are the $AR(p)$-coefficients
\end{enumerate}
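A minimal sketch of this recipe for an $AR(2)$, here on the log-lynx data as an assumed example (\verb|ar.yw()| automates the same steps):
\begin{lstlisting}[language=R]
x   <- log(lynx)
rho <- acf(x, plot=FALSE)$acf[2:3]        # rho(1), rho(2)
R   <- matrix(c(1, rho[1], rho[1], 1), 2) # ACF matrix of the equation system
solve(R, rho)                             # AR(2) coefficient estimates
\end{lstlisting}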
\subsection{Fitting AR(p)-models}
This involves 3 crucial steps:
\begin{enumerate}
\item Model Identification
\begin{itemize}
\item is an AR process suitable, and what is $p$?
\item will be based on ACF/PACF-analysis
\end{itemize}
\item Parameter Estimation
\begin{itemize}
\item Regression approach
\item Yule-Walker equations
\item and more (MLE, Burg's algorithm)
\end{itemize}
\item Residual Analysis
\end{enumerate}
\subsubsection{Model identification}
\begin{itemize}
\item $AR(p)$ processes are stationary
\item For all $AR(p)$ processes, the ACF decays exponentially quickly, or is an exponentially damped sinusoid.
\item For all $AR(p)$ processes, the PACF is equal to zero for all lags $k > p$. The behavior before lag $p$ can be arbitrary.
\end{itemize}
If what we observe is fundamentally different from the above, it is unlikely that the series was generated by an $AR(p)$-process. We thus need other models, maybe more sophisticated ones.
\subsubsection{Parameter estimation}
Observed time series are rarely centered, so it is usually inappropriate to fit a pure $AR(p)$ process. All R routines by default assume the shifted process $Y_t = m + X_t$. Thus, we face the problem:
$$(Y_t - m) = \alpha_1(Y_{t-1} - m) + ... + \alpha_p(Y_{t-p} - m) + E_t$$
The goal is to estimate the global mean $m$, the AR-coefficients $\alpha_1,..., \alpha_p$, and some parameters defining the distribution of the innovation $E_t$. We usually assume a Gaussian, hence this is $\sigma_E^2$. \\
\vspace{.2cm}
We will discuss 4 methods for estimating the parameters: \\
\vspace{.2cm}
\textbf{OLS Estimation} \\
If we rethink the previously stated problem, we recognize a multiple linear regression problem without intercept on the centered observations. What we do is (sketched below):
\begin{enumerate}
\item Estimate $\hat{m} = \bar{y}$ and center the series: $x_t = y_t - \hat{m}$
\item Run a regression without intercept on $x_t$ to obtain $\hat{\alpha}_1,\dots,\hat{\alpha}_p$
\item For $\hat{\sigma}_E^2$, take the residual standard error from the output
\end{enumerate}

\vspace{.2cm}
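A minimal sketch of this recipe for an $AR(2)$ on the log-lynx data as an assumed example (\verb|ar.ols()| does the same more conveniently):
\begin{lstlisting}[language=R]
y <- log(lynx)
x <- y - mean(y)                      # 1: center with m.hat = mean(y)
n <- length(x)
d <- data.frame(x0=x[3:n], x1=x[2:(n-1)], x2=x[1:(n-2)])
fit <- lm(x0 ~ x1 + x2 - 1, data=d)   # 2: regression without intercept
coef(fit); summary(fit)$sigma^2       # 3: alpha.hat and sigma_E^2
\end{lstlisting}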
\textbf{Burg's algorithm} \\
While OLS works, the first $p$ instances are never evaluated as responses. This is cured by Burg's algorithm, which uses the property of time-reversal in stochastic processes. We thus evaluate the RSS of forward and backward prediction errors:
$$\sum_{t=p+1}^n \bigg[\bigg(X_t - \sum_{k=1}^p \alpha_k X_{t-k}\bigg)^2 + \bigg(X_{t-p} - \sum_{k=1}^p \alpha_k X_{t-p+k}\bigg)^2 \bigg]$$
In contrast to OLS, there is no explicit solution and numerical optimization is required. This is done with a recursive method called the Durbin-Levinson algorithm (implemented in R).

\begin{lstlisting}[language=R]
f.burg <- ar.burg(llynx, aic=F, order.max=2)
\end{lstlisting}

\vspace{.2cm}
\textbf{Yule-Walker Equations} \\
The Yule-Walker equations yield a linear equation system that connects the true ACF with the true AR-model parameters. We plug in the estimated ACF coefficients:
$$\hat{\rho}(k) = \hat{\alpha}_1\hat{\rho}(k-1) + \dots + \hat{\alpha}_p\hat{\rho}(k-p), \, \mathrm{for} \, k=1,\dots,p$$
and solve the system to obtain the AR-parameter estimates. \\
\vspace{.2cm}
In R we can use \verb|ar.yw()| \\
\vspace{.2cm}
\textbf{Maximum-likelihood estimation} \\
Idea: Determine the parameters such that, given the observed time series $(y_1,\dots, y_n)$, the resulting model is the most plausible (i.e. the most likely) one. \\
This requires the choice of a probability model for the time series. By assuming Gaussian innovations, $E_t \sim N(0,\sigma_E^2)$, any $AR(p)$ process has a multivariate normal distribution:
$$Y = (Y_1,\dots,Y_n) \sim N(m \cdot \vec{1},V)$$
with $V$ depending on $\vec{\alpha},\sigma_E^2$ \\
MLE then provides simultaneous estimates by optimizing:
$$L(\alpha,m,\sigma_E^2) \propto \exp \bigg( -\sum_{t=1}^n(x_t - \hat{x}_t)^2 \bigg)$$

\begin{lstlisting}[language=R]
> f.ar.mle
Call: arima(x = log(lynx), order = c(2, 0, 0))
\end{lstlisting}

\vspace{.2cm}
\textbf{Some remarks} \\
\begin{itemize}
\item All 4 estimation methods are asymptotically equivalent, and even on finite samples the differences are usually small.
\item All 4 estimation methods are non-robust against outliers and perform best on data that are approximately Gaussian.
\item Function \verb|arima()| provides standard errors for $\hat{m}, \hat{\alpha}_1,\dots, \hat{\alpha}_p$ so that statements about significance become feasible and confidence intervals for the parameters can be built.
\item \verb|ar.ols()|, \verb|ar.yw()| and \verb|ar.burg()| allow for convenient choice of the optimal model order $p$ using the AIC criterion. Among these methods, \verb|ar.burg()| is usually preferred.
\end{itemize}
\subsection{Model diagnostics}
\subsubsection{Residual analysis}\label{residual-analysis}
"residuals" = "estimated innovations"
$$\hat{E}_t = (y_t - \hat{m}) - (\hat{\alpha}_1(y_{t-1} - \hat{m}) + \dots + \hat{\alpha}_p(y_{t-p} - \hat{m}))$$
With assumptions as in Chapter \ref{ar-1} \\
\vspace{.2cm}
We can check these assumptions (in R: \verb|tsdisplay(resid(fit))|; sketched below) with
\begin{itemize}
\item Time-series plot of $\hat{E}_t$
\item ACF/PACF-plot of $\hat{E}_t$
\item QQ-plot of $\hat{E}_t$
\end{itemize}
The time-series plot should look like white noise. \\
\vspace{.2cm}
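A minimal sketch, assuming an $AR(2)$ fitted to the log-lynx data and \verb|library(forecast)| for \verb|tsdisplay()|:
\begin{lstlisting}[language=R]
library(forecast)
fit <- arima(log(lynx), order=c(2,0,0))
tsdisplay(resid(fit))                  # time series plot, ACF and PACF
qqnorm(resid(fit)); qqline(resid(fit)) # QQ-plot for the Gaussian assumption
\end{lstlisting}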
\textbf{Alternative} \\
Using \verb|checkresiduals()|: \\
A convenient alternative for residual analysis is this function from \verb|library(forecast)|. It only works correctly when fitting with \verb|arima()|, though.

\begin{lstlisting}[language=R]
> f.arima <- arima(log(lynx), c(11,0,0))
> checkresiduals(f.arima)

Ljung-Box test

data: Residuals from ARIMA(11,0,0) with non-zero mean
Q* = 4.7344, df = 3, p-value = 0.1923

Model df: 12. Total lags used: 15
\end{lstlisting}
The function carries out a Ljung-Box test to check whether residuals are still correlated. It also provides a graphical output:
\begin{figure}[H]
\centering
\includegraphics[width=.25\textwidth]{checkresiduals.png}
\caption{Example output from above code}
\label{fig:checkresiduals}
\end{figure}

\subsubsection{Diagnostics by simulation}
As a last check before a model is called appropriate, it can be beneficial to simulate from the estimated coefficients and visually compare the resulting series (without any prejudice) to the original one; see the sketch after the list.
\begin{itemize}
\item The simulated series should "look like" the original. If this is not the case, the model failed to capture (some of) the properties of the original data.
\item A larger or more sophisticated model may be necessary in cases where simulation does not recapture the features in the original data.
\end{itemize}
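A minimal sketch, again assuming an $AR(2)$ fitted to the log-lynx data:
\begin{lstlisting}[language=R]
fit <- arima(log(lynx), order=c(2,0,0))
sim <- arima.sim(n=length(lynx), list(ar=coef(fit)[1:2]),
                 sd=sqrt(fit$sigma2)) + coef(fit)["intercept"]
par(mfrow=c(2,1)); plot(log(lynx)); plot(sim)   # compare visually
\end{lstlisting}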
\subsection{Moving average models (MA)}
Whereas for $AR(p)$-models, the current observation of a series is written as a linear combination of its own past, $MA(q)$-models can be seen as an extension of the "pure" process
$$X_t = E_t$$
in the sense that the last $q$ innovation terms $E_{t-1}, E_{t-2},...$ are included, too. We call this a moving average model:
$$X_t = E_t + \beta_1 E_{t-1} + \beta_2 E_{t-2} + \dots + \beta_q E_{t-q}$$
This is a time series process that is stationary, but not i.i.d. In many aspects, $MA(q)$ models are complementary to $AR(p)$.
\subsubsection{Stationarity of MA models}
We first restrict ourselves to the simple $MA(1)$-model:
$$X_t = E_t + \beta_1 E_{t-1}$$
The series $X_t$ is always weakly stationary, no matter what the choice of the parameter $\beta_1$ is.
\subsubsection{ACF/PACF of MA processes}
For the ACF
$$\rho(1) = \frac{\gamma(1)}{\gamma(0)} = \frac{\beta_1}{1+\beta_1^2} \leq 0.5$$
and
$$\rho(k) = 0 \, \forall \, k > 1$$
Thus, we have a «cut-off» situation, i.e. a similar behavior to the one of the PACF in an $AR(1)$ process. This is why and how $AR(1)$ and $MA(1)$ are complementary.
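This cut-off is easy to see in simulation; a minimal sketch for an assumed $MA(1)$ with $\beta_1 = 0.7$:
\begin{lstlisting}[language=R]
set.seed(21)
x <- arima.sim(n=200, model=list(ma=0.7))
acf(x)   # cut-off: only lag 1 clearly non-zero
pacf(x)  # exponential decay instead
\end{lstlisting}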
\subsubsection{Invertibility}
Without additional assumptions, the ACF of an $MA(1)$ does not allow identification of the generating model:
$$X_t = E_t + 0.5 E_{t-1}$$
$$U_t = E_t + 2 E_{t-1}$$
have identical ACF!
$$\rho(1) = \frac{\beta_1}{1+\beta_1^2} = \frac{1/\beta_1}{1+(1/\beta_1)^2}$$
\begin{itemize}
\item An $MA(1)$-, or in general an $MA(q)$-process is said to be invertible if the roots of the characteristic polynomial $\Theta(B)$ exceed one in absolute value.
\item Under this condition, there exists only one $MA(q)$-process for any given ACF. But please note that any $MA(q)$ is stationary, no matter if it is invertible or not.
\item The condition on the characteristic polynomial translates to restrictions on the coefficients. For any $MA(1)$-model, $|\beta_1| < 1$ is required.
\item R function \verb|polyroot()| can be used for finding the roots; see the sketch below.
\end{itemize}
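A minimal sketch of the root check for the two example processes from above:
\begin{lstlisting}[language=R]
abs(polyroot(c(1, 0.5)))  # 2.0 > 1: X_t is invertible
abs(polyroot(c(1, 2)))    # 0.5 < 1: U_t is not invertible
\end{lstlisting}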
\textbf{Practical importance:} \\
The condition of invertibility is not only a technical issue, but has important practical meaning. All invertible $MA(q)$ processes can be expressed in terms of an $AR(\infty)$, e.g. for an $MA(1)$:
\begin{align*}
X_t &= E_t + \beta_1 E_{t-1} \\
&= E_t + \beta_1(X_{t-1} - \beta_1 E_{t-2}) \\
&= \dots \\
&= E_t + \beta_1 X_{t-1} - \beta_1^2 X_{t-2} + \beta_1^3 X_{t-3} - \dots \\
&= E_t + \sum_{i=1}^\infty \psi_i X_{t-i}
\end{align*}
\subsection{Fitting MA(q)-models to data}
As with $AR(p)$ models, there are three main steps:
\begin{enumerate}
\item Model identification
\begin{itemize}
\item Is the series stationary?
\item Do the properties of ACF/PACF match?
\item Derive order $q$ from the cut-off in the ACF
\end{itemize}
\item Parameter estimation
\begin{itemize}
\item How to determine estimates for $m, \beta_1,\dots, \beta_q, \sigma_E^2$?
\item Conditional Sum of Squares or MLE
\end{itemize}
\item Model diagnostics
\begin{itemize}
\item With the same tools/techniques as for $AR(p)$ models
\end{itemize}
\end{enumerate}
\subsubsection{Parameter estimation}
The simplest idea is to exploit the relation between model parameters and autocorrelation coefficients («Yule-Walker») after the global mean $m$ has been estimated and subtracted. \\
In contrast to the Yule-Walker method for $AR(p)$ models, this yields an inefficient estimator that generally produces poor results and hence should not be used in practice.

\vspace{.2cm}
It is better to use the \textbf{Conditional sum of squares}: \\
This is based on the fundamental idea of expressing $\sum E_t^2$ in terms of $X_1,..., X_n$ and $\beta_1,\dots, \beta_q$, as the innovations themselves are unobservable. This is possible for any invertible $MA(q)$, e.g. the $MA(1)$:
$$E_t = X_t - \beta_1 X_{t-1} + \beta_1^2 X_{t-2} - \dots + (-\beta_1)^{t-1} X_1 + (-\beta_1)^t E_0$$
Conditional on the assumption $E_0 = 0$, it is possible to rewrite $\sum E_t^2$ for any $MA(1)$ using $X_1,\dots, X_n$ and $\beta_1$. \\
\vspace{.2cm}
Numerical optimization is required for finding the optimal parameter $\beta_1$; it is available in the R function \verb|arima()| with:
\begin{lstlisting}[language=R]
> arima(..., order=c(...), method="CSS")
\end{lstlisting}
\textbf{Maximum-likelihood estimation}
\begin{lstlisting}[language=R]
> arima(..., order=c(...), method="CSS-ML")
\end{lstlisting}
This is the default method in R, which is based on finding starting values for MLE using the CSS approach. If assuming Gaussian innovations, then
$$X_t = E_t + \beta_1 E_{t-1} + \dots + \beta_q E_{t-q}$$
will follow a Gaussian distribution as well, and we have:
$$X = (X_1, \dots, X_n) \sim N(0,V)$$
Hence it is possible to derive the likelihood function and simultaneously estimate the parameters $m, \beta_1,\dots,\beta_q, \sigma_E^2$.

\subsubsection{Residual analysis}
See \ref{residual-analysis}
\subsection{ARMA(p,q)-models}
An $ARMA(p,q)$ model combines $AR(p)$ and $MA(q)$:
$$X_t = \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p} + E_t + \beta_1 E_{t-1} + \dots + \beta_q E_{t-q}$$
where $E_t$ are i.i.d. innovations (i.e. a white noise process). \\
\vspace{.2cm}
It is easier to write $ARMA(p,q)$'s with the characteristic polynomials: \\
\vspace{.2cm}
$\Phi(B)X_t = \Theta(B)E_t$, where \\
$\Phi(z) = 1 - \alpha_1 z - \dots - \alpha_p z^p$ is the characteristic polynomial of the $AR$-part, and \\
$\Theta(z) = 1 + \beta_1 z + \dots + \beta_q z^q$ is the characteristic polynomial of the $MA$-part
\subsubsection{Properties of ARMA(p,q)-Models}
The stationarity is determined by the $AR(p)$-part of the model: \\
If the roots of the characteristic polynomial $\Phi(B)$ exceed one in absolute value, the process is stationary. \\
\vspace{.2cm}
The invertibility is determined by the $MA(q)$-part of the model: \\
If the roots of the characteristic polynomial $\Theta(B)$ exceed one in absolute value, the process is invertible. \\
\vspace{.2cm}
Any stationary and invertible $ARMA(p,q)$ can either be rewritten in the form of a non-parsimonious $AR(\infty)$ or an $MA(\infty)$. \\
In practice, we mostly consider shifted $ARMA(p,q)$: $Y_t = m + X_t$
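A minimal sketch of both root checks for an assumed $ARMA(1,1)$ with $\alpha_1 = 0.8$ and $\beta_1 = 0.4$:
\begin{lstlisting}[language=R]
abs(polyroot(c(1, -0.8)))  # Phi:   1.25 > 1 -> stationary
abs(polyroot(c(1, 0.4)))   # Theta: 2.5  > 1 -> invertible
\end{lstlisting}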
\begin{table}[H]
\centering
\begin{tabular}{l|l|l}
 & ACF & PACF \\
\hline
$AR(p)$ & exponential decay & cut-off at lag $p$ \\
$MA(q)$ & cut-off at lag $q$ & exponential decay \\
$ARMA(p,q)$ & mix decay/cut-off & mix decay/cut-off \\
\end{tabular}
\caption{Comparison of $AR$-, $MA$- and $ARMA$-models}
\end{table}
\begin{itemize}
\item In an $ARMA(p,q)$, depending on the coefficients of the model, either the $AR(p)$ or the $MA(q)$ part can dominate the ACF/PACF characteristics.
\end{itemize}
\subsubsection{Fitting ARMA-models to data}
See $AR$- and $MA$-modelling
\subsubsection{Identification of order (p,q)}
May be more difficult in reality than in theory:
\begin{itemize}
\item We only have one single realization of the time series with finite length. The ACF/PACF plots are not «facts», but estimates with uncertainty. The superimposed cut-offs may be difficult to identify from the ACF/PACF plots.
\item $ARMA(p,q)$ models are parsimonious, but can usually be replaced by high-order pure $AR(p)$ or $MA(q)$ models. This is not a good idea in practice, however!
\item In many cases, an AIC grid search over all $ARMA(p,q)$ with $p+q < 5$ may help to identify promising models; a sketch of such a search follows the list.
\end{itemize}
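A minimal sketch of such a grid search, assumed here on the log-lynx data:
\begin{lstlisting}[language=R]
x <- log(lynx)
for (p in 0:4) for (q in 0:4) {
  if (p + q == 0 || p + q >= 5) next
  fit <- try(arima(x, order=c(p,0,q)), silent=TRUE)
  if (!inherits(fit, "try-error")) cat(p, q, AIC(fit), "\n")
}
\end{lstlisting}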
\scriptsize
\section*{Copyright}
@@ -872,8 +532,8 @@ Jannis Portmann, FS21
 \section*{References}
 \begin{enumerate}
-\item ATSA\_Script\_v210219.docx, M. Dettling
+\item ATSA\_Script\_v219219.docx, M. Dettling
-\item ATSA\_Slides\_v210219.pptx, M. Dettling
+\item ATSA\_Slides\_v219219.pptx, M. Dettling
 \end{enumerate}

 \section*{Image sources}