---
title: 'Practicals: GAMs and INLA for extremes'
author: "Thomas Opitz (BioSP, INRAE, Avignon, France)"
date: '`r format(Sys.time(), "%d %B %Y")`'
subtitle: (Version 1)
output:
  html_document:
    number_sections: yes
    toc: yes
    toc_depth: 2
  pdf_document:
    toc: yes
    toc_depth: '2'
    fig_caption: yes
always_allow_html: yes
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, fig.align = "center", fig.width = 6)
setwd("~/research/talks/2021-12_IBS-Germany_Extremes_and_INLA/practicals/")
```

# Introduction

This tutorial focuses on extreme-value modeling using GAMs and the Integrated Nested Laplace Approximation (INLA), the latter as implemented in the INLA package, usually referred to as "R-INLA".

We first load a few libraries that will be necessary. If R-INLA is not yet installed, please uncomment the corresponding line and run the relevant `install.packages()` command.

```{r Load libraries, echo=TRUE, tidy=TRUE, warning=FALSE, message=FALSE}
#install.packages("INLA", repos=c(getOption("repos"), INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE)
library(evgam)     # frequentist GAMs, and Colorado precipitation dataset
library(extRemes)  # for univariate extremes
library(fields)    # for plotting (image.plot)
# library(geostatsp) # library for geostatistical modeling making use of INLA
library(grDevices) # for color palettes
library(INLA)      # implementation of INLA
# library(inlabru)  # high-level interface to R-INLA with additional modeling tools
library(maps)      # mapping tools
library(sp)        # spatial tools
library(viridis)   # color palettes
```

# Colorado precipitation data

The evgam package provides a data frame with observations of daily precipitation amounts at 64 gauge stations in the state of Colorado, US, during the months of April to October (type "?COprcp" for more information about this dataset).

```{r Load the data, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
data(COprcp)
dim(COprcp)
head(COprcp)
str(COprcp)
head(COprcp_meta)
n_sites = nrow(COprcp_meta)
coor = cbind(COprcp_meta$lon, COprcp_meta$lat)
dates = as.POSIXlt(COprcp$date)
dates_unique = sort(unique(dates))
# beginning of observation period:
head(dates_unique)
# end of observation period:
tail(dates_unique)
n_days = length(unique(dates))
years = 1900 + dates$year
range(years)
months = 1 + dates$mon # month (1-12)
ydays = 1 + dates$yday # day within year (1-366)
obs = COprcp$prcp
id_coor = COprcp$meta_row
obs_mat = matrix(NA, nrow = n_days, ncol = n_sites)
# fill the matrix:
for(i in 1:n_sites){
  rows_i = which(id_coor == i)
  obs_mat[match(dates[rows_i], dates_unique), i] = obs[rows_i]
}
# Are there any missing values?
mean(is.na(obs_mat)) # yes, around 1.6% of all values are missing.
# Show the time series for two locations:
par(mfrow = c(1,2))
plot(dates_unique, obs_mat[,1], type = "h", ylim = c(0, max(obs_mat[,c(1,20)], na.rm = TRUE)), xlab = "Year", ylab = "Daily precipitation")
plot(dates_unique, obs_mat[,20], type = "h", ylim = c(0, max(obs_mat[,c(1,20)], na.rm = TRUE)), xlab = "Year", ylab = "Daily precipitation")
# export as pdf:
#dev.copy(pdf, file = "precipitation-series.pdf", width = 12)
#dev.off()
```

For potential later use, we transform longitude-latitude to a coordinate system that has been optimized for the United States and corresponds to metric distances for the study region. This is recommended for accurate modeling of the spatial correlation, since one unit of latitude does not correspond to the same metric distance as one unit of longitude.
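As a rough sanity check of this claim (a back-of-the-envelope calculation assuming a spherical Earth of radius 6371 km, and not part of the modeling workflow), we can compare the length of one degree of latitude with the length of one degree of longitude at the average latitude of the stations:

```{r Distance per degree check, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
R_earth = 6371                              # approximate Earth radius in km
lat0 = mean(COprcp_meta$lat) * pi/180       # average station latitude, in radians
km_per_deg_lat = R_earth * pi/180           # length of one degree of latitude (km)
km_per_deg_lon = km_per_deg_lat * cos(lat0) # length of one degree of longitude (km)
c(latitude = km_per_deg_lat, longitude = km_per_deg_lon)
```

At Colorado's latitude, one degree of longitude is noticeably shorter than one degree of latitude, which is why a metric projection is preferable for distance-based spatial models.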
We use EPSG codes for the projections (EPSG: European Petroleum Survey Group):

* 4326 for longitude-latitude;
* 2163 for US National Atlas Equal Area.

For the spatial distance unit, we use 1 km. Note that some implementations may show stability problems with very large coordinate values (e.g., values expressed in meters).

```{r Map the data, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
coor_sp = SpatialPoints(coords = coor, proj4string = CRS("+init=epsg:4326"))
coor_metric = spTransform(coor_sp, CRS("+init=epsg:2163"))@coords/10^3
# We prepare the boundary coordinates of the state of Colorado:
mymap = map("state", regions = "colorado", plot = FALSE)
# Transform coordinates (those that are not "NA") to EPSG 2163 (in km unit) using spTransform:
xy_region = SpatialPoints(coords = na.omit(cbind(mymap$x, mymap$y)), proj4string = CRS("+init=epsg:4326"))
xy_region = spTransform(xy_region, CRS("+init=epsg:2163"))@coords # in m unit
xy_region = xy_region/10^3 # in km unit
mymap$x[!is.na(mymap$x)] = xy_region[,1]
mymap$y[!is.na(mymap$y)] = xy_region[,2]
# We plot observation sites and the Colorado boundary:
plot(coor_metric, pch = 19, cex = .5, asp = 1, xlab = "x (km)", ylab = "y (km)")
lines(mymap$x, mymap$y, lwd = 2)
```

# Part I: Univariate extreme-value theory

## Annual maxima

For modeling annual maxima, we generate a data matrix containing only the annual maxima, where rows correspond to years and columns correspond to observation locations. We use the aggregate() function to extract annual maxima (removing missing data within each year). The result of aggregate() is a data frame whose first column indicates the year and whose remaining columns report the annual maxima. We transform this data frame into a numerical matrix containing only the maxima values. For illustration, we plot the series of annual maxima for two observation locations.

```{r Extract annual maxima, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
maxima = aggregate(obs_mat, by = list(year = 1900 + dates_unique$year), FUN = max, na.rm = TRUE)
# first column of maxima contains the year
maxima = as.matrix(maxima[,-1])
maxima[maxima == -Inf] = NA
par(mfrow = c(1,2))
plot(unique(1900 + dates_unique$year), maxima[,1], type = "h", ylim = c(0, max(maxima[,c(1,20)], na.rm = TRUE)), xlab = "Year", ylab = "Precipitation maximum")
plot(unique(1900 + dates_unique$year), maxima[,20], type = "h", ylim = c(0, max(maxima[,c(1,20)], na.rm = TRUE)), xlab = "Year", ylab = "Precipitation maximum")
```

For a (very) simple GEVD model, we could estimate GEVD parameters for the entire sample of maxima. For this purpose, we here use the extRemes package. Several other packages implement maximum likelihood estimation of the GEV and GPD (e.g., the packages evd and fExtremes).

```{r Estimate global GEVD, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
max_sample = as.numeric(maxima)
max_sample = na.omit(max_sample)
fit = fevd(max_sample, type = "GEV", method = "MLE", period.basis = "year", use.phi = TRUE)
summary(fit)
# plot(fit) # visualizations to check goodness-of-fit
```

We obtain a slightly positive value of the shape parameter (xi), which is typical for precipitation data.
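For illustration (an optional add-on, treating the pooled station-year maxima as a single homogeneous sample and thus ignoring spatial variation), we can compute the return levels implied by this global fit with the return.level function of extRemes; the period basis is "year", as set in the fevd call above.

```{r Global GEV return levels, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
# Return levels (in years) implied by the pooled GEV fit:
return.level(fit, return.period = c(10, 50, 100))
```

These values are only indicative, since a single GEV fitted to all stations ignores the spatial variation that we model explicitly in Part II.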
## Threshold exceedances

For a (very) simple peaks-over-threshold model of the precipitation data, we could choose a global threshold as a high quantile of the positive precipitation intensities, for instance at the 95% level. After fixing the threshold, we can explore the resulting exceedances (how many?, histogram, etc.).

```{r Exceedances over global threshold, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
prob = 0.95
thresh = quantile(obs_mat[obs_mat > 0], prob, na.rm = TRUE)
sum(as.numeric(obs_mat > thresh), na.rm = TRUE)
hist((obs_mat - thresh)[obs_mat > thresh], xlab = "Excesses", main = "Histogram of excesses")
```

Next, we can estimate a global GPD model (after first removing the missing data).

```{r Estimate global GPD, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
fit = fevd(x = obs_mat[!is.na(obs_mat)], threshold = thresh, type = "GP", method = "MLE", use.phi = TRUE)
summary(fit)
# plot(fit) # visualizations to check goodness-of-fit
```

Again we obtain a slightly positive value of the shape parameter (xi), which differs somewhat from the GEVD estimate.

Remark: Various tools exist to help with selecting an appropriate threshold above which the peaks-over-threshold model becomes stable (but the choice is rarely obvious). To find an appropriate threshold (or to validate our threshold from above), we can study a parameter stability plot for a range of possible thresholds.

```{r Parameter stability plot, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
threshrange.plot(obs_mat[!is.na(obs_mat)], type = "GP", r = c(10, 80), nint = 8)
```

Here, the estimates look relatively stable across thresholds when taking into account the estimation uncertainty given by the error bars. In particular, estimates for lower thresholds usually lie within the error bars of estimates at higher thresholds.

# Part II: Generalized additive models for extremes

We now explore the GAMs for extremes as implemented in evgam.

## A GAM with GEV response depending on elevation and spatial location

We continue working with the Colorado precipitation data from the evgam package. The package also provides data for the auxiliary variable of elevation.

```{r Colorado elevation plot, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
par(lwd = 2, cex.axis = 1.5, cex.lab = 1.5)
brks = pretty(COelev$z, 50)
cols = viridis(length(brks) - 1)
image(COelev$x, COelev$y, COelev$z, breaks = brks, col = cols, asp = 1, xlab = "Longitude", ylab = "Latitude")
col_points = cols[as.integer(cut(COprcp_meta$elev, brks))]
points(COprcp_meta$lon, COprcp_meta$lat, pch = 4)
#dev.copy(pdf, file = "map-colorado.pdf", height = 12)
#dev.off()
```

Could extreme-value behavior vary with elevation? For a simple exploratory check, let's plot averages of location-wise annual maxima against elevation.

```{r Average annual maxima vs. Elevation, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
plot(COprcp_meta$elev, apply(maxima, 2, mean, na.rm = TRUE), xlab = "Elevation", ylab = "Average of maxima", pch = 19)
```

There could be a slight decreasing trend. We here explore a GEV model developed by Youngman (2020). It includes two spline terms in the location parameter $\mu$ (one for elevation, one for spatial location) and one spline term in the scale parameter (for spatial location). The shape parameter is kept constant (but is estimated).

For evgam, we have to provide all data (response and predictors) in a data frame.

```{r Prepare data for evgam, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
COprcp$year = format(COprcp$date, "%Y")
COprcp_gev = aggregate(prcp ~ year + meta_row, COprcp, max)
COprcp_gev = cbind(COprcp_gev, COprcp_meta[COprcp_gev$meta_row,])
```

We can now specify the structure of the linear predictors for the three GEV parameters (location, scale, shape).
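In formula terms, the structure just described can be summarized as follows (a schematic summary, with a log link for the scale as used by evgam; the $f$'s denote penalized spline functions):

$$
\mu(s) = \beta_\mu + f_{\mu,1}(\mathrm{lon}, \mathrm{lat}) + f_{\mu,2}(\mathrm{elev}), \qquad
\log \sigma(s) = \beta_\sigma + f_{\sigma}(\mathrm{lon}, \mathrm{lat}), \qquad
\xi(s) \equiv \xi.
$$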
We use relatively large numbers of basis functions (k = 30 and k = 20), so that the penalty mechanism can automatically determine the degree of smoothness/wiggliness required for a good fit of the model. The default basis for two-dimensional splines (here in lon and lat) are so-called thin plate regression splines; alternatively, we could use tensor product splines with "te(...)" instead of "s(...)". For the one-dimensional spline in elevation, we do not use the default but cubic regression splines (bs = "cr"). See the help pages for more details.

```{r Specify the GEV formula for evgam, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
# Uncomment the following two lines to consult the help pages for available spline types
# ?mgcv::s
# ?mgcv::smooth.terms
form_gev = list(prcp ~ s(lon, lat, k = 30) + s(elev, bs = "cr"), ~ s(lon, lat, k = 20), ~ 1)
```

Now we can estimate this GEV model and print a summary of the fitted object. The summary shows the estimated linear coefficients, and p-values are provided to assess whether the nonlinear spline terms improve the goodness-of-fit of the model.

```{r Estimate the GEV with evgam, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
fit_gev = evgam(form_gev, COprcp_gev, family = "gev")
summary(fit_gev)
```

Here, the inclusion of the spline terms seems to be beneficial since all p-values are very small. To explore the results, we can plot the estimated spline functions. The default outputs (with confidence information) are contour plots for two-dimensional splines, and estimated curves for one-dimensional splines.

```{r Estimated GEV splines, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
plot(fit_gev)
```

We can also plot spatial maps of estimated parameters (and of other quantities, such as return levels). For this, the evgam package provides the elevation values over a rectangular grid of longitude and latitude values. We can use the predict function, in which we specify that we want to predict the "response" (i.e., the estimates for location, scale and shape obtained after back-transforming the linear predictor with the link function).

```{r Map estimated GEV parameters, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
COprcp_plot = expand.grid(lon = COelev$x, lat = COelev$y)
COprcp_plot$elev = as.vector(COelev$z)
pred_gev = predict(fit_gev, COprcp_plot, type = "response")
data2plot = COelev
data2plot$z[] = pred_gev$location
par(mfrow = c(1,2)) # 1-row, 2-column display
image(data2plot, asp = 1, col = viridis(16))
title("GEV location")
data2plot$z[] = pred_gev$scale
image(data2plot, asp = 1, col = viridis(16))
title("GEV scale")
```

## Exercise

Implement a GAM with GPD response for threshold exceedances. Note: you can follow the code examples in https://arxiv.org/abs/2003.04067. A sketch of one possible starting point is given below.
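The following chunk is only a sketch of one possible starting point (it is not evaluated here and is not a full solution): it reuses the constant 95% threshold `thresh` computed in Part I, attaches the station covariates, and fits a GPD with spline terms in the log-scale parameter and a constant shape. The object names (`COprcp_gpd`, `form_gpd`, `fit_gpd_gam`) are introduced here for illustration only.

```{r GPD GAM starting point, echo=TRUE, eval=FALSE, warning=FALSE, message=FALSE, tidy=TRUE}
# Keep the positive excesses above the constant threshold from Part I and attach covariates:
COprcp_gpd = COprcp[COprcp$prcp > thresh, ]
COprcp_gpd$excess = COprcp_gpd$prcp - thresh
COprcp_gpd = cbind(COprcp_gpd, COprcp_meta[COprcp_gpd$meta_row, ])
# Two linear predictors for the GPD: log-scale (with spline terms) and constant shape:
form_gpd = list(excess ~ s(lon, lat, k = 30) + s(elev, bs = "cr"), ~ 1)
fit_gpd_gam = evgam(form_gpd, COprcp_gpd, family = "gpd")
summary(fit_gpd_gam)
plot(fit_gpd_gam)
```

A covariate-dependent threshold, as discussed in the reference above, would be a natural refinement of this constant-threshold setup.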
# Part III: Integrated Nested Laplace Approximation

We implement two relatively simple extreme-value models in INLA to illustrate how the INLA implementation works.

## A GEV model that is nonstationary in elevation

We here estimate a nonlinear elevation effect for the Colorado precipitation data by using a step function with a Gaussian first-order random walk prior. We start by creating a data frame with rows for the response variable and the covariates. For the random walk, we use 16 equidistant intervals based on the elevation values in the study area. For the intercept beta_0, we define a covariate "intercept". If you want to read how the GEV distribution is parametrized in INLA, you can call inla.doc("gev") in the console (which will open a pdf file containing the documentation).

```{r Prepare INLA data (GEV), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
names(COprcp_gev)
data_df = COprcp_gev
data_df$intercept = 1
nbreak = 16
breaks = min(COelev$z) + (max(COelev$z) - min(COelev$z)) * (0:nbreak)/nbreak
breaks[1] = breaks[1] - 1 # avoid observations on the lower boundary
breaks[nbreak + 1] = breaks[nbreak + 1] + 1 # avoid observations on the upper boundary
data_df$elev_group = as.integer(cut(COprcp_gev$elev, breaks))
table(data_df$elev_group)
```

We will use a Penalized Complexity (PC) prior for the precision (i.e., 1/variance) hyperparameter of the random walk. Here we specify that the prior probability of having a standard deviation larger than 1 is 0.5. We then specify the regression formula of the INLA model, run the estimation, and print a summary of the fitted object.

Note: the "-1" in the INLA formula removes the internal "default" intercept, which we replace by our user-defined intercept variable.

```{r Define INLA model and run INLA (GEV), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
# inla.doc("prec")
hyper_prec = list(theta = list(prior = "pc.prec", param = c(1, 0.5)))
my_formula = prcp ~ -1 + intercept + f(elev_group, model = "rw1", hyper = hyper_prec)
fit = inla(my_formula, data = data_df, family = "gev", verbose = FALSE)
summary(fit)
```

We can now plot the estimated elevation effect along with 95% credible intervals. To do so, we first define an auxiliary function for plotting the random effect coefficients (betas) estimated with INLA, and then we apply it.

```{r Plot elevation effect (GEV), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
plot_random_effect = function(effect){
  ylim = range(effect$mean, effect$`0.025quant`, effect$`0.975quant`)
  plot(effect$ID, effect$mean, type = "l", lwd = 2, ylim = ylim, xlab = "ID", ylab = "Coefficient")
  lines(effect$ID, effect$`0.025quant`, lwd = 2, col = "blue", lty = 2)
  lines(effect$ID, effect$`0.975quant`, lwd = 2, col = "blue", lty = 2)
}
plot_random_effect(fit$summary.random$elev_group)
```

We find a nonlinear influence with two modes, similar to what we estimated in the GAM model. Next, we provide a spatial visualization with a map of the elevation effect at the observed locations.

```{r Map elevation effect for observed locations (GEV), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
head(fit$summary.random$elev_group)
est_elev_scaled = scale(fit$summary.random$elev_group$mean)/4
size_points = exp(est_elev_scaled)
plot(coor_metric, pch = 19, cex = size_points, asp = 1, xlab = "x (km)", ylab = "y (km)")
lines(mymap$x, mymap$y, lwd = 2)
```

Based on the gridded elevation data, we can also map the elevation effect on the grid. Note that we do not have observations for locations at very high altitude. Therefore, we extrapolate the step function for high elevations by carrying forward the coefficient of the highest elevation class that contains observed locations.

```{r Map elevation effect (GEV), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
data2plot = COelev
beta_elev = fit$summary.random$elev_group$mean
beta_elev[13:16] = beta_elev[12]
data2plot$z[] = beta_elev[cut(COelev$z, breaks)]
image(data2plot, asp = 1, col = viridis(15))
```

Due to the step function approach (and the absence of a smooth longitude-latitude effect), the map here looks less smooth than the GAM estimate from evgam.
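As an additional diagnostic (not part of the original workflow), we can inspect the posterior marginals of the hyperparameters of this fit, i.e., the GEV shape/tail parameter and the random-walk precision. The hyperparameter names, and whether the shape is reported on an internal scale, depend on the INLA version; see inla.doc("gev").

```{r Posterior hyperparameter marginals (GEV), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
# Loop over whatever hyperparameters are available and plot their posterior marginals:
names(fit$marginals.hyperpar)
par(mfrow = c(1, length(fit$marginals.hyperpar)))
for(nm in names(fit$marginals.hyperpar)){
  marg = fit$marginals.hyperpar[[nm]]
  plot(marg[,1], marg[,2], type = "l", xlab = nm, ylab = "Posterior density")
}
```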
## A time series model for threshold exceedances of temperatures

Next, we illustrate a latent AR1 time series model for threshold exceedances. Since there is only very weak dependence among threshold exceedances of precipitation data, we here instead consider a daily time series of temperature maxima at Fort Collins (Colorado), and we focus on the hot summer months of July and August.

```{r Load the time series, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
data(FCtmax)
obs_ts = FCtmax$tmax
any(is.na(obs_ts))
plot(obs_ts, type = "h")
months_ts = 1 + as.POSIXlt(FCtmax$date)$mon
year_ts = 1900 + as.POSIXlt(FCtmax$date)$year
idx2keep = which(months_ts %in% 7:8)
obs_ts = obs_ts[idx2keep]
months_ts = months_ts[idx2keep]
year_ts = year_ts[idx2keep]
table(months_ts)
n_ts = length(obs_ts)
plot(obs_ts, type = "h")
```

Now we fix a threshold and generate exceedance indicators and positive excesses. For the threshold, we use the 90% quantile.

```{r Extract exceedances, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
prob = 0.90
thresh = quantile(obs_ts, prob, na.rm = TRUE)
thresh
excess = pmax(obs_ts - thresh, 0)
mean(excess[excess > 0]) # average excess
is_exceed = as.numeric(obs_ts > thresh)
sum(is_exceed)
```

Our estimation procedure here consists of two steps. We construct and fit two separate models: the first for the probability of exceeding the threshold, the second for the size of the positive excess above the threshold.

Note: we could also fit the two models simultaneously with INLA, using two likelihood types (binomial and GP) in the same INLA model, with random effects that are shared or correlated between the two submodels. For simplicity, however, we here fit the models for the exceedance probability and for the size of the excess separately.

We generate a data frame with the response variable (called "y" generically) and the covariates to be included in the model. To check for seasonal effects, we include the month as a covariate. We also generate a daily index for the AR1 model. For simplicity, we here treat the transition between years (August of year a-1 to July of year a) as "continuous" (alternatively, we would have to work with a "replicated" AR1 effect, with a separate AR1 series for each year).

```{r Create INLA data objects, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
data_df = data.frame(y = is_exceed, intercept = 1, month = months_ts, year = year_ts, idx = 1:n_ts)
```

### Step 1: Modeling the exceedance probability

To learn about the formulation of the AR1 model, we can consult the documentation. The following command (if you remove the comment sign "#") opens a pdf file (or several pdf files if there are several related models):

```{r INLA documentation, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
# inla.doc("ar1")
```

We will use Penalized Complexity priors for the hyperparameters (standard deviation and autocorrelation). In the specification below, the prior probability that the standard deviation exceeds 1 is 0.5, and the prior probability that the autocorrelation exceeds 0.75 is 0.5. See the corresponding prior documentation (e.g., inla.doc("pccor1")) for details.

```{r PC priors, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
hyper.prec = list(prior = "pc.prec", param = c(1, .5), initial = 1)
# inla.doc("pccor1")
hyper.cor = list(prior = "pccor1", param = c(.75, .5), initial = log(0.75 / (1 - 0.75)))
```

We write the model formula with the f(...) term for the AR1 random effect.

```{r INLA formula (AR1), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
form = y ~ -1 + intercept + f(idx, model = "ar1", hyper = list(theta1 = hyper.prec, theta2 = hyper.cor))
```

We here have Bernoulli variables (i.e., binary variables corresponding to a binomial distribution with 1 trial), so we will set "Ntrials = rep(1, n_ts)" in the inla() command.
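In summary, the model specified above for the exceedance indicators can be sketched as follows (with the logit link, INLA's default for the binomial family, and with $\sigma^2$ and $\rho$ the AR1 hyperparameters that carry the PC priors given above):

$$
Y_t \mid p_t \sim \mathrm{Bernoulli}(p_t), \qquad \mathrm{logit}(p_t) = \beta_0 + W_t, \qquad W_t = \rho\, W_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}\big(0, (1-\rho^2)\,\sigma^2\big),
$$

so that the stationary variance of the latent AR1 series $W_t$ is $\sigma^2$.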
Remark (missing response data): If there were missing response values (NA), we could keep them during the estimation. INLA automatically calculates the predictions (= fitted values) for the missing responses. By default, an identity link function is used for missing values (which is not what we want here). By specifying "link = 1" in the control.predictor argument, we use the correct link function (i.e., the link function of the first likelihood family; here, we have only one likelihood family, "binomial").

Running the following estimation with INLA takes one to several minutes:

```{r Run INLA (AR1), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
fit_p = inla(form,
             data = data_df,
             family = "binomial",
             Ntrials = rep(1, n_ts),
             control.predictor = list(compute = TRUE, link = 1),
             control.fixed = list(prec = 1),
             #control.inla = list(strategy = "simplified.laplace", int.strategy = "ccd"),
             verbose = FALSE,
             num.threads = 2 # number of CPU threads that INLA can run in parallel
             )
summary(fit_p)
```

We estimate a posterior mean of the precision parameter of approximately 1/4, i.e., the variance of the latent Gaussian AR1 series is approximately 4. Moreover, we obtain a correlation coefficient Rho of 0.89 for the series, which indicates quite strong temporal correlation.

We first plot the estimated latent AR1 effect. Then, we plot the data for a single year (2019) along with the fitted exceedance probabilities.

```{r Plot AR1 results, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
plot_random_effect(fit_p$summary.random$idx)
id_year = which(data_df$year == 2019)
plot((1:n_ts)[id_year], data_df$y[id_year], pch = 4, xlab = "Day", ylab = "Exceedance probability")
lines((1:n_ts)[id_year], fit_p$summary.fitted.values$mean[id_year], col = "blue", lwd = 2)
```

### Step 2: Modeling the size of the excesses

For fitting the generalized Pareto distribution, we keep only the observations with a positive excess. Then, we write the INLA formula as before (the same formula here).

```{r Prepare POT data for INLA, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
idx2keep = which(!is.na(excess) & excess > 0)
n.exc = length(idx2keep)
n.exc
data_df = data.frame(y = excess[idx2keep], intercept = 1, month = months_ts[idx2keep], idx = idx2keep)
#hyper.prec = list(prior = "pc.prec", param=c(1,.5), initial=1)
form = y ~ -1 + intercept + f(idx, model = "ar1", hyper = list(theta1 = hyper.prec, theta2 = hyper.cor))
```

For the GPD in INLA, the log link function is designed to model a certain (fixed) quantile level of the generalized Pareto distribution. We here use the median, and we will specify this value in the control.family argument of inla(...).
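In symbols (a sketch of this parametrization; consult inla.doc("gp") for the exact definition used by your INLA version): if the excess follows a generalized Pareto distribution with scale $\sigma$ and shape $\xi$, its $\alpha$-quantile is

$$
q_\alpha = \frac{\sigma}{\xi}\left\{(1-\alpha)^{-\xi} - 1\right\},
$$

and the linear predictor $\eta$ is linked to this quantile via $q_\alpha = \exp(\eta)$, with $\alpha = 0.5$ (the median) in our setting.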
You can consult inla.doc("loggamma") to see how the approximate PC prior for the shape parameter, corresponding to an exponential distribution (i.e., a gamma distribution with shape 1), can be implemented. Since temperatures are usually not heavy-tailed (i.e., xi is not strongly positive), we here set the penalization rate to a relatively high value of 10. Running the estimation with INLA then takes several seconds.

```{r Run INLA for exceedances, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
q_gpd = 0.5
hyper.shape = list(prior = "loggamma", param = c(1, 10), initial = log(.05))
fit_gp = inla(form,
              data = data_df,
              family = "gp",
              control.family = list(list(control.link = list(quantile = q_gpd), hyper = list(theta = hyper.shape))),
              control.compute = list(waic = TRUE),
              verbose = FALSE,
              num.threads = 2
              )
summary(fit_gp)
```

Due to the high precision of almost 1000 (i.e., a standard deviation of approximately 1/sqrt(1000)) estimated for the AR1 series, the AR1 random effect is not important for the estimated model. We refit the model without this effect and compare the WAIC values.

```{r Run INLA for exceedances (no AR1), echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
form = y ~ -1 + intercept
fit_gpd = inla(form,
               data = data_df,
               family = "gp",
               control.family = list(list(control.link = list(quantile = q_gpd), hyper = list(theta = hyper.shape))),
               control.compute = list(waic = TRUE),
               #control.inla = list(strategy = "simplified.laplace", int.strategy = "ccd"),
               verbose = FALSE,
               num.threads = 2
               )
summary(fit_gpd)
```

The WAIC value is essentially the same (even slightly lower and therefore better), so we could decide to keep the simpler model here.

# Part IV: Spatial modeling with INLA-SPDE

We consider a relatively simple model for a spatial trend. We use a Gaussian SPDE field as prior for the trend surface. We first extract the exceedances above the global 95% quantile (of positive intensities) from the Colorado precipitation data.

```{r Exceedances for INLA model, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
prob = 0.95
thresh = quantile(COprcp$prcp[COprcp$prcp > 0], prob, na.rm = TRUE)
sum(as.numeric(COprcp$prcp > thresh), na.rm = TRUE)
CO_exc = COprcp[COprcp$prcp > thresh,]
CO_exc$excess = CO_exc$prcp - thresh
```

We have to create a mesh (i.e., a triangulation of space). The Gaussian variables are defined at the mesh nodes and are interpolated linearly between the nodes. We here use a spatial distance unit of 1 km. We also define a vector of indices for the nodes of the mesh.

```{r Create mesh, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
bound = inla.nonconvex.hull(coor_metric, convex = -.05)
bound2 = inla.nonconvex.hull(coor_metric, convex = -.25)
plot(rbind(bound2$loc, bound2$loc[1,]), type = "l", asp = 1, xlab = "x (km)", ylab = "y (km)")
lines(rbind(bound$loc, bound$loc[1,]))
points(coor_metric, pch = 19, cex = .5)
# Define minimum edge length as 10km and maximum edge length as 25km
# (within the study region) and 100km (in the extension zone).
mesh = inla.mesh.2d(loc = coor_metric, boundary = list(bound, bound2), cutoff = 10, max.edge = c(25, 100), min.angle = c(21, 21))
cat("Number of mesh nodes:", mesh$n, "\n")
plot(mesh, asp = 1) # looks OK
idx.spatial = inla.spde.make.index("spatial", n.spde = mesh$n)
```

We define an SPDE object with PC priors. The distribution of a PC prior is expressed through the probability of deviating from a baseline, which is a less complex model. For the variance parameter, the baseline corresponds to $\sigma^2 = 0$. For the spatial range parameter, the baseline has an infinite range, since this would correspond to perfect dependence (and hence less variability). Below, we set the probability of having a range $<$ 10 km to 5%, and that of having a standard deviation larger than 10 to 50%. See the documentation on PC priors for more details.
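In symbols, the arguments `prior.range = c(range0, p_r)` and `prior.sigma = c(sigma0, p_s)` of inla.spde2.pcmatern encode

$$
\Pr(\text{range} < \text{range}_0) = p_r, \qquad \Pr(\sigma > \sigma_0) = p_s,
$$

so the choices in the chunk below correspond to $\Pr(\text{range} < 10\ \text{km}) = 0.05$ and $\Pr(\sigma > 10) = 0.5$.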
```{r Set prior SPDE model, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
myspde = inla.spde2.pcmatern(mesh = mesh, alpha = 2, prior.range = c(10, .05), prior.sigma = c(10, 0.5))
```

Next, we need an observation matrix to make the connection between mesh nodes and observation sites.

```{r Build SPDE model, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
A = inla.spde.make.A(mesh, loc = coor_metric, index = CO_exc$meta_row)
```

Next, we create a data frame with the covariates used as linear effects, here only the intercept term 1, since in `R-INLA` it is more convenient to include the intercept explicitly.

```{r Collect data and build formula, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
covar.df = data.frame(intercept = rep(1, nrow(CO_exc)))
mystack = inla.stack(data = list(excess = CO_exc$excess), A = list(A, 1), effects = list(idx.spatial, covar.df))
form = excess ~ -1 + intercept + f(spatial, model = myspde)
```

Note: since we handle the intercept explicitly, we have to include "-1" in the formula. Recall that we have given the name "spatial" to the index of the spatial effect.

Let's run the `inla` function. To keep `inla` from using too much memory and CPU, we fix the maximum number of parallel threads to 2. The following INLA estimation takes approximately 1 minute.

```{r Estimating the INLA-SPDE model, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
num.thr = 2 # number of threads that INLA is allowed to run in parallel
q_gpd = 0.5
hyper.shape = list(prior = "loggamma", param = c(1, 10), initial = log(.1))
fit = inla(form,
           data = inla.stack.data(mystack),
           family = "gp",
           control.family = list(list(control.link = list(quantile = q_gpd), hyper = list(theta = hyper.shape))),
           control.predictor = list(A = inla.stack.A(mystack), compute = FALSE),
           verbose = FALSE,
           num.threads = num.thr
           )
summary(fit)
```

The range of the spatial effect is approximately 360 km, and its standard deviation is around 0.4. We can now plot the fitted spatial trend. Since standard plotting functions in `R` work on regular grids, we have to interpolate the estimated trend to such a grid, using the `inla.mesh.project` mechanism.

```{r Plot the spatial trend, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
proj_grid = inla.mesh.projector(mesh, xlim = range(bound$loc[,1]), ylim = range(bound$loc[,2]), dims = c(200, 200))
est_on_mesh = fit$summary.random$spatial$mean
image.plot(proj_grid$x, proj_grid$y, inla.mesh.project(proj_grid, field = est_on_mesh), asp = 1, xlab = "x (km)", ylab = "y (km)", col = viridis(16))
points(coor_metric, pch = 19, cex = .25)
```

The posterior standard deviation (= uncertainty) of the spatial effect can also be mapped.

```{r Plot the sd of the spatial trend, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
sd_on_mesh = fit$summary.random$spatial$sd
image.plot(proj_grid$x, proj_grid$y, inla.mesh.project(proj_grid, field = sd_on_mesh), asp = 1, xlab = "x (km)", ylab = "y (km)", col = viridis(16))
points(coor_metric, pch = 19, cex = .25)
lines(rbind(bound$loc, bound$loc[1,]), lwd = 2)
```

Note: as expected, the uncertainty is usually lower where we have many observation sites.
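Finally, as a rough illustration (a plug-in sketch: we insert the posterior means of the intercept and of the spatial effect into the link, so the result only approximates the posterior median surface), we can map the fitted median excess $\exp(\beta_0 + W(s))$ implied by the quantile parametrization used above:

```{r Map the fitted median excess, echo=TRUE, warning=FALSE, message=FALSE, tidy=TRUE}
# Plug-in approximation based on posterior means of the intercept and the spatial effect:
beta0 = fit$summary.fixed["intercept", "mean"]
median_excess_on_mesh = exp(beta0 + est_on_mesh)
image.plot(proj_grid$x, proj_grid$y, inla.mesh.project(proj_grid, field = median_excess_on_mesh), asp = 1, xlab = "x (km)", ylab = "y (km)", col = viridis(16))
points(coor_metric, pch = 19, cex = .25)
```

Adding the threshold `thresh` to this surface would give a (plug-in) map of the conditional median of daily precipitation given an exceedance.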