--- title: "somspace: Spatial classification with Self-Organizing Maps" author: "Yannis Markonis" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{somspace: Spatial classification with Self-Organizing Maps} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE, eval = F) ``` This is is an introductory presentation of the `somspace` package, as implemented in the study of [Markonis and Strnad [2020]](https://doi.org/10.1177/0959683620913924), to detect spatial drought patterns in Europe. ## Data import In this study the Old World Drought Atlas [[Cook et al., 2015](https://doi.org/10.1126/sciadv.1500561)] was used, as pre-processed by [Markonis et al. [2018]](https://doi.org/10.1038/s41467-018-04207-7). The file can be downloaded [here](https://www.fzp.czu.cz/en/r-9409-science-research/r-9674-leading-research-groups/r-9669-hydrological-and-climate-variability/r-9713-team-news/webapp-for-hydroclimate-reconstruction.html), and is a `data.table` object saved as an `.rds` file. Alternatively, the `owda` data object that is included in the package can be used, which covers only period 1500-2012. ```{r} library(somspace) str(owda) ``` ## Data preparation First of all, we have to transform the raw data from `data.table` to `somin` class. If the original data have different structure/type them it is important to transform them to a 4-column `data.table`, containing time, lat, lon and the variable. ```{r} inp <- sominp(owda) #depending on data set size may take some time ``` The `somin` class is a list with three elements. The first one, `input_for_som`, is a `matrix` that will be used by `kohonen::som` function to generate the SOM. The second one, `coords`, is a `data.table` with the point ids and their coordinates. The last one, `input_dt` is also a `data.table` with the raw data that were used to generate `input_for_som`. ```{r} str(inp) ``` ## SOM generation The generation of SOM is straightforward: ```{r} my_som <- somspa(inp) ``` Which creates a `somsp` object that contains the som, the nodes properties and the original data. ```{r} str(somsp) ``` Since `somspa()` uses the function `som()` from `kohonen` package it can also take `som()` arguments. For instance, to create a 6 x 6 map after 1000 iterations: ```{r} my_som <- somspa(inp, grid = somgrid(6, 6), rlen = 1000) ``` ## SOM analysis The `somsp` objects can be easily plotte by `plot()` or summarized by `summary()` functions. ```{r} plot(my_som) summary(my_som) ``` Additionaly, the average time series of specific SOMs can be plotted by `plot_ts`: ```{r} plot_ts(my_som, 12) plot_ts(my_som, c(1, 12, 21, 39)) plot_ts(my_som, 1:max(my_som$summary$node)) #plots all soms ``` ## Further classification of SOMs in regions To further reduce the number of regions with similar characteristics `somregs()` is applied. It uses the `hclust()` function for Hierarchical cluster analysis and its default arguments can be changed accordingly. Please not that the `n` argument refers to the maximum number of regions and not to a single one. `somregs()` will merge SOMs to similar regions starting from 2 and reachinh `n` and will keep them all in the resulting `regs` object. ```{r} my_regions <- somregs(my_som, 12) my_regions <- somregs(my_som, 17, method = "ward.D2") ``` `regs` objects which are two-element lists. The `regions` element contains the information about the regions and the `input_dt` the time series of the original dataset. Similarly to `somsp` objects, `regs` can be plot as maps or time series. ```{r} plot(my_regions, c(2, 5, 9, 13), 2, 2) plot_ts(my_regions, 2) ``` ## Complex network analysis Finally, a specific classification of certain number of regions can be analysed through canonical networks and ploted as map with `cnet()`. The threshold, `thres`, refers to the cross-correlation coefficient between the averaged time series of each region; only values above `thres` are considered to be network connections. ```{r} cnet(my_regions, 12, 0.3) ```