somspace: Spatial classification with Self-Organizing Maps

This is is an introductory presentation of the somspace package, as implemented in the study of Markonis and Strnad [2020], to detect spatial drought patterns in Europe.

Data import

In this study the Old World Drought Atlas [Cook et al., 2015] was used, as pre-processed by Markonis et al. [2018]. The file can be downloaded here, and is a data.table object saved as an .rds file. Alternatively, the owda data object that is included in the package can be used, which covers only period 1500-2012.

library(somspace)
str(owda)

Data preparation

First of all, we have to transform the raw data from data.table to somin class. If the original data have different structure/type them it is important to transform them to a 4-column data.table, containing time, lat, lon and the variable.

inp <- sominp(owda) #depending on data set size may take some time

The somin class is a list with three elements. The first one, input_for_som, is a matrix that will be used by kohonen::som function to generate the SOM. The second one, coords, is a data.table with the point ids and their coordinates. The last one, input_dt is also a data.table with the raw data that were used to generate input_for_som.

str(inp)

SOM generation

The generation of SOM is straightforward:

my_som <- somspa(inp)

Which creates a somsp object that contains the som, the nodes properties and the original data.

str(somsp)

Since somspa() uses the function som() from kohonen package it can also take som() arguments. For instance, to create a 6 x 6 map after 1000 iterations:

my_som <- somspa(inp, grid = somgrid(6, 6), rlen = 1000)

SOM analysis

The somsp objects can be easily plotte by plot() or summarized by summary() functions.

plot(my_som)
summary(my_som)

Additionaly, the average time series of specific SOMs can be plotted by plot_ts:

plot_ts(my_som, 12)
plot_ts(my_som, c(1, 12, 21, 39)) 
plot_ts(my_som, 1:max(my_som$summary$node)) #plots all soms

Further classification of SOMs in regions

To further reduce the number of regions with similar characteristics somregs() is applied. It uses the hclust() function for Hierarchical cluster analysis and its default arguments can be changed accordingly. Please not that the n argument refers to the maximum number of regions and not to a single one. somregs() will merge SOMs to similar regions starting from 2 and reachinh n and will keep them all in the resulting regs object.

my_regions <- somregs(my_som, 12)
my_regions <- somregs(my_som, 17, method = "ward.D2") 

regs objects which are two-element lists. The regions element contains the information about the regions and the input_dt the time series of the original dataset. Similarly to somsp objects, regs can be plot as maps or time series.

plot(my_regions, c(2, 5, 9, 13), 2, 2)
plot_ts(my_regions, 2)

Complex network analysis

Finally, a specific classification of certain number of regions can be analysed through canonical networks and ploted as map with cnet(). The threshold, thres, refers to the cross-correlation coefficient between the averaged time series of each region; only values above thres are considered to be network connections.

cnet(my_regions, 12, 0.3)