TransWikia.com

Joining attributes by location for more than one located feature using R

Geographic Information Systems Asked on November 17, 2020

When joining attributes by location with the over function in R from the sp package, only the values for one of the polygons that intersects is retained. Is there a way in R that allows to take attributes of more than located feature? That is, to get ALL qualitative values the polygons that intersect/overlap. With an attribute column for the value of the first located feature, another for the second located feature, etc.

I have also asked in a separate question about Joining attributes by location for more than one located feature using QGIS

Sample code to perform simple spatial joins

#set up sample data
    install.packages('tmap')#use this library to upload sample spatial polygons
    library(tmap)
    data(Europe) # countries
    data(metro) # cities
    #transform sample data of points into polygons with same crs/projection for this example (trivial example because point.in.poly suited to this problem but just for the sake of an example
    metro.pts=spTransform(metro, crs(Europe))
    metro.poly=gBuffer(metro.pts, width=50000, byid=TRUE)
    
#load libraries for spatial join
    library(rgdal)
    library(sp)
    
 #spatial join of the two polygon shapefiles
        #"find the cities in each country"
        joined_one=cbind(over(Europe, metro.poly), as.data.frame(Europe))
    
    #but this only joins one of the cities in each country even when more than one city is present
    #if we check for spain, we only have Barcelona and not Madrid even though it is part of the city file
    na.omit(joined_first[joined_first$iso_a3=="ESP",])
    metro.poly[metro.poly$name=="Madrid"|metro.poly$name=="Barcelona",]

    #using the `fn` option (function) we can calculate arithmetic for quantitative variables as to get values for more than one polygon that intersects
    joined_qn=cbind(over(Europe, metro.poly[,5:8], fn=mean), as.data.frame(Europe))

2 Answers

I recommend you consider transitioning from sp to the sf package (this handy guide will help you along the way).

Here's an approach that uses sf::st_join() to achieve the spatial join you're looking for:

library(tmap) 
library(sf) 
library(tidyverse)

data(Europe) 
data(metro) 

europe_sf <- st_as_sf(Europe)

metro_pts_sf <- metro %>%
  st_as_sf() %>%
  st_transform(st_crs(europe_sf))

metro_poly_sf <- st_buffer(metro_pts_sf, dist = 5e4)

joined_sf <- st_join(europe_sf, metro_poly_sf)

Let's check the two Spanish cities you mention in the question:

joined_sf %>%
  filter(name.x %in% "Spain") %>%
  select(matches("name"), everything())

## Simple feature collection with 2 features and 28 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 2766833 ymin: 1584304 xmax: 3833320 ymax: 2463100
## epsg (SRID):    NA
## proj4string:    +proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000 +ellps=GRS80 +units=m +no_defs
##   name.x    name.y name_long iso_a3.x sovereignt continent            part
## 1  Spain Barcelona Barcelona      ESP      Spain    Europe Southern Europe
## 2  Spain    Madrid    Madrid      ESP      Spain    Europe Southern Europe
##   EU_Schengen   area  pop_est pop_est_dens gdp_md_est gdp_cap_est
## 1 EU Schengen 498800 40525002     81.24499    1403000     34620.6
## 2 EU Schengen 498800 40525002     81.24499    1403000     34620.6
##                      economy           income_grp life_exp well_being
## 1 2. Developed region: nonG7 1. High income: OECD     81.4   6.188263
## 2 2. Developed region: nonG7 1. High income: OECD     81.4   6.188263
##        HPI iso_a3.y pop1950 pop1960 pop1970 pop1980 pop1990 pop2000
## 1 44.06279      ESP 1809390 2467926 3482047 3836761 4100808 4355425
## 2 44.06279      ESP 1699752 2391543 3520861 4253190 4413870 5014411
##   pop2010 pop2020 pop2030                       geometry
## 1 4933548 5478238 5685152 MULTIPOLYGON (((3830463 187...
## 2 5787392 6476334 6707421 MULTIPOLYGON (((3830463 187...

Correct answer by Tiernan on November 17, 2020

You can use raster::union

Example data:

library(tmap)
library(rgeos)
library(rgdal)
library(raster)
data(Europe) 
# simplifying a bit
Eu <- Europe[Europe$iso_a3 %in% c('ESP', 'PRT', 'FRA'),1:2]
data(metro)
metro.pts <- spTransform(metro[, 'name'], crs(Europe))
metro.poly <- gBuffer(metro.pts, width=50000, byid=TRUE)

Original solution

u <- union(Eu, metro.poly)

or

i <- intersect(Eu, metro.poly) 

**Update, now I understand the question better **

You can returnList = TRUE in over:

v <- over(Eu, metro.poly, returnList=TRUE)

You could now do:

a <- lapply(1:length(v), function(i) cbind(iso_a3=Eu$iso_a3[i], v[[i]]))
b <- do.call(rbind, a)
m <- merge(Eu, b, by="iso_a3", all.x=TRUE, duplicateGeoms=TRUE)

In merge, the argument duplicateGeoms=TRUE is necessary because otherwise merge cannot handle having more than one (i.e. duplicate) value to assign to each feature.
To get the following:

data.frame(m)
#  iso_a3   name.x    name.y
#1    ESP    Spain Barcelona
#2    ESP    Spain    Madrid
#3    FRA   France     Lille
#4    FRA   France      Lyon
#5    FRA   France Marseille
#6    FRA   France     Paris
#7    PRT Portugal    Lisbon
#8    PRT Portugal     Porto

I think that this is the same as what Tiernan showed in perhaps more elegant sf code, but I do not think that this is what you asked for.

For what you want, we need to create a data.frame from a list with varying number of elements. There might be a better way, but here is one:

m <- max(sapply(v, nrow))
iso <- as.character(Eu$iso_a3)

x <- sapply(1:length(v), 
       function(i) c(iso[i], c(v[[i]][,1], rep(NA, m))[1:m])
     )

x <- data.frame(t(x), stringsAsFactors=FALSE)
colnames(x) <- c("iso_a3", paste0("metro_", 1:m))

Merge with the original polygons:

r <- merge(Eu, x, by="iso_a3", all.x=TRUE)
data.frame(r)

#  iso_a3     name   metro_1 metro_2   metro_3 metro_4
#1    ESP    Spain Barcelona  Madrid      <NA>    <NA>
#2    FRA   France     Lille    Lyon Marseille   Paris
#3    PRT Portugal    Lisbon   Porto      <NA>    <NA>

Answered by Robert Hijmans on November 17, 2020

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP