Merging attributes into shapefile (US County-level data) using R

Geographic Information Systems Asked on December 27, 2020

I’d like to make a plot of the US, with each county colored according to a scalar value (the average summer temperature).

I’m using R.

I have a shapefile with all the counties:

> shape = st_read(paste0(my_dir, "cb_2018_us_county_5m/cb_2018_us_county_5m.shp"))
> head(shape)
Simple feature collection with 6 features and 9 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox:      xmin: -120.0726 ymin: 30.28043 xmax: -83.34348 ymax: 39.37874
geographic CRS: NAD83
  STATEFP COUNTYFP COUNTYNS     NAME                       geometry
       39      071 01074048 Highland MULTIPOLYGON (((-83.86976 3...
       06      003 01675840   Alpine MULTIPOLYGON (((-120.0725 3...
       12      033 00295737 Escambia MULTIPOLYGON (((-87.62999 3...

And a csv file with temperature data:

> temp_data = read_csv(paste0(this_dir, "temperature_data.csv"))
> head(temp_data)
  GEOID    State  County Temperature
1001     ALABAMA AUTAUGA          25
1003     ALABAMA BALDWIN          31
1005     ALABAMA BARBOUR          15

I feel like I should be able to match these up, but I don’t see any identifying column that matches across both files. (The county names come close, but the US has a lot of Jefferson and Madison counties.)

How can I merge the temperature data into the shapefile for plotting?

One Answer

The GEOID column in your CSV is based on the US Census GEOID. This is a hierarchical identifier built by assigning a numeric code to every state, county, census tract, etc., as follows:

GEOID explanation

From the site above:

FIPS codes for smaller geographic entities are usually unique within larger geographic entities. For example, FIPS state codes are unique within nation and FIPS county codes are unique within state. Since counties nest within states, a full county FIPS code identifies both the state and the nesting county. For example, there are 49 counties in the 50 states ending in the digits “001”. To make these county FIPS codes unique, the state FIPS codes are added to the front of each county (01001, 02001, 04001, etc), where the first two digits refer to the state the county is in and the last three digits refer specifically to the county.

So you can get a GEOID by concatenating STATEFP and COUNTYFP.
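As a minimal sketch of that concatenation, the snippet below uses a small plain data frame as a stand-in for the shapefile's attribute table (the `counties` name is an assumption for illustration; the values are copied from the question's output). It only shows the string operation, not the full `sf` workflow.

```r
# Toy stand-in for the attribute table of the county shapefile.
# The codes are stored as character strings so leading zeros survive.
counties <- data.frame(
  STATEFP  = c("39", "06", "12"),
  COUNTYFP = c("071", "003", "033"),
  NAME     = c("Highland", "Alpine", "Escambia"),
  stringsAsFactors = FALSE
)

# Build the 5-character county GEOID: 2-digit state + 3-digit county.
counties$GEOID <- paste0(counties$STATEFP, counties$COUNTYFP)

print(counties[, c("NAME", "GEOID")])
```

The same `paste0()` call should work on the columns of the object returned by `st_read()`, provided `STATEFP` and `COUNTYFP` were read as character (or are converted with `as.character()` first if they came in as factors).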

The most common problem people have with GEOIDs, or parts of them, is the tendency of data analysis software to treat them as being numbers. They are strings.

For instance, your dataset includes

  STATEFP COUNTYFP COUNTYNS     NAME                       geometry
       06      003 01675840   Alpine MULTIPOLYGON (((-120.0725 3...

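So the appropriate GEOID for Alpine County is 06003. But if the FP codes have been converted to numbers somewhere in your analysis pipeline, you could easily end up with 63 (6 and 3 pasted together) or 9 (6 plus 3). Either result would prevent your data from joining. Your CSV shows the same symptom: Autauga County, Alabama appears as 1001 rather than 01001.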
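One way to repair this, sketched here under assumptions, is to zero-pad the numeric GEOIDs back to five characters before joining. Both data frames below are hypothetical toy tables (a few rows modelled on the question's data), and base R `merge()` stands in for whatever join you prefer.

```r
# Hypothetical temperature table where GEOID was parsed as a number,
# so Alabama's leading zero has been dropped (01001 became 1001).
temp_data <- data.frame(
  GEOID       = c(1001, 1003, 1005),
  Temperature = c(25, 31, 15)
)

# Restore zero-padded, 5-character strings.
temp_data$GEOID <- sprintf("%05d", as.integer(temp_data$GEOID))

# Toy stand-in for the county attribute table, codes kept as strings.
counties <- data.frame(
  STATEFP  = c("01", "01", "06"),
  COUNTYFP = c("001", "003", "003"),
  NAME     = c("Autauga", "Baldwin", "Alpine"),
  stringsAsFactors = FALSE
)
counties$GEOID <- paste0(counties$STATEFP, counties$COUNTYFP)

# Left join: keep every county, attach temperature where the GEOID matches.
joined <- merge(counties, temp_data, by = "GEOID", all.x = TRUE)
print(joined)
```

Counties without a matching temperature row (Alpine in this toy data) come through with `NA`, which is a quick way to spot GEOIDs that still fail to match. With a real `sf` object you would likely keep it as the first argument to the join so the geometry column is retained.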

Correct answer by Richard on December 27, 2020
