Append rows to shapefile on disk (not in memory)?

Stack Overflow Asked by zelusp on February 23, 2021

I’m trying to combine every block file from the 2010 census into a single master block file for the US. I’m currently doing this in Google Colab, and even on their Pro subscription, which gives you about 25 GB of RAM, I’m maxing out all available memory on the 45th file (I just have 5 more to go!). Code-wise, I’m just building a list of dataframes that need to be concatenated and ultimately written to disk:

import os

import geopandas as gpd
import pandas as pd

gdfs = []
census_blocks_basepath = r'/content/drive/My Drive/Census/blocks/'
census_block_filenames = [f for f in os.listdir(census_blocks_basepath) if f.endswith('.shp')]
for index, block_filename in enumerate(census_block_filenames):
  file_name = os.path.join(census_blocks_basepath, block_filename)
  gdfs.append(gpd.read_file(file_name))  # every file read so far stays in memory
  print('Appended file %s, %s' % (index, block_filename))

# Concatenate all frames, keeping the CRS of the first one
gdf = gpd.GeoDataFrame(pd.concat(gdfs, ignore_index=True), crs=gdfs[0].crs)
# gdf.reset_index(inplace=True, drop=True)
gdf.head(3)

Instead, I think I should:

  1. load a single geodataframe
  2. append it to a master dataframe that exists on disk rather than in memory (the way csv.writer appends rows to a file)
  3. delete the loaded geodataframe from step 1 (to avoid memory accrual)
  4. then repeat steps 1-3 for all geodataframes remaining in the source directory

I don’t see documentation on whether geopandas supports disk-based appends. It only seems able to overwrite previous files via GeoDataFrame.to_file. That said, I see that geopandas has a GeoDataFrame.to_postgis method with a chunksize argument, which makes me think that it’s possible to append data onto a geofile on disk (or I’m wrong and that’s just a feature of postgis).

Any ideas?

One Answer

From MartinFleis

Yes, any file format that supports appending (and is supported by fiona) can be appended to. You just have to specify mode="a".

df.to_file(filename, mode="a")
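Applied to the loop in the question, this might look like the sketch below. It assumes geopandas is installed and that every source file shares the same schema and CRS. The function name append_shapefiles and its read_file parameter are made up for illustration and are not part of geopandas. The first file is written with mode="w" so the output is created (or overwritten), and every later file uses mode="a".

```python
import os


def append_shapefiles(src_dir, out_path, read_file=None):
    """Write every .shp in src_dir into out_path, one file at a time.

    Hypothetical helper: read_file defaults to geopandas.read_file and is
    only a parameter so that a different reader can be swapped in.
    """
    if read_file is None:
        import geopandas as gpd  # imported lazily; assumed to be installed
        read_file = gpd.read_file

    names = sorted(f for f in os.listdir(src_dir) if f.endswith(".shp"))
    paths = [os.path.join(src_dir, f) for f in names]
    # Never read the output file back in if it lives in the same directory.
    paths = [p for p in paths if os.path.abspath(p) != os.path.abspath(out_path)]

    for index, path in enumerate(paths):
        gdf = read_file(path)
        # First file creates (or overwrites) the output; the rest append to it.
        gdf.to_file(out_path, mode="w" if index == 0 else "a")
        del gdf  # drop the reference so the frame can be freed before the next read
        print("Appended file %s, %s" % (index, os.path.basename(path)))
    return len(paths)
```

Call it as append_shapefiles(census_blocks_basepath, out_path) with an output path of your choosing. Only one GeoDataFrame is referenced at a time, so memory use should stay close to the size of the largest single file.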

You can check which modes each driver supports using

import fiona
fiona.supported_drivers

This is the current result (r = read, a = append, w = write):

{'AeronavFAA': 'r',
 'ARCGEN': 'r',
 'BNA': 'raw',
 'DXF': 'raw',
 'CSV': 'raw',
 'OpenFileGDB': 'r',
 'ESRIJSON': 'r',
 'ESRI Shapefile': 'raw',
 'GeoJSON': 'rw',
 'GeoJSONSeq': 'rw',
 'GPKG': 'rw',
 'GML': 'raw',
 'GPX': 'raw',
 'GPSTrackMaker': 'raw',
 'Idrisi': 'r',
 'MapInfo File': 'raw',
 'DGN': 'raw',
 'PCIDSK': 'r',
 'S57': 'r',
 'SEGY': 'r',
 'SUA': 'r',
 'TopoJSON': 'r'}
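To pick out only the drivers that allow a given mode, the mapping can be filtered on its mode strings. The helper name drivers_supporting below is hypothetical, and the sample dict is just a few entries copied from the listing above; in practice you would pass fiona.supported_drivers, whose contents depend on the installed fiona and GDAL versions.

```python
def drivers_supporting(mode, drivers):
    """Return the sorted driver names whose mode string contains `mode`."""
    return sorted(name for name, modes in drivers.items() if mode in modes)


# A few entries copied from the listing above (not the full mapping).
sample = {
    "ESRI Shapefile": "raw",
    "GPKG": "rw",
    "GeoJSON": "rw",
    "OpenFileGDB": "r",
}

print(drivers_supporting("a", sample))  # ['ESRI Shapefile']
```

With the real mapping, drivers_supporting("a", fiona.supported_drivers) lists every format that to_file(..., mode="a") can append to.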

Correct answer by zelusp on February 23, 2021
