TransWikia.com

Adding GeoPandas Dataframe to PostGIS table?

Geographic Information Systems Asked by thecornman on March 20, 2021

I have a simple GeoPandas Dataframe:

(Screenshot of the GeoDataFrame; image not available.)

I would like to upload this GeoDataFrame to a PostGIS table. I already have a database set up with the PostGIS extension, but I can't seem to add this GeoDataFrame as a table.

I have tried the following:

engine = <>
meta = MetaData(engine)
eld_test = Table('eld_test', meta, Column('id', Integer, primary_key=True), Column('key_comb_drvr', Text), 
                 Column('geometry', Geometry('Point', srid=4326))) 
eld_test.create(engine) 
conn = engine.connect() 
conn.execute(eld_test.insert(), df.to_dict('records'))

4 Answers

Using pandas' to_sql method and SQLAlchemy, you can store a DataFrame in Postgres. Because you are storing a GeoDataFrame, GeoAlchemy can handle the geometry column for you. Here is a code sample:

# Imports
from geoalchemy2 import Geometry, WKTElement
from sqlalchemy import create_engine
import pandas as pd
import geopandas as gpd

# Create the SQLAlchemy engine to use
engine = create_engine('postgresql://username:password@host:port/database')

# Name of the target table (placeholder)
table_name = '<your_table_name>'

geodataframe = gpd.GeoDataFrame(pd.read_csv('<your dataframe source>'))
#... [do something with the geodataframe]

# Convert each shapely geometry to a WKTElement that GeoAlchemy can insert
geodataframe['geom'] = geodataframe['geometry'].apply(lambda x: WKTElement(x.wkt, srid=<your_SRID>))

# Drop the geometry column, as it is now redundant
geodataframe = geodataframe.drop(columns='geometry')

# Use 'dtype' to specify the column's type
# For the geom column, we will use GeoAlchemy's type 'Geometry'
geodataframe.to_sql(table_name, engine, if_exists='append', index=False,
                    dtype={'geom': Geometry('POINT', srid=<your_SRID>)})

Note that the 'if_exists' parameter controls what happens when the target table already exists in Postgres:

    if_exists = replace: If the table exists, drop it, recreate it, and insert the data.
    if_exists = fail: If the table exists, raise a ValueError and insert nothing. This is the default.
    if_exists = append: If the table exists, insert the data into it. If it does not exist, create it.
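
The three modes can be tried without a PostGIS server, because `if_exists` is handled by pandas itself. The sketch below uses an in-memory SQLite database purely as a stand-in (an assumption made for illustration, not part of the answer above); the table name `demo` and the sample rows are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for PostGIS here; if_exists is pandas behaviour,
# so the same rules should apply with a SQLAlchemy PostgreSQL engine.
conn = sqlite3.connect(":memory:")
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})


def row_count(table):
    """Return the number of rows currently stored in `table`."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# 'append' creates the table when it is missing, then adds rows on later calls
df.to_sql("demo", conn, if_exists="append", index=False)
df.to_sql("demo", conn, if_exists="append", index=False)
rows_after_append = row_count("demo")

# 'replace' drops the table, recreates it, and inserts only the new rows
df.to_sql("demo", conn, if_exists="replace", index=False)
rows_after_replace = row_count("demo")

# 'fail' (the default) raises ValueError because the table already exists
try:
    df.to_sql("demo", conn, if_exists="fail", index=False)
    fail_raised = False
except ValueError:
    fail_raised = True
```

With 'append', running the same script twice duplicates rows, so 'replace' is usually the safer choice for one-off uploads that may be repeated.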

Answered by Hamri Said on March 20, 2021

I had the same question and spent many, many days (more than I care to admit) looking for a solution. Assuming the following PostgreSQL table with the PostGIS extension,

postgres=> \d cldmatchup.geo_points
                                   Table "cldmatchup.geo_points"
  Column   |         Type          |                               Modifiers
-----------+-----------------------+------------------------------------------------------------------------
 gridid    | bigint                | not null default nextval('cldmatchup.geo_points_gridid_seq'::regclass)
 lat       | real                  |
 lon       | real                  |
 the_point | geography(Point,4326) |
Indexes:
    "geo_points_pkey" PRIMARY KEY, btree (gridid)

this is what I finally got working:

import geopandas as gpd
from geoalchemy2 import Geography, Geometry
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import sessionmaker
from shapely.geometry import Point
from psycopg2.extensions import adapt, register_adapter, AsIs

# From http://initd.org/psycopg/docs/advanced.html#adapting-new-types but
# modified to accommodate the PostGIS point format rather than the PostgreSQL
# point type format
def adapt_point(point):
    # Format the coordinates as plain floats; in Python 3, getquoted() returns
    # bytes, which would be rendered as b'...' inside the WKT string
    return AsIs("'POINT (%r %r)'" % (float(point.x), float(point.y)))

register_adapter(Point, adapt_point)

engine = create_engine('postgresql://<yourUserName>:postgres@localhost:5432/postgres', echo=False)
Session = sessionmaker(bind=engine)
session = Session()
meta = MetaData(schema='cldmatchup')

# Create reference to pre-existing "geo_points" table in schema "cldmatchup"
geoPoints = Table('geo_points', meta, schema='cldmatchup', autoload_with=engine)

df = gpd.GeoDataFrame({'lat':[45.15, 35., 57.], 'lon':[-35, -150, -90.]})

# Create a shapely.geometry point 
the_point = [Point(xy) for xy in zip(df.lon, df.lat)]

# Create a GeoDataFrame specifying 'the_point' as the column with the 
# geometry data
crs = 'EPSG:4326'
geo_df = gpd.GeoDataFrame(df.copy(), crs=crs, geometry=the_point)

# Rename the geometry column to match the database table's column name.
# From https://media.readthedocs.org/pdf/geopandas/latest/geopandas.pdf,
# Section 1.2.2 p 7
geo_df = geo_df.rename(columns={'geometry':'the_point'}).set_geometry('the_point')

# Write to sql table 'geo_points'
geo_df.to_sql(geoPoints.name, engine, if_exists='append', schema='cldmatchup', index=False)

session.close()

I can't say whether my database connection logic is the best, since I basically copied it from another link and was just happy to have successfully automapped (or reflected) my existing table with the geometry definition recognized. I have been writing Python-to-SQL spatial code for only a few months, so I know there is much to learn.
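
One detail in the code above is easy to get wrong: the points are built as (lon, lat), because x is longitude and y is latitude in WKT. A minimal sketch of the text form PostGIS can cast to a geography or geometry value follows. The helper `point_to_ewkt` is hypothetical (it is not part of psycopg2, shapely, or the answer above), and it emits EWKT, the PostGIS variant of WKT that carries an SRID prefix.

```python
def point_to_ewkt(lon, lat, srid=4326):
    """Format a longitude/latitude pair as EWKT text.

    x is longitude and y is latitude, matching Point(lon, lat) in the answer.
    """
    # float() guards against numpy scalars; repr() keeps full precision
    return f"SRID={srid};POINT({float(lon)!r} {float(lat)!r})"


# Same sample coordinates as the DataFrame in the answer
lats = [45.15, 35.0, 57.0]
lons = [-35.0, -150.0, -90.0]
ewkt_points = [point_to_ewkt(lon, lat) for lon, lat in zip(lons, lats)]
```

Swapping the two arguments would still produce valid text, but the points would land in the wrong place, so it is worth checking one row on a map after the first upload.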

Answered by user1745564 on March 20, 2021

I have a solution that requires only psycopg2 and shapely (in addition to geopandas, of course). It is generally bad practice to iterate through (Geo)DataFrame objects because it is slow, but for small ones, or for one-off tasks, it will still get the job done.

Basically, it works by dumping the geometry to WKB format in another column and then converting it back to the GEOMETRY type when inserting.

Note that you will have to create the table ahead of time with the right columns.

import psycopg2 as pg2
from shapely.wkb import dumps as wkb_dumps
import geopandas as gpd


# Assuming you already have a GeoDataFrame called "gdf"...

# Copy the gdf if you want to keep the original intact
insert_gdf = gdf.copy()

# Make a new column containing the WKB dumped from the geometry column
insert_gdf["geom_wkb"] = insert_gdf["geometry"].apply(lambda x: wkb_dumps(x))

# Define an insert query which will read the WKB geometry and cast it to GEOMETRY type accordingly
insert_query = """
    INSERT INTO my_table (id, geom)
    VALUES (%(id)s, ST_GeomFromWKB(%(geom_wkb)s));
"""

# Build a list of execution parameters by iterating through the GeoDataFrame
# This is considered bad practice by the pandas community because it is slow.
params_list = [
    {
        "id": i,
        "geom_wkb": row["geom_wkb"]
    } for i, row in insert_gdf.iterrows()
]

# Connect to the database and make a cursor
conn = pg2.connect(host=<your host>, port=<your port>, dbname=<your dbname>, user=<your username>, password=<your password>)
cur = conn.cursor()

# Iterate through the list of execution parameters and apply them to an execution of the insert query
for params in params_list:
    cur.execute(insert_query, params)
conn.commit()
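
To make the WKB step above more concrete, here is a minimal sketch of the bytes that a call like `wkb_dumps` is expected to produce for a 2D point, assuming the standard little-endian WKB layout without an SRID. The helper `point_to_wkb` is hypothetical and only illustrates the format; in practice shapely does this for every geometry type.

```python
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (no SRID)."""
    # 1 byte:       byte order flag (1 = little-endian)
    # 4 bytes:      geometry type (1 = Point)
    # 2 x 8 bytes:  x and y as IEEE 754 doubles
    return struct.pack("<BIdd", 1, 1, x, y)


wkb = point_to_wkb(1.0, 2.0)
wkb_hex = wkb.hex().upper()
# wkb_hex == "0101000000000000000000F03F0000000000000040"
```

Because plain WKB carries no SRID, the insert query would need something like ST_SetSRID around ST_GeomFromWKB (or the two-argument form of ST_GeomFromWKB) if the target column enforces one.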

Answered by wfgeo on March 20, 2021

As of version 0.8, geopandas has a to_postgis method. Woohoo!

Note: you will need psycopg2-binary, sqlalchemy, and geoalchemy2 installed.

import geopandas
from sqlalchemy import create_engine

# Set up database connection engine
engine = create_engine('postgresql://user:password@host:5432/')

# Load data into GeoDataFrame, e.g. from shapefile
geodata = geopandas.read_file("shapefile.shp")

# GeoDataFrame to PostGIS
geodata.to_postgis(
    con=engine,
    name="table_name"
)
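
A common stumbling block with the engine URL used in these answers is a password containing characters such as `@` or `/`, which must be percent-encoded. The sketch below builds the URL with the standard library only; the helper `postgres_url` and the credentials are hypothetical placeholders.

```python
from urllib.parse import quote_plus


def postgres_url(user, password, host, port, database):
    """Build a postgresql:// URL, percent-encoding the user name and password."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical credentials; '@' and '/' in the password are escaped
url = postgres_url("gis_user", "p@ss/word", "localhost", 5432, "gisdb")
```

The resulting string can be passed to `create_engine` in place of the literal URL shown above.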

Answered by Brylie Christopher Oxley on March 20, 2021
