
Seeking bulk geocoding service for Brazil that allows permanent lat/long data storage

Geographic Information Systems Asked by Fernando Franke on July 5, 2021

I have to geocode a huge dataset, with about 35 million addresses from Brazil. I was not able to find a good solution, even considering paid geocoders. So, I’d like to know if there is any geocoder service that has good accuracy and matching rate and, at the same time, allows bulk geocoding and also allows the storage of the lat/long results in a permanent database.

My initial idea was to do the geocoding using R, which has packages for many geocoders, like OSM/Nominatim (geocode_OSM), Google (ggmap) and HERE (HereR). Another possibility was to use QGIS, which also has geocoding plugins with the same 3 geocoders. So, I did a short benchmark with these geocoders and noticed that it would not be possible to use any of them:

  • OSM/Nominatim (https://nominatim.org/): It’s free, but does not allow bulk geocoding. Its accuracy and matching rate are also below my requirements.
  • Google Geocoding API: Has good accuracy and matching rate. However, the terms of use do not allow storing geocoding results. According to https://cloud.google.com/maps-platform/terms/maps-service-terms: "Customer can temporarily cache latitude (lat) and longitude (lng) values from the Geocoding API for up to 30 consecutive calendar days, after which Customer must delete the cached latitude and longitude values." Because of this restriction, I understand that I cannot use the Google API for what I need to do.
  • HERE Geocoding API: Also has good accuracy and matching rate, although with my test dataset both were slightly below the Google API results. However, the HERE API has terms-of-use restrictions similar to Google’s and does not allow results storage (https://knowledge.here.com/csm_kb?id=public_kb_csm_details&number=KB0016408).

I noticed that most API geocoders have restrictions similar to those of the Google and HERE APIs. That’s probably because they are web services designed for apps, not for bulk geocoding.

So, I’d like to know if there are any other bulk geocoding options (free or paid), without lat/long results storage restrictions.

4 Answers

Install Photon (https://github.com/komoot/photon#installation) with the Brazil extract from https://download1.graphhopper.com/public/extracts/by-country-code/br/ and you have an OSM-based geocoder ready to go, without any further restrictions.

That's the beauty of a functioning open-source community.
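As a rough illustration of what querying such a self-hosted instance could look like, here is a minimal Python sketch. It assumes Photon is running locally on its default port (2322) and that you only want the top match; the helper names `build_query_url`, `first_lat_lon` and `geocode` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a self-hosted Photon instance listening on its default port.
PHOTON_URL = "http://localhost:2322/api"


def build_query_url(address, base_url=PHOTON_URL, limit=1):
    """Build the search URL for one free-text address."""
    return base_url + "?" + urlencode({"q": address, "limit": limit})


def first_lat_lon(geojson):
    """Return (lat, lon) of the best match, or None if nothing matched.

    Photon answers with a GeoJSON FeatureCollection, and GeoJSON stores
    coordinates as [longitude, latitude], so the order is swapped here.
    """
    features = geojson.get("features") or []
    if not features:
        return None
    lon, lat = features[0]["geometry"]["coordinates"][:2]
    return lat, lon


def geocode(address):
    """Query the local Photon instance (the server must be running)."""
    with urlopen(build_query_url(address), timeout=10) as response:
        return first_lat_lon(json.load(response))
```

Calling `geocode("Avenida Paulista 1000, São Paulo")` would then return a `(lat, lon)` tuple or `None`. Since the instance is your own, the request rate is limited only by your hardware.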

Answered by bugmenot123 on July 5, 2021

https://opencagedata.com/ has decent result quality, based on OSM data (and other datasets). They also allow you to store results for as long as you like, and they have an R library.

Answered by smomapz on July 5, 2021

Consider the Geocode Earth geocoding service, or the FOSS Pelias Geocoder (disclaimer: I am co-creator of both).

Pelias came out of Mapzen several years ago and has had continued development since then.

It uses data from OpenStreetMap, OpenAddresses, and other open data sets together to give the best results. In Brazil, OpenAddresses has coverage in all 26 states and the federal district, so coverage should be quite good.

As Pelias is open source, you're free to run it yourself and use the results in any manner compatible with the licenses of the underlying datasets, which certainly includes storing the results.
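Since storing results is allowed, a batch job can write straight into a permanent database and skip addresses it has already processed. The following is only a sketch under assumptions: the table layout, the function names and the `geocode` callable are hypothetical, and the callable would wrap whatever request you send to your own instance (for Pelias, a call to its `/v1/search` endpoint).

```python
import sqlite3


def open_store(path):
    """Open (or create) the permanent results table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geocoded ("
        " address TEXT PRIMARY KEY,"
        " lat REAL,"
        " lon REAL)"
    )
    return conn


def geocode_batch(conn, addresses, geocode):
    """Geocode addresses not stored yet; return how many rows were added.

    `geocode` is any callable returning (lat, lon) or None (hypothetical
    wrapper around the geocoder of your choice).
    """
    added = 0
    for address in addresses:
        already = conn.execute(
            "SELECT 1 FROM geocoded WHERE address = ?", (address,)
        ).fetchone()
        if already:
            continue  # done in an earlier run, or a duplicate in this batch
        result = geocode(address)
        lat, lon = result if result else (None, None)  # misses stay NULL
        conn.execute(
            "INSERT INTO geocoded (address, lat, lon) VALUES (?, ?, ?)",
            (address, lat, lon),
        )
        added += 1
    conn.commit()
    return added
```

For a dataset of 35 million rows you would probably feed this in chunks, so that each chunk is committed on its own and an interrupted run loses little work.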

The Geocode Earth service allows you to get started instantly, and not have to worry about grabbing the freshest data, ensuring your geocoder is fast, always available, etc. We also allow you to cache results.

Both options grow the open source community and support open-data as well.

Answered by orangejulius on July 5, 2021

Brazil is not an easy country to geocode with 100% accuracy, but we try. We have a large base of clients who run relatively large batches from Brazil every month. I propose that we run a small batch for you with our Batch Geocoder and analyze the results. We follow GDPR and HIPAA guidelines for storing data when permanent storage is desired. If the results are satisfactory for you and your business, we can run the big batch.

Disclaimer: I founded csv2geo to help businesses around the world make geocoding easy.

Contact me

Answered by ivan on July 5, 2021
