TransWikia.com

Creating a unique short ID from string value

Stack Overflow Asked by datta on November 27, 2020

I have some data that has unique IDs stored as a string in the form:

ddd.dddddaddd.dddddz

Where d is some digit and a/z is some alphabet character. The digits may be 0-9 and the characters are either E or W for the a and N or S for the z.

I’d like to turn this into a unique integer and what I’ve tried using the hashlib module returns:

>>> int(hashlib.sha256(str.encode(s)).hexdigest(), 16)
Output: a very long integer (on another system cannot copy it)

Is there a way to generate a unique integer ID from a string so that it does not exceed 12 digits? I know that I will never need a unique integer ID beyond 12 digits.

2 Answers

Just something simple:

>>> s = '123.45678W123.45678S'
>>> int(s.translate(str.maketrans('EWNS', '1234', '.')))
123456782123456784

Not the impossible 12 digits you're still asking for in the question, but under the 20 digits you allowed in the comments.

Correct answer by superb rain on November 27, 2020

As you are dealing with coordinates, I would try my best to keep the information in the final 12-digit ID.

  • If your points are global, it might be necessary to keep the degrees but they may be widespread, so you can sacrifice some information when it comes to precision.
  • If your points are local (all within a range of less than 10 degrees) you might skip the first two digits of the degrees and focus on the decimals.
  • As it may be possible that two points are close to each other, it may be prudent to reserve one digit as a serial number.

Proposal for widespread points:

s = "123.45678N123.45678E"
ident = "".join([s[0:6],s[10:16]]).replace(".","")
q = 0
if s[9]=="N":
    q+=1
if s[-1]=="E":
    q+=2
ident+=str(q)+'0'

The example would translate to 123451234530.

After computing the initial ident numbers for each ID, you should loop through them and increment the last digit if an ident is already taken.

This way you could easily reconstruct the location from the ID by just separating the first 10 digits to two degrees of the format ddd.dd and use the [-2] digit as an indicator of the quadrant (0:SW, 1:SE, 2:NW, 3:NE).

Answered by Martin Wettstein on November 27, 2020

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP