AnswerBun.com

Python compare two large file wordlists and print if match

Stack Overflow Asked by uzdisral on December 20, 2020

Let’s say I have two large files. One is md5_db.txt, a database of hashes and passwords, and the second is hash.txt, a list of hashes.

md5_db.txt contains one hash:password pair per line:

accfa1212a61b379ba0b009549113863:11150
12fd5b2b866858281404434d1b9a0284:111968
cd418b51dc28d28a239d0658cdd3bca6:111983
e0c10f451217b93f76c2654b2b729b85:111aaa

hash.txt contains one hash per line:

cd418b51dc28d28a239d0658cdd3bca6
e0c10f451217b93f76c2654b2b729b85

Now I want to compare them: for every hash that appears in both files, print the hash together with its password. I’ve been trying to find the most efficient solution, but each attempt gets close and then doesn’t work as intended. The code I have is very simple and doesn’t work yet: it only finds a match if I first split the hash from the password. In a nutshell, I need the script to take each hash from hash.txt, look it up in md5_db.txt, and print it if it matches.

with open('md5_db.txt', 'r') as file1:
    with open('hash.txt', 'r') as file2:
        same = set(line.strip() for line in file1)
        same = "\n".join(same)

        for line in file2:
            word = line
            if word in same:
                print(word)

2 Answers

You need to load md5_db.txt into a dictionary:

with open('md5_db.txt') as md5_db_file:
    md5_db = dict(line.strip().split(":", 1)
                  for line in md5_db_file
                  if line.strip())

And then it's easy to loop over hash.txt and print any matches:

with open('hash.txt') as hash_file:
    for line in hash_file:
        h = line.strip()
        if h in md5_db:
            print(h, md5_db[h])
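
The two steps above can also be wrapped into one function. The following is only a minimal sketch and not part of the original answer: the name find_matches is hypothetical, and io.StringIO objects stand in for md5_db.txt and hash.txt so the example runs without any files on disk. The sample data is taken from the question.

```python
import io


def find_matches(md5_db_file, hash_file):
    """Yield (hash, password) for every hash in hash_file found in md5_db_file.

    Both arguments are iterables of text lines, e.g. open file objects.
    find_matches is a hypothetical helper name used only for this sketch.
    """
    # Build the hash -> password lookup table, skipping blank lines.
    md5_db = dict(line.strip().split(":", 1)
                  for line in md5_db_file
                  if line.strip())
    # Look up each hash from the hash list in the table.
    for line in hash_file:
        h = line.strip()
        if h in md5_db:
            yield h, md5_db[h]


# In-memory stand-ins for md5_db.txt and hash.txt (sample data from the question).
sample_db = io.StringIO(
    "accfa1212a61b379ba0b009549113863:11150\n"
    "12fd5b2b866858281404434d1b9a0284:111968\n"
    "cd418b51dc28d28a239d0658cdd3bca6:111983\n"
    "e0c10f451217b93f76c2654b2b729b85:111aaa\n"
)
sample_hashes = io.StringIO(
    "cd418b51dc28d28a239d0658cdd3bca6\n"
    "e0c10f451217b93f76c2654b2b729b85\n"
)

matches = list(find_matches(sample_db, sample_hashes))
for h, password in matches:
    print(h, password)
```

With real files, the same function should accept the objects returned by open('md5_db.txt') and open('hash.txt') in place of the StringIO stand-ins.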

Correct answer by orlp on December 20, 2020

A dictionary is the best structure for lookups once the keys and values are loaded into memory.

Code:

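dct = {}
with open('md5_db.txt', 'r') as md5_file, open('hash.txt', 'r') as hash_file:
    for line in md5_file:
        # Strip only the newline; split on the first ':' so passwords may contain colons
        hash_value, passw = line.rstrip('\n').split(':', 1)
        dct[hash_value] = passw
    print(dct)
    for line in hash_file:
        hash_value = line.rstrip('\n')
        passw = dct.get(hash_value)
        # Compare against None so an empty password still counts as a match
        if passw is not None:
            print(hash_value, passw)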

Output:

{'accfa1212a61b379ba0b009549113863': '11150', '12fd5b2b866858281404434d1b9a0284': '111968', 'cd418b51dc28d28a239d0658cdd3bca6': '111983', 'e0c10f451217b93f76c2654b2b729b85': '111aaa'}
cd418b51dc28d28a239d0658cdd3bca6 111983
e0c10f451217b93f76c2654b2b729b85 111aaa

Answered by Aaj Kaal on December 20, 2020
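
Both answers above load all of md5_db.txt into memory. If that file is much larger than hash.txt, one possible alternative (not taken from either answer) is to keep only the wanted hashes in a set and read the database one line at a time. The sketch below assumes hash.txt is the smaller file; the name stream_matches is hypothetical, and io.StringIO objects again stand in for the real files.

```python
import io


def stream_matches(md5_db_file, hash_file):
    """Yield (hash, password) pairs while holding only the wanted hashes in memory.

    stream_matches is a hypothetical name used only for this sketch.
    """
    # The (assumed smaller) hash list goes into a set for fast membership tests.
    wanted = {line.strip() for line in hash_file if line.strip()}
    # The (assumed larger) database is read one line at a time.
    for line in md5_db_file:
        h, sep, password = line.rstrip("\n").partition(":")
        # sep is empty when the line contains no ':'; such lines are skipped.
        if sep and h in wanted:
            yield h, password


# In-memory stand-ins for md5_db.txt and hash.txt (sample data from the question).
db = io.StringIO(
    "accfa1212a61b379ba0b009549113863:11150\n"
    "12fd5b2b866858281404434d1b9a0284:111968\n"
    "cd418b51dc28d28a239d0658cdd3bca6:111983\n"
    "e0c10f451217b93f76c2654b2b729b85:111aaa\n"
)
hashes = io.StringIO(
    "cd418b51dc28d28a239d0658cdd3bca6\n"
    "e0c10f451217b93f76c2654b2b729b85\n"
)

results = list(stream_matches(db, hashes))
for h, password in results:
    print(h, password)
```

Two differences from the dictionary approach are worth noting: matches come out in the order they appear in md5_db.txt rather than in hash.txt, and a hash listed several times in hash.txt is reported once per matching database line, not once per listing.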
