TransWikia.com

How to compare two different files line by line in unix?

Unix & Linux Asked by Velu on January 29, 2021

File1:

123
234
345
456

File2:

123
234
343
758

Expected output:
File3:

TRUE
TRUE
FALSE
FALSE

The code should compare the two files line by line and print ‘TRUE’ where the lines match and ‘FALSE’ where they do not, writing the result to a new file. Could anyone please provide a solution for this?

6 Answers

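I tried this with awk and it worked fine for the sample input. Note that it checks whether the first field of each line in file2 appears anywhere in file1, rather than comparing same-numbered lines: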


awk 'NR==FNR{a[$1];next}{if ($1 in a){print "TRUE"} else{print "FALSE"}}' file1 file2

output

TRUE
TRUE
FALSE
FALSE
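
The difference between a membership test and a line-by-line comparison can be illustrated with a short Python sketch. It is not part of the answer above; the function names and the reordered data set are hypothetical, chosen only to show where the two approaches diverge, and it compares whole lines rather than awk's first field.

```python
def membership_check(lines1, lines2):
    """Mimic the awk idiom: is each line of lines2 present anywhere in lines1?"""
    seen = set(lines1)
    return ["TRUE" if line in seen else "FALSE" for line in lines2]


def positional_check(lines1, lines2):
    """Compare line N of lines1 with line N of lines2."""
    return ["TRUE" if a == b else "FALSE" for a, b in zip(lines1, lines2)]


file1 = ["123", "234", "345", "456"]
file2 = ["123", "234", "343", "758"]      # sample data from the question
reordered = ["234", "123", "456", "345"]  # hypothetical: same values, different order

print(membership_check(file1, file2))      # agrees with the positional check here
print(positional_check(file1, file2))
print(membership_check(file1, reordered))  # all TRUE, although no line matches by position
print(positional_check(file1, reordered))  # all FALSE
```

For the question's sample data both checks agree, which is why the awk command appears to work; they only differ once the same values occur on different lines.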

Answered by Praveen Kumar BS on January 29, 2021

In bash, reading from each file in a while loop, comparing the read lines and printing TRUE or FALSE appropriately:

while IFS= read -r -u3 line1; IFS= read -r -u4 line2; do
    [[ $line1 == "$line2" ]] && echo TRUE || echo FALSE
done 3<file1 4<file2

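The two calls to read read from file descriptors 3 and 4 respectively. The files are attached to these descriptors by the two input redirections on the loop. Because the loop condition is the exit status of the last read, the loop ends when file2 runs out of lines.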

Answered by glenn jackman on January 29, 2021

Python 3

with open('file1') as file1, open('file2') as file2:
    for line1, line2 in zip(file1, file2):
        print(line1 == line2)

Output:

True
True
False
False

If you need TRUE and FALSE in uppercase, replace the print line with one of these:

print(str(line1 == line2).upper())
print('TRUE' if line1 == line2 else 'FALSE')
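
A possible variation, not part of the original answer: zip stops at the end of the shorter file, and the question asks for the result in a new file. The sketch below assumes that lines missing from one file should count as FALSE and that a missing final newline should not cause a mismatch; the function name compare_files is hypothetical.

```python
from itertools import zip_longest


def compare_files(path1, path2, out_path):
    """Write TRUE or FALSE to out_path for each pair of same-numbered lines."""
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        # zip_longest pads the shorter file with None, so extra lines yield FALSE
        for line1, line2 in zip_longest(f1, f2):
            if line1 is None or line2 is None:
                same = False
            else:
                # Ignore only the line terminator, so a missing final newline
                # in one file does not cause a spurious mismatch
                same = line1.rstrip("\n") == line2.rstrip("\n")
            out.write("TRUE\n" if same else "FALSE\n")
```

Calling compare_files('file1', 'file2', 'file3') would then write one TRUE or FALSE per line to file3.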

Answered by wjandrea on January 29, 2021

Assuming both files have the same number of lines:

awk '{getline f2 < "file2"; print f2 == $0 ? "TRUE" : "FALSE"}' file1

This does a numerical comparison if both strings look like numbers, and a lexical comparison otherwise. For instance, 100 and 1.0e2 would be considered identical. Change the test to f2"" == $0 to force a lexical comparison in every case.
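
The numeric-versus-lexical distinction can be sketched in Python. This is only a rough stand-in: awk's actual rules for deciding what counts as a numeric string differ in detail from Python's float(), and both function names are hypothetical.

```python
def looks_numeric(text):
    """Return True if text parses as a float (a rough stand-in for awk's rule)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def awk_like_equal(a, b):
    """Compare numerically when both strings look like numbers, else as strings."""
    if looks_numeric(a) and looks_numeric(b):
        return float(a) == float(b)
    return a == b


print(awk_like_equal("100", "1.0e2"))  # True: both parse to the same number
print("100" == "1.0e2")                # False: the strings themselves differ
print(awk_like_equal("abc", "abc"))    # True: falls back to string comparison
```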

Depending on the awk implementation, lexical comparison will be done as if by using memcmp() (byte-to-byte comparison) or as if by using strcoll() (whether the two strings sort the same in the locale's collation order). That can make a difference in some locales where the order is not properly defined for some characters, but not for input made up only of decimal digits, as in your sample.

Answered by Stéphane Chazelas on January 29, 2021

Assuming the files contain no tab-characters:

$ paste file1 file2 | awk -F '\t' '{ print ($1 == $2 ? "TRUE" : "FALSE") }'
TRUE
TRUE
FALSE
FALSE

This uses paste to create two tab-delimited columns, with the contents of one file in each column. The awk command compares the two columns on each line and prints TRUE if they are the same and FALSE otherwise.
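
Why the no-tab assumption matters can be shown with a simplified Python model of the pipeline. This is an illustrative sketch only: paste_and_compare is a hypothetical name, and the lines containing tabs are invented for the example.

```python
def paste_and_compare(lines1, lines2):
    """Join same-numbered lines with a tab, then split and compare the first two fields."""
    results = []
    for a, b in zip(lines1, lines2):
        pasted = a + "\t" + b        # what paste would emit for this pair
        fields = pasted.split("\t")  # the fields awk would see with a tab separator
        results.append("TRUE" if fields[0] == fields[1] else "FALSE")
    return results


print(paste_and_compare(["123", "345"], ["123", "343"]))  # ['TRUE', 'FALSE']
# Hypothetical lines containing a tab: identical lines are reported as different,
# because the tab inside the first line shifts the field boundaries.
print(paste_and_compare(["a\tb"], ["a\tb"]))              # ['FALSE']
```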

Answered by Kusalananda on January 29, 2021

Use the diff command as follows, in bash or any other shell that supports <(...) process substitution (in shells that lack it, process substitution can be emulated):

diff --new-line-format='FALSE'$'\n' \
     --old-line-format='' \
     --unchanged-line-format='TRUE'$'\n' \
     <(nl file1) <(nl file2)

Output would be:

TRUE
TRUE
FALSE
FALSE

--new-line-format='FALSE'$'\n' prints FALSE for lines that differ, and --old-line-format='' suppresses the output for the differing line from file1, which diff treats as the old file. (The two could be swapped: one of them should print FALSE and the other should be disabled.)

--unchanged-line-format='TRUE'$'\n' prints TRUE for lines that are the same. The $'\n' C-style escape syntax is used to print a newline after each output line.

Answered by αғsнιη on January 29, 2021
