TransWikia.com

In awk, how do I prepend a string to a pattern search print

Unix & Linux Asked by Nia on December 19, 2021

Sorry for the easy question, I can’t seem to find the answer. We have a file that contains a lot of emails.

<To: [email protected]>
<Cc: [email protected]>
<From: [email protected]>

hello person here is random text

etc., repeat with 20+ emails

The teacher wants the students to use awk to print the lines containing emails, prepending "email = " to the beginning of each. His hint was to use To:, From: and Cc: as search patterns. His ideal output would be

email = [email protected]
email = [email protected]
email = [email protected]

but he will also take

email = <To: [email protected]>
email = <Cc: [email protected]>
email = <From: [email protected]>

I can get it to print the latter example (I’m clueless how to remove the garbage characters from each example, they are not consistent from one email to another) but how do I prepend email = ?

So far I have

awk '/To:|From:|Cc:/ {print}' file

I’ve also tried a bunch of other weird things that have only given syntax errors

Thank you!

3 Answers

If the object is to learn awk, then you could try out some of its other functionality. This relies a bit on the formatting of your file, but then so do the other answers:

awk -F'>|<| ' -v OFS=" = " '/To:|From:|Cc:/{print "E-mail", $(NF-1)}' file

E-mail = [email protected]
E-mail = [email protected]
E-mail = [email protected]

First you can set up your input field separator -F as an extended regular expression (an ERE), so that it breaks your input line $0 on < or > or a space

-F'>|<| ' 

Doing this cleans the extra character from the end of the email address but leaves $1 empty (because of the leading <). $2 contains your filtering field, $3 the email address, and there is an empty $4 at the end (because of the trailing >).
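To see that split for yourself, a small sketch like the one below prints NF and every field with its index. The function name show_fields and the address alice@example.com are made up for illustration; they are not part of the original file.

```shell
# Hypothetical helper: print NF and each field of every input line,
# using the same field separator as above (<, > or a single space).
show_fields() {
    awk -F'>|<| ' '{
        printf "NF=%d\n", NF
        for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i
    }'
}

# Sample line with a made-up address
printf '<To: alice@example.com>\n' | show_fields
```

For that sample line this should report NF=4, with $1 and $4 empty, $2 holding To: and $3 holding the address.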

Then, being lazy, you can set the output field separator OFS to " = ", so whenever you print a series of comma-separated items awk inserts " = " between them and not just the default single space

-v OFS=" = " 

Your filtering is the same on $0, but because the input line was split into 4 fields (so NF, the special awk variable holding the number of fields, is 4), the field you want to print is $3, which is also $(NF-1) in this case

'/To:|From:|Cc:/{print "E-mail", $(NF-1)}' file

And because you have pulled your filter field out expressly then you can guard against an accidental match elsewhere in $0 by looking for an explicit match in $2 if you want.

'$2=="To:"||$2=="From:"||$2=="Cc:"{......
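As a rough sketch of how that stricter filter might look as a complete command, the example below reuses the print action from earlier in this answer. The function name strict_emails and the sample lines are assumptions made for the demonstration.

```shell
# Hypothetical wrapper around the stricter filter: compare $2 exactly
# instead of searching the whole line for To:, From: or Cc:.
strict_emails() {
    awk -F'>|<| ' -v OFS=" = " '
        $2 == "To:" || $2 == "From:" || $2 == "Cc:" { print "E-mail", $(NF-1) }
    '
}

# The second line contains "To:" in running text, so /To:/ would match it,
# but its $2 is "reply" and the exact comparison skips it.
printf '%s\n' '<To: alice@example.com>' 'Please reply To: everyone' | strict_emails
```

With these two sample lines only the first should be printed, as E-mail = alice@example.com.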

Answered by bu5hman on December 19, 2021

Command

awk -F ":" '/To|Cc|From/{gsub(">","",$2);print "email ="$2}' filename

output

email = [email protected]
email = [email protected]
email = [email protected]
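The command above comes without an explanation, so here is a hedged sketch of what seems to be going on: splitting on ":" leaves the space after the colon at the start of $2, which is why "email =" with no trailing space still produces a single space before the address. The helper name show_colon_split and the address are made up.

```shell
# Hypothetical helper: show $2 before and after gsub() when splitting on ":"
show_colon_split() {
    awk -F ":" '/To|Cc|From/ {
        printf "before=[%s]\n", $2
        gsub(">", "", $2)
        printf "after=[%s]\n", $2
    }'
}

# Sample line with a made-up address
printf '<Cc: bob@example.com>\n' | show_colon_split
```

The brackets make the leading space visible: before=[ bob@example.com>] and after=[ bob@example.com].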

Python

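#!/usr/bin/env python3
import re

# Lines of interest mention To, Cc or From
k = re.compile(r'To|Cc|From')

with open('filename', 'r') as q:
    for i in q:
        if k.search(i):
            # Text after the first colon, minus ">"; the leading space is kept
            w = i.split(":")[1].replace(">", "")
            # strip() removes the trailing newline left over from the input line
            print("email = {0}".format(w).strip())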

output

email =  [email protected]
email =  [email protected]
email =  [email protected]

Answered by Praveen Kumar BS on December 19, 2021

It looks like the following would do what you need:

awk '/To:|From:|Cc:/ {$2=substr($2,1,length($2)-1); print "email = ", $2 }' file

This bit:

$2=substr($2,1,length($2)-1)

Takes the second whitespace-separated field of the line, which is something like [email protected]>, and removes its last character, >.

Then you just print the string you want "email = ", along with the edited second element, $2:

print "email = ", $2

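The result looks like this (there are two spaces after the = because the comma in print adds the output field separator, a space by default, after a string that already ends in a space):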

email =  [email protected]
email =  [email protected]
email =  [email protected]

It's worth familiarizing yourself with how print works: https://www.gnu.org/software/gawk/manual/html_node/Print-Examples.html
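As a small illustration of that point, the sketch below runs the same action twice, once with a comma in print and once with plain string concatenation. The sample line and the variable names are made up for the comparison.

```shell
# Hypothetical sample line; the address is made up
line='<To: alice@example.com>'

# Comma: awk puts OFS (a space by default) between the two items,
# so the output has two spaces after "="
with_comma=$(printf '%s\n' "$line" |
    awk '/To:|From:|Cc:/ { $2 = substr($2, 1, length($2) - 1); print "email = ", $2 }')

# No comma: the two strings are concatenated, giving a single space
concatenated=$(printf '%s\n' "$line" |
    awk '/To:|From:|Cc:/ { $2 = substr($2, 1, length($2) - 1); print "email = " $2 }')

printf '[%s]\n[%s]\n' "$with_comma" "$concatenated"
```

The first bracketed line should show two spaces after the = and the second only one, which is the form the question asked for.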

Answered by mmaluff on December 19, 2021
