Narrowing sed Results in While Loop

find $SRC -type f -name *.emlx |
while read FILE
do
    if :
    then
        sed -n '/From/p' $FILE
    fi
done > $DEST-output.txt

The loop above spits out a .txt file with several lines that look like this:

From: John Smith <jsmith@company.com>

How can I narrow that sed result to spit out the email address only? Maybe match the "From:" line but only keep the data between the <> symbols that contains an @. I'm feeding these results into an ldapsearch query, which is why I need the email address only.

You may also try the following to get just the email address, if your file contains only the email information:

$ echo "From: John Smith <jsmith@company.com>" | sed 's/.*<\|>//g'
jsmith@company.com
$ echo "From: John Smith <jsmith@company.com>" | grep -Po '(?<=<).*(?=>)'
jsmith@company.com
$ echo "From: John Smith <jsmith@company.com>" | awk 'gsub(/.*<|>.*/,x)'
jsmith@company.com

--edit--

If you are only interested in lines starting with the string From, then the following might be helpful:

$ awk '/^From:/ && gsub(/.*<|>.*/,x)' file
$ sed -n '/^From/ s/.*<\|>//gp' file
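
The \| alternation in sed and the -P option of grep used above are GNU extensions, and .emlx files usually come from a Mac, where the stock sed and grep may not support them. A minimal sketch of a variant that sticks to POSIX basic regular expressions (the function name extract_from_addr is made up for illustration):

```shell
# extract_from_addr: print the text between < and > on each From: header line,
# but only when that text contains an @.
# Only POSIX basic regular expression syntax is used (no \| and no -P).
extract_from_addr() {
    sed -n 's/^From:.*<\([^<>]*@[^<>]*\)>.*/\1/p' "$@"
}

# Reads stdin when no file names are given; the Subject: line is ignored.
printf '%s\n' 'From: John Smith <jsmith@company.com>' 'Subject: hello' |
    extract_from_addr
```

Lines without an address in angle brackets print nothing with this version, so bare addresses need one of the other approaches in this thread.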

Something like this should work for both the angle-bracket style (name surname <emailaddr>) and the bare email address style (emailaddr):

awk '/From:/{gsub(/[<>]/,x,$NF); print $NF}'

It is best to quote the wildcard name specification to avoid unwanted expansion. Also, you could probably use the -exec clause instead of a while loop, then you could also use the + operator for more efficient operation, e.g.:

find "$SRC" -type f -name '*.emlx' -exec awk '/From:/{gsub(/[<>]/,x,$NF); print $NF}' {} + > "$DEST-output.txt"
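
To try this out without touching real mail, here is a small sketch that builds a throwaway directory with two made-up messages and runs the same kind of find command over it. The file contents and addresses are invented for the test, the awk pattern is anchored with ^ (an addition, so that body lines merely containing "From:" are skipped), and "" is used in place of the unset variable x:

```shell
# Build a temporary directory with two sample messages (hypothetical content).
SRC=$(mktemp -d)
printf 'From: John Smith <jsmith@company.com>\nSubject: a\n' > "$SRC/1.emlx"
printf 'From: jdoe@company.com\nSubject: b\n' > "$SRC/2.emlx"

# The quoted '*.emlx' is passed to find untouched by the shell.
# With "+", find hands many file names to a single awk process.
# awk strips < and > from the last field of each From: line and prints it.
result=$(find "$SRC" -type f -name '*.emlx' \
    -exec awk '/^From:/ { gsub(/[<>]/, "", $NF); print $NF }' {} + | sort)

printf '%s\n' "$result"

# Clean up the temporary directory.
rm -rf "$SRC"
```

The sort is only there to make the order predictable, since find does not promise one.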

There are also mails with "From:" lines ending in <br/>, and HTML headers that replace < with &lt; and > with &gt;; some even put the username in parentheses AFTER the email address. Try this to capture all of those as well:

sed -n 's/&lt;/</;s/&gt;/>/;s/ *[(>].*$//;s/^From:.*[< ]//p' file
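
As a quick check of that idea, here is a sketch that wraps the substitutions in a function and runs them over three made-up header lines, one per case described above. The function name clean_from and the example.com addresses are invented, and the entity patterns are written out as &lt; and &gt;:

```shell
# clean_from: reduce each From: header line to the bare address.
#   1. turn the first &lt; / &gt; entity back into < / >
#   2. cut everything from the first "(" or ">" (plus preceding spaces) to end of line
#   3. drop "From:" and everything up to the last "<" or space, then print
clean_from() {
    sed -n 's/&lt;/</;s/&gt;/>/;s/ *[(>].*$//;s/^From:.*[< ]//p' "$@"
}

printf '%s\n' \
    'From: root@example.com (root)      ' \
    'From: username <username@example.com><br/>' \
    'From: username &lt;username@example.com&gt;&lt;br/&gt;' |
    clean_from
```

Each of the three lines should come out as just the address, with the "(root)" comment, the <br/> tag and the trailing blanks removed.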

Hello,

The following may help too.

1st:

echo "From: John Smith <jsmith@company.com>" | awk -F"[<>]" '{print $2}'
jsmith@company.com

2nd:

echo "From: John Smith <jsmith@company.com>" | sed 's/\(.*: \)\(.*<\)\(.*\)\(>.*\)/\3/g'
jsmith@company.com

Thanks,
R. Singh


Well, how about these mail file lines:

From: root@xxxx.com (root)      
From: username <username@xxxx.com><br/>            