Using Regex

I am writing a script to check and display only the valid mail addresses from a file.

echo "Plz enter the Target file name with path"
read -r path
if [ -f "$path" ]
then
    echo "The valid mail address are:"
    # -o prints only the matched part of each line, not the whole line
    email=$(grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" "$path")
    echo "$email"
fi

The file contains data like this:

hmmeeranrizvi18@gmail.com
dfeugfbfveuifg@gmail,com
Mohamed@msystechnologies.com
12345@gmail.com
raja@jj@gmail.com
krish@yahoo.commm
jack34@97899.in

When I execute this script, it shows output like this:

rizvi@rizvi-VirtualBox:~$ ./unt.sh
Plz enter the Target file name with path
/home/rizvi/unwant.sh
The valid mail address are:
hmmeeranrizvi18@gmail.com
Mohamed@msystechnologies.com
12345@gmail.com
jj@gmail.com
krish@yahoo.commm
jack34@97899.in

Why are the last 3 email IDs treated as valid?
Did I make a mistake in my script? Please help.

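Why wouldn't they be valid? They all match your regex (1+ chars, @, 1+ chars, dot, 2-6 letters). Note that grep -o prints only the part of the line that matches, which is why raja@jj@gmail.com shows up as jj@gmail.com.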
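
To see the partial match in isolation, here is a small sketch comparing -o (print the matching part) with -x (the whole line must match). It assumes GNU grep, since \b and -o are GNU extensions, and the variable names are only for illustration:

```shell
# Same pattern as in the original script.
pattern='\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b'

# -o prints whatever part of the line matches the pattern.
partial=$(printf '%s\n' 'raja@jj@gmail.com' | grep -E -o "$pattern")

# -x only accepts a line if the pattern matches all of it.
# grep exits non-zero when nothing matches, hence the "|| true".
whole=$(printf '%s\n' 'raja@jj@gmail.com' | grep -E -x "$pattern" || true)

echo "partial match:    $partial"          # jj@gmail.com
echo "whole-line match: ${whole:-<none>}"  # <none>
```

So the pattern itself never accepts two @ signs; it simply finds a shorter address inside the line.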

Is raja@jj@gmail.com a valid email ID?

I don't think so.

Hi.

The syntax for email addresses is complicated. Here's a short Perl script that tries to catch most correct addresses:

#!/usr/bin/env perl

# @(#) p1       Demonstrate recognizing email addresses with Regexp::Common.

use strict;
use warnings;

use Regexp::Common qw[Email::Address];
use Email::Address;

my (@a) = <>;

print " Addresses being examined:\n";
print @a;

print "\n";
print " Acceptable email address string anywhere on line:\n";
for (@a) {
  my (@found) = /($RE{Email}{Address})/g;
  my (@addrs) = map $_->address, Email::Address->parse("@found");
  print "X-Addresses: ", join( ", ", @addrs ), "\n";
}

print "\n";
print " Entire line must match acceptable email address:\n";
for (@a) {
  my (@found) = /^($RE{Email}{Address})$/g;
  my (@addrs) = map $_->address, Email::Address->parse("@found");
  print "X-Addresses: ", join( ", ", @addrs ), "\n";
}

print "\n";
print "See document at https://en.wikipedia.org/wiki/Email_address#Syntax\n";

exit(0);

producing:

$ ./p1 data1
 Addresses being examined:
hmmeeranrizvi18@gmail.com
dfeugfbfveuifg@gmail,com
Mohamed@msystechnologies.com
12345@gmail.com
raja@jj@gmail.com
krish@yahoo.commm
jack34@97899.in

 Acceptable email address string anywhere on line:
X-Addresses: hmmeeranrizvi18@gmail.com
X-Addresses: dfeugfbfveuifg@gmail
X-Addresses: Mohamed@msystechnologies.com
X-Addresses: 12345@gmail.com
X-Addresses: raja@jj
X-Addresses: krish@yahoo.commm
X-Addresses: jack34@97899.in

 Entire line must match acceptable email address:
X-Addresses: hmmeeranrizvi18@gmail.com
X-Addresses: 
X-Addresses: Mohamed@msystechnologies.com
X-Addresses: 12345@gmail.com
X-Addresses: 
X-Addresses: krish@yahoo.commm
X-Addresses: jack34@97899.in

See document at https://en.wikipedia.org/wiki/Email_address#Syntax

Best wishes ... cheers, drl

Thanks for the info drl

---------- Post updated at 12:29 PM ---------- Previous update was at 12:25 PM ----------

I just modified my script, and it now works as expected:

email=$(grep -E -o "^[A-Za-z0-9._]+@[a-z]+\.[a-z]{1,3}$" "$path")

It shows me output like this:

Plz enter the Target file name with path
/home/rizvi/unwant.sh
The valid mail address are:
hmmeeranrizvi18@gmail.com
Mohamed@msystechnologies.com
12345@gmail.com

Valid hostnames start with a letter, can end with a digit, and can have embedded hyphens. They can also contain several dots.
Example:
abc@m-7.test.com

grep -E -o '^[^@]+@([a-z][-a-z0-9]*[a-z0-9]+\.)+[a-z]{2,3}$'
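
As a quick check, this sketch runs that pattern over the sample data from this thread plus the abc@m-7.test.com example. The variable names are only illustrative. -o is left out because the ^ and $ anchors already force a whole-line match:

```shell
# Pattern from the post above: letter-first host labels, 2-3 letter TLD.
pattern='^[^@]+@([a-z][-a-z0-9]*[a-z0-9]+\.)+[a-z]{2,3}$'

# Feed the thread's sample lines (plus the hyphenated host) to grep.
valid=$(printf '%s\n' \
    'hmmeeranrizvi18@gmail.com' \
    'dfeugfbfveuifg@gmail,com' \
    'Mohamed@msystechnologies.com' \
    '12345@gmail.com' \
    'raja@jj@gmail.com' \
    'krish@yahoo.commm' \
    'jack34@97899.in' \
    'abc@m-7.test.com' | grep -E "$pattern")

echo "$valid"
# hmmeeranrizvi18@gmail.com
# Mohamed@msystechnologies.com
# 12345@gmail.com
# abc@m-7.test.com
```

Keep in mind that [^@]+ accepts any character except @ in the local part, including spaces, so this is still a loose check and not a full address validator.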