Using grep to select specific patterns in text file?

How do I use grep to select lines that start with I or O, end in box, and contain at least one letter in between?

The text file mailinfo.txt contains:

Inbox
the Inbox
Is a match box
Doesn't match
INBOX
Outbox
Outbox1
InbOX
Ibox
I box

If the command works correctly, it should select:

Inbox
Is a match box
Outbox

Here's what I've been trying:

grep -a  '[I|O]''[a-z].\+''box\>' mailinfo.txt 

And I only get:

Is a match box
Outbox

Why doesn't it retrieve "Inbox" as well? And how do I make it do that?

Thanks

With sed:

sed -n '/^[IO].*[a-zA-Z].*box$/p' mailinfo.txt
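
On the "why" part of the question: the shell joins the three quoted pieces into a single pattern, [I|O][a-z].\+box\>. Inside brackets the | is a literal character, so [I|O] matches I, | or O. More importantly, [a-z] consumes one letter and .\+ then demands at least one more character before box, so two characters are needed in between, and Inbox has only the n. Here is a minimal sketch of that, assuming GNU grep (\+ and \> are GNU extensions to basic regular expressions). The matches helper and the variable names are made up for illustration only.

```shell
# Hypothetical helper: prints "yes" if text $2 matches pattern $1, else "no".
matches() {
    if printf '%s\n' "$2" | grep -q -- "$1"; then echo yes; else echo no; fi
}

# The original pattern, as grep sees it after the shell joins the quoted pieces.
original='[I|O][a-z].\+box\>'
# The pattern from the sed answer above: anchored, no stray |, and the single
# letter is the only mandatory character between I/O and box.
fixed='^[IO].*[a-zA-Z].*box$'

inbox_original=$(matches "$original" 'Inbox')    # only "n" between I and box
outbox_original=$(matches "$original" 'Outbox')  # "u" and "t" satisfy [a-z].\+
pipe_original=$(matches "$original" '|xybox')    # the literal | is accepted too
inbox_fixed=$(matches "$fixed" 'Inbox')
ibox_fixed=$(matches "$fixed" 'Ibox')            # nothing between I and box

echo "original: Inbox=$inbox_original Outbox=$outbox_original |xybox=$pipe_original"
echo "fixed:    Inbox=$inbox_fixed Ibox=$ibox_fixed"
```

With GNU grep this should report "no" for Inbox under the original pattern and "yes" under the anchored one, while Ibox is still rejected.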

Hello steezuschrist96,

Welcome to the forums. I hope you will enjoy learning and sharing knowledge here with us. Could you please try the following and let us know how it goes?

awk '($0 ~ /^[IO]/ && $0 ~ /box$/ && $0 ~ /[IO].*[a-zA-Z].*box/)' Input_file
Or, more concisely:
awk '($0 ~ /^[IO].*[a-zA-Z].*box$/)' Input_file

Output will be as follows.

Inbox
Is a match box
Outbox

EDIT: You could use plain grep too; I tested this with GNU grep.

grep "^[IO].*[a-zA-Z].*box$"   Input_file
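
For anyone who wants to check the suggestions side by side, here is a rough sketch that rebuilds the sample file from the question in a scratch directory and runs the same regular expression through sed, awk and grep. The workdir and *_out variable names are just placeholders, and passing the pattern to awk with -v is my own shortcut rather than what was posted above.

```shell
# Recreate the sample file from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/mailinfo.txt" <<'EOF'
Inbox
the Inbox
Is a match box
Doesn't match
INBOX
Outbox
Outbox1
InbOX
Ibox
I box
EOF

# Same regular expression for all three tools.
pattern='^[IO].*[a-zA-Z].*box$'
sed_out=$(sed -n "/$pattern/p" "$workdir/mailinfo.txt")
awk_out=$(awk -v re="$pattern" '$0 ~ re' "$workdir/mailinfo.txt")
grep_out=$(grep "$pattern" "$workdir/mailinfo.txt")

printf '%s\n' "$grep_out"
[ "$sed_out" = "$grep_out" ] && [ "$awk_out" = "$grep_out" ] && echo "all three agree"

rm -r "$workdir"
```

Because the pattern is anchored with ^ and $, "the Inbox" and "Outbox1" are left out, which is what the question asked for.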

Thanks,
R. Singh

Yes! That worked, thank you!

Note: it is best to use character classes rather than character ranges. A class such as [[:alpha:]] matches every letter in the current locale, not just the 2x26 unaccented ASCII letters. It also avoids issues with collation order (for example, in some locales [A-Z] might include lowercase letters):

grep '^[IO].*[[:alpha:]].*box$'

For example:

$ echo 'Iëbox' | grep '^[IO].*[a-zA-Z].*box$'
$ echo 'Iëbox' | grep '^[IO].*[[:alpha:]].*box$'
Iëbox
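
How this behaves depends on the active locale, so here is a hedged sketch for trying it on your own system. It assumes GNU grep and uses ë purely as a sample non-ASCII letter; the count_matches helper and the variable names are invented for the example. It picks whichever UTF-8 locale `locale -a` lists first and falls back to C if there is none.

```shell
# UTF-8 bytes for "Iëbox", written in octal so the script itself stays ASCII.
sample=$(printf 'I\303\253box')

# Use the first UTF-8 locale that `locale -a` reports; fall back to C if none.
utf8_locale=$(locale -a 2>/dev/null | grep -i -E -m1 'utf-?8' || true)
utf8_locale=${utf8_locale:-C}

# Hypothetical helper: count matching lines for pattern $2 under locale $1.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_matches() {
    printf '%s\n' "$sample" | LC_ALL="$1" grep -c -- "$2" || true
}

range='^[IO].*[a-zA-Z].*box$'
class='^[IO].*[[:alpha:]].*box$'

range_c=$(count_matches C "$range")
class_c=$(count_matches C "$class")
range_utf8=$(count_matches "$utf8_locale" "$range")
class_utf8=$(count_matches "$utf8_locale" "$class")

echo "C locale:            range=$range_c class=$class_c"
echo "$utf8_locale locale: range=$range_utf8 class=$class_utf8"
```

In the C locale only ASCII letters count as alphabetic, so neither pattern should match there. Under a UTF-8 locale the [[:alpha:]] version should pick up the line, while what the [a-zA-Z] range does can vary with the locale's collation rules.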