Strange RegExp Behaviour

Hello,

I am trying to identify lines that contain a word matching the pattern "xyyx" (where x and y are different characters).

I tried the following grep:

egrep '(\S)([^\1])\2\1'

This pattern does catch the wanted pattern, but it also catches patterns such as "GGGG" or "CCCC". I checked it on a different regex infrastructure and it worked fine there (it caught only the wanted pattern).

Does anyone have an idea how to change the regex so that it catches only the wanted pattern?

To summarize the problem, you can check with:

echo 'GGGGGG' | egrep '(\S)([^\1])\2\1'

Thanks a lot in advance!
Eyal.

try..

egrep '(.)(.)\2\1' inputfile > outfile

Hey!

Thanks for the fast reply!

But the RegExp you wrote will also accept strings of the form 'xxxx' and 'yyyy'.
I'm looking for a pattern that matches only 'xyyx' (where x is a different character than y).

Thanks a lot again.
Eyal.

The given egrep matches strings of the form 'xyyx'. Please try it once with sample data and let us know. It works fine for me here.

example of matches:
1331
2552
5775
ryyr
gjjg

I agree it does, but it also captures things I don't want to catch, like GGGGG.

Try running:
echo 'GGGGGGGG' | egrep '(.)([^\1])\2\1'

And you'll get GGGGGGGG echoed on the screen.

I don't want it to capture anything other than strings of the form 'xyyx'.

Thanks a lot.
Eyal.

This one seems to work out.. try..

egrep -w '(.)([^\1])\2\1' inputfile > outfile
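
A note on why the patterns above misbehave: in POSIX bracket expressions a backslash is an ordinary character, so [^\1] does not mean "anything except the first group" but "any character other than a backslash or the digit 1". Also, -w only requires the match to stand alone as a word, so a line containing just GGGG would still be printed. If your grep is GNU grep built with PCRE support (the -P option, which is an assumption about your system), a negative lookahead can express "y differs from x". A minimal sketch, where find_xyyx is only an illustrative helper name:

```shell
# Sketch only: assumes GNU grep built with PCRE support (-P).
# find_xyyx is a hypothetical helper name, not part of grep.
find_xyyx() {
    # (\S)     first character x
    # (?!\1)   negative lookahead: the next character must differ from x
    # (\S)     second character y
    # \2\1     then y and x again, giving xyyx
    grep -P '(\S)(?!\1)(\S)\2\1' "$@"
}

# Sample input: only the xyyx lines should be printed.
printf '%s\n' GGGGGG 1331 ryyr CCCC | find_xyyx
```

On this sample input it prints only 1331 and ryyr. Extra arguments are passed through to grep, so find_xyyx inputfile > outfile works the same way as the earlier commands.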