Hello All,
A colleague asked me about a complex regex, so I wrote one using grep's -P option (PCRE). Since this was new learning for me, I thought I would share it with the forum.
Let's say we have an Input_file with the following test data:
cat Input_file
PROJECT = 1.1.1.1
Project = 1.1.1.1.1.1.1.1
PROJECT = "1.1.1.1.1"
ProJEct = '1.1'
The conditions are: the keyword project is fixed but can appear in any case, and the version on the right-hand side is what we need as output. Apart from the first (major) component, every component of the version may contain letters as well as digits.
So I have come up with:
grep -ioP 'project\D+\K(\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+)' Input_file
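To try this out, here is a small reproducible sketch. It recreates the test data in a scratch directory and runs the command above. The mktemp directory and the variable names are my own additions for illustration, and it assumes GNU grep built with PCRE support, since -P is not available in every grep.

```shell
# Work in a throwaway directory so no existing file is overwritten (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample data; the quoted 'EOF' keeps the quotes in the data literal.
cat > Input_file <<'EOF'
PROJECT = 1.1.1.1
Project = 1.1.1.1.1.1.1.1
PROJECT = "1.1.1.1.1"
ProJEct = '1.1'
EOF

# Run the command from the post and keep its output for inspection.
versions=$(grep -ioP 'project\D+\K(\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+)' Input_file)
printf '%s\n' "$versions"
```

With the data above this should print the four bare versions, one per line: 1.1.1.1, 1.1.1.1.1.1.1.1, 1.1.1.1.1 and 1.1.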
Explanation of the above code:
-i
: makes grep ignore case, so the keyword project matches however it is capitalised.
-o
: prints only the matched part of each line instead of the whole line.
-P
: enables PCRE (Perl-compatible regular expressions) in grep, which supports constructs such as \d, \D and \K used below.
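A quick way to see what -i and -o each contribute is to run a cut-down pattern on one sample line with and without them. This is only a sketch: the shortened pattern and the variable names are illustrative, and it again assumes GNU grep with -P support.

```shell
line="ProJEct = '1.1'"

# Without -i the lowercase pattern does not match the mixed-case keyword.
# grep exits non-zero when nothing matches, hence the "|| true".
no_i=$(printf '%s\n' "$line" | grep -oP 'project\D+\d+' || true)

# With -i but without -o, grep prints the whole matching line.
no_o=$(printf '%s\n' "$line" | grep -iP 'project\D+\d+')

# With -i and -o together, only the matched text is printed.
with_o=$(printf '%s\n' "$line" | grep -ioP 'project\D+\d+')

printf 'no -i: [%s]\nno -o: [%s]\n-i -o: [%s]\n' "$no_i" "$no_o" "$with_o"
```

The last result still contains the keyword and the separator, which is the part that \K is used to drop.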
Now coming to main code part:
project\D+
: Matches the string project (in any case, thanks to -i) followed by one or more non-digit characters (\D matches any non-digit).
\K
: Discards everything matched so far from the reported match. This is a GREAT feature of PCRE and I LOVED it
(\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+)
: Three alternatives, tried in order: digits followed by two or more dot-separated components of digits and letters, digits followed by exactly one such component, or digits alone. Together they cover all the cases in the test data.
Since only what follows \K (this parenthesised group) counts as the match, grep prints just the version part of each line.
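To see what \K changes, and as one possible tidy-up, the sketch below compares a few runs. The sample lines and the compact pattern \d+(\.[\da-z]+)* are assumptions added for illustration; on the test data above the compact pattern should give the same versions. One detail worth knowing: inside a bracket expression a comma is a literal character, so [\d,a-z,A-Z] also accepts commas. The compact class drops them and relies on -i for upper case.

```shell
# Without \K the keyword and separator stay in the output.
without_k=$(printf 'PROJECT = 1.2b.3\n' | grep -ioP 'project\D+\d+(\.[\da-z]+)*')

# With \K everything matched before it is dropped from the reported match.
with_k=$(printf 'PROJECT = 1.2b.3\n' | grep -ioP 'project\D+\K\d+(\.[\da-z]+)*')

# The original class [\d,a-z,A-Z] treats ',' as a literal, so it runs past a comma.
orig_comma=$(printf 'project = 1.2,3\n' | grep -ioP 'project\D+\K(\d+\.([\d,a-z,A-Z]+\.){1,}[\d,a-z,A-Z]+|\d+\.[\d,a-z,A-Z]+|\d+)')

# The compact class [\da-z] stops at the comma.
compact_comma=$(printf 'project = 1.2,3\n' | grep -ioP 'project\D+\K\d+(\.[\da-z]+)*')

printf '%s\n%s\n%s\n%s\n' "$without_k" "$with_k" "$orig_comma" "$compact_comma"
```

If the compact form suits the data, the full command would become grep -ioP 'project\D+\K\d+(\.[\da-z]+)*' Input_file.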
I am still learning PCRE regex, so any suggestions or improvements are very welcome.
Cheers and Happy learning.
Thanks,
R. Singh