Assistance required with awk and regular expressions

jimbojames · April 28, 2014, 9:03pm

Hello there,

I am trying to get my head around the section below of a script we use that incorporates AWK and Regular Expressions.

{ match($0,"The broker[^.][^.]*[.]");print $1,$2,$3 ":", substr($0, RSTART,RLENGTH)}

I have a basic understanding of how match works; what I am struggling with is the ($0,"The broker[^.][^.]*[.]") section.

What is $0, and what does [^.][^.]*[.] signify?

Any help would be greatly appreciated!

Don_Cragun · April 28, 2014, 10:24pm

This line of awk code does two things:

Look in the current input line ( $0 ) for a sentence starting with The broker, followed by one character that is not a period ( [^.] ), followed by zero or more characters that are not periods ( [^.]* ), followed by a period ( [.] ).
Print the 1st input field followed by the output field separator (aka OFS) ( $1, ), the 2nd field followed by the OFS ( $2, ), the 3rd field followed by a colon followed by the OFS ( $3 ":", ), followed by the sentence found by the match() function if a sentence was found, or an empty string if no sentence was found ( substr($0, RSTART, RLENGTH) ).
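
To see both steps in action, here is a small sketch you can paste into a shell. The two input lines are made up for illustration (they are not from your script or logs), and the variable names line, other and prog are just placeholders:

```shell
# Hypothetical input lines: fields 1-3 are a date, a time, and a host name.
line='2014-04-28 21:03:00 host1 INFO The broker on port 1414 stopped. Restart pending.'
other='2014-04-28 21:04:00 host1 INFO Nothing of interest here.'

# The awk program from the question, unchanged apart from spacing.
# match() sets RSTART (start position) and RLENGTH (length) of the match;
# when nothing matches, RSTART is 0 and RLENGTH is -1, so substr() yields "".
prog='{ match($0, "The broker[^.][^.]*[.]"); print $1, $2, $3 ":", substr($0, RSTART, RLENGTH) }'

printf '%s\n' "$line"  | awk "$prog"
printf '%s\n' "$other" | awk "$prog"
```

For the first line this should print 2014-04-28 21:03:00 host1: The broker on port 1414 stopped. (the match stops at the first period, so Restart pending. is dropped). For the second line, which has no The broker sentence, only the three fields and the colon are printed.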

jimbojames · April 28, 2014, 11:19pm

Thank you Don, that is a great help!