Sed/UNIX help

Can someone explain to me what this sed statement means?

zcat 2014-04-09.txt.gz | grep -a 'CodeOne:.*CodeTwo=101' | grep -v 'List=Pass' | sed 's/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/'

I think what is confusing me is the .* and the *$. I am not exactly sure how they work.

Change any input line that matches:

arbitraryPass=digits,arbitraryReject=digits,arbitraryAll=letters,arbitrary

to a line of the form:

digits,digits,letters
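
To see that on something concrete, here is a small sketch. The two sample lines are made up for illustration (the real log format is not shown in this thread); only the sed command is taken from the question.

```shell
# Hypothetical input lines; the real log format is an assumption.
match='id=7, Pass=12, Reject=3, All=abc, Other=1'
nomatch='id=8, nothing to see here'

# The substitute command from the question, kept in a variable for reuse.
cmd='s/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/'

out_match=$(printf '%s\n' "$match" | sed "$cmd")
out_nomatch=$(printf '%s\n' "$nomatch" | sed "$cmd")

printf '%s\n' "$out_match"     # 12,3,abc
printf '%s\n' "$out_nomatch"   # printed unchanged: id=8, nothing to see here
```

A line that does not match the pattern is passed through untouched, because sed prints every input line by default.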

.* in the sed RE means any number of characters (including 0 characters).
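The ^ and $ anchors in your RE are redundant: a leading .* already starts at the beginning of the line, and a trailing greedy .* already runs to the end, so the anchors only cost a few (or many, dependent on the sed implementation) CPU cycles. The leading and trailing .* themselves do matter in a substitute command: they make the match cover the whole line, so everything outside the three \( \) groups is thrown away.
Also, the leading .* , because it is a maximum (greedy) match, shifts the remainder of the pattern to the rightmost place where it can still match.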
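
A quick way to check how the leading .* behaves is to feed sed a line with two Pass= fields, and then to try the same pattern with the outer .* parts dropped. The input line here is invented for the test, not taken from the real file.

```shell
# Hypothetical line with two Pass= fields.
line='Pass=1, Pass=22, Reject=3, All=abc, end'

# The leading .* is greedy, so the first group captures the digits after the
# LAST Pass= that still lets the rest of the pattern match.
greedy=$(printf '%s\n' "$line" |
  sed 's/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/')

# With the outer .* parts dropped, the match starts at the first Pass= and
# stops right after "abc,", so the unmatched tail " end" stays in the output.
bare=$(printf '%s\n' "$line" |
  sed 's/Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),/\1,\2,\3/')

printf '%s\n' "$greedy"   # 22,3,abc
printf '%s\n' "$bare"     # 1,3,abc end
```

So with this input the two commands do not give the same result: the outer .* parts change both which Pass= is picked and how much of the line gets replaced.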

Thanks. That's quite helpful. Can you explain /\1,\2,\3/? It appears this means strings 1, 2 and 3?

So basically what's going on is that sed substitutes arbitraryPass=digits,arbitraryReject=digits,arbitraryAll=letters,arbitrary with digits,digits,letters?

\1 in the replacement string in the sed substitute command is replaced by the set of characters matched by expression in the 1st \(expression\) in the basic regular expression used in the sed substitution command's search pattern, \2 is replaced by the set of characters matched by the 2nd \(expression\) , and \3 is replaced by the set of characters matched by the 3rd \(expression\) .
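
One way to convince yourself of this is to keep the search pattern from the question and only reorder the back-references in the replacement. The sample line and the : separator are assumptions picked for the demonstration.

```shell
# Hypothetical input line; only the search pattern comes from the question.
line='id=7, Pass=12, Reject=3, All=abc, Other=1'

# Same three groups, but emitted in the order 3rd, 1st, 2nd.
swapped=$(printf '%s\n' "$line" |
  sed 's/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\3:\1:\2/')

printf '%s\n' "$swapped"   # abc:12:3
```

The numbers refer to the position of the \( \) pair in the pattern, not to the order in which they appear in the replacement.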

Thanks. That makes sense. One question: would Pass, Reject and All be part of that expression? Or is it just looking for the values of Pass, Reject and All, like \([0-9]*\) and \([0-9]*\) and \([a-zA-Z]*\)?

In the basic regular expression specifying the search pattern in the sed substitute command:

s/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/

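are Pass= , Reject= , and All= matched by any of the expressions between \( and \) pairs? No! They are literal text that has to be present in the line for the pattern to match, but they sit outside the \( and \) pairs, so they are not part of \1 , \2 , or \3 .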

The BRE: abc\(123\)def will match any string that contains the characters abc123def contiguously and in that order. If that BRE is used to match a string in a sed substitute command, \1 in the replacement string will be replaced by the characters that matched the subexpression between the subexpression delimiters \( and \) (in this case the substring 123 ).
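
Here is a small sketch of that, plus a contrast showing what changes when the literal text is moved inside the group. The input strings and the [ ] markers in the replacement are made up for the example.

```shell
# abc and def must be present for a match, but only 123 lands in \1.
inside=$(printf '%s\n' 'xxabc123defyy' | sed 's/abc\(123\)def/[\1]/')

# No abc123def in this line, so nothing is substituted.
miss=$(printf '%s\n' 'xxabc124defyy' | sed 's/abc\(123\)def/[\1]/')

# Group around the digits only versus group around "Pass=" and the digits.
narrow=$(printf '%s\n' 'id=7, Pass=12, Reject=3' | sed 's/^.*Pass=\([0-9]*\),.*$/\1/')
wide=$(printf '%s\n' 'id=7, Pass=12, Reject=3' | sed 's/^.*\(Pass=[0-9]*\),.*$/\1/')

printf '%s\n' "$inside"   # xx[123]yy
printf '%s\n' "$miss"     # xxabc124defyy
printf '%s\n' "$narrow"   # 12
printf '%s\n' "$wide"     # Pass=12
```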

That explains a lot. Thanks for all your help.