Sed/UNIX help

scj2012 · April 15, 2014, 1:27pm

Can someone explain to me what this sed statement means?

zcat 2014-04-09.txt.gz | grep -a 'CodeOne:.*CodeTwo=101' | grep -v 'List=Pass' | sed 's/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/'

I think what is confusing me is the .* and the .*$. I'm not exactly sure how they work.

Don_Cragun · April 15, 2014, 1:43pm

Change any input line that matches:

arbitraryPass=digits,arbitraryReject=digits,arbitraryAll=letters,arbitrary

to a line of the form:

digits,digits,letters
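
To make that concrete, here is a small sketch that runs only the sed part of the pipeline on a single made-up line. The sample line and the variable names are assumptions for illustration; the real contents of 2014-04-09.txt.gz are not shown in this thread.

```shell
# Hypothetical input line (the real log format is not shown in the thread).
line='CodeOne: CodeTwo=101 Pass=12,Foo=x,Reject=3,Bar=y,All=abc,tail'

# Same substitute command as in the original pipeline.
result=$(printf '%s\n' "$line" |
  sed 's/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/')

printf '%s\n' "$result"   # prints 12,3,abc
```

Everything that is not inside a \( \) pair is matched and then thrown away; only the three captured values survive in the output.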

MadeInGermany · April 15, 2014, 1:43pm

.* in the sed RE means any number of characters (including 0 characters).
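The ^ at the beginning and the $ at the end of your RE are redundant: the leading .* already extends the match back to the start of the line, and the trailing .* already extends it to the end, so the anchors at most save or waste a few (or many, dependent on the sed implementation) CPU cycles. The .* parts themselves do matter, because they make the substitution replace the whole line instead of only the part from Pass= through the comma after the All= value.
Also, the leading .* , because it is a maximum (greedy) match, shifts the remainder of the pattern to the rightmost place in the line where it can still match.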
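
A quick way to see the greedy behaviour is to compare .* with a bracket expression that cannot run past the first delimiter. This is only a sketch; the string aXbXc and the variable names are made up for the demonstration.

```shell
# Greedy: ^.*X swallows everything up to the LAST X on the line.
greedy=$(printf '%s\n' 'aXbXc' | sed 's/^.*X//')

# [^X]* cannot cross an X, so this stops at the FIRST X.
first=$(printf '%s\n' 'aXbXc' | sed 's/^[^X]*X//')

printf '%s\n' "$greedy"   # prints c
printf '%s\n' "$first"    # prints bXc
```

If a line contained Pass= more than once, the leading .* in the original command would behave like the first case and pick the last occurrence from which the rest of the pattern still matches.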

scj2012 · April 15, 2014, 2:06pm

Thanks. That's quite helpful. Can you explain /\1,\2,\3/? It appears this means strings 1, 2 and 3?

So basically what's going on is that sed substitutes arbitraryPass=digits,arbitraryReject=digits,arbitraryAll=letters,arbitrary with digits,digits,letters?

Don_Cragun · April 15, 2014, 2:26pm

\1 in the replacement string of the sed substitute command is replaced by the set of characters matched by expression in the 1st \(expression\) in the basic regular expression used as the substitute command's search pattern, \2 is replaced by the set of characters matched by the 2nd \(expression\) , and \3 is replaced by the set of characters matched by the 3rd \(expression\) .
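
As a minimal illustration of back-references, the sketch below captures two pieces of a string and writes them back in the opposite order. The input key=value and the variable name are made up for the example.

```shell
# \1 holds what the 1st \( \) pair matched, \2 what the 2nd pair matched.
swapped=$(printf '%s\n' 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')

printf '%s\n' "$swapped"   # prints value=key
```

The numbering follows the order of the opening \( in the pattern, not the order in which \1 and \2 are used in the replacement.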

scj2012 · April 15, 2014, 2:35pm

Thanks. That makes sense. Question: would Pass, Reject and All be part of that expression? Or is it just looking for the values of Pass, Reject and All, like \([0-9]*\) and \([0-9]*\) and \([a-zA-Z]*\)?

Don_Cragun · April 15, 2014, 3:49pm

In the basic regular expression specifying the search pattern in the sed substitute command:

s/^.*Pass=\([0-9]*\),.*Reject=\([0-9]*\),.*All=\([a-zA-Z]*\),.*$/\1,\2,\3/

are Pass= , Reject= , and All= matched by any of the expressions between \( and \) pairs? No!

The BRE: abc\(123\)def will match any string that contains the characters abc123def contiguously and in that order. If that BRE is used to match a string in a sed substitute command, \1 in the replacement string will be replaced by the characters that matched the subexpression between the subexpression delimiters \( and \) (in this case the substring 123 ).
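
That BRE can be tried directly. In this sketch the surrounding xx and yy, the square brackets in the replacement, and the variable name are made up so the effect is easy to see.

```shell
# abc and def must be present for the match, but only 123 is captured in \1.
captured=$(printf '%s\n' 'xxabc123defyy' | sed 's/abc\(123\)def/[\1]/')

printf '%s\n' "$captured"   # prints xx[123]yy
```

The literal abc and def are consumed by the match and dropped, exactly as Pass= , Reject= , and All= are in the original command.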

scj2012 · April 15, 2014, 4:12pm

That explains a lot. Thanks for all your help.