Scratching my head over an old post

Hello!

I have been pondering over an old post and need some help.

Here is the thread:

Pull Intermediate Strings

When I run this,

echo " process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database" | awk '$0=$2' FS=\( RS=\)

I get the following output

130 
ESN:27723211621B01DJ68AG 

Next,

echo " process(130) :27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database" | sed "s/^.*(\([^)]*\)).*(....\(.*\)).*'\(.*\)'.*/\1 \2 \3/"

130 27723211621B01DJ68AG AVAILABLE

Questions -

  1. In the awk command, the magic seems to happen entirely because of $0=$2. Can someone please explain what $0=$2 means?
  2. Can someone also walk through the sed operation? I have been trying to understand it for the past 2 hours and am not getting anywhere.

best regards,
Lee.

In awk $0 refers to the whole record (line) while $1, $2 ... refer to the individual fields. So $0=$2 simply replaces the whole record with field 2.
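
A minimal sketch of that assignment on its own, using a made-up sample line (the variable names line, second and count are only for illustration):

```shell
# Hypothetical sample line, used only to illustrate the assignment.
line="alpha beta gamma"

# Replace the whole record with field 2, then print the record.
second=$(echo "$line" | awk '{ $0 = $2; print }')
echo "$second"    # prints: beta

# Assigning to $0 makes awk split the record again, so one field is left.
count=$(echo "$line" | awk '{ $0 = $2; print NF }')
echo "$count"     # prints: 1
```

Here the assignment sits inside an action block with an explicit print, which is the long-hand form of what the one-liner in the question does.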

In sed the (escaped) parentheses earmark part of a pattern for a later reference.

Take for instance this:

$ echo "charlie farley" | sed 's/\(charlie\) .*/\1 chaplin/'
charlie chaplin
$ echo "piggy malone" | sed 's/\(charlie\) .*/\1 chaplin/'
piggy malone

So basically I want to change "charlie farley" into "charlie chaplin" (or any other charlie) but leave anything else alone. In this extremely simple example I could have written

$ echo "charlie farley" | sed 's/charlie .*/charlie chaplin/'

But consider this

$ echo "charlie farley" | sed 's/\(charl.*\) .*/\1 chaplin/'

Now the pattern also catches "charles" or "charley", which makes it more general.
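
As a quick check of the more general pattern, here is a sketch that feeds it three made-up names (the names and the variable result are assumptions for illustration):

```shell
# Hypothetical names; the first two start with "charl", the third does not.
result=$(printf '%s\n' "charles farley" "charley farley" "piggy malone" |
  sed 's/\(charl.*\) .*/\1 chaplin/')
echo "$result"
# prints:
# charles chaplin
# charley chaplin
# piggy malone
```

Note that .* is greedy, so on a name with more than two words the group would extend up to the last space, not the first.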

The \1 refers to the first earmarked pattern. But what if I do two of them?

$ echo "Charles Chaplin" | sed 's/\(.*\) \(.*\)/\2, \1/'
Chaplin, Charles

This last example takes two strings separated by a space and reverses them, putting a comma between.
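
The same idea extends to more groups. As a sketch with a made-up date string (date_in and date_out are hypothetical names), three groups can be captured and put back in a different order:

```shell
# Hypothetical ISO-style date used as sample input.
date_in="2024-01-15"

# Three groups capture year, month and day; the replacement emits them
# in reverse order, separated by dots.
date_out=$(echo "$date_in" | sed 's/\([0-9]*\)-\([0-9]*\)-\([0-9]*\)/\3.\2.\1/')
echo "$date_out"   # prints: 15.01.2024
```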

Hope that helps.

Andrew

1) With RS set to ), awk does not break records at line ends but at each occurrence of the RS character (or pattern). So the first record (which is what $0 refers to in awk) is

 process(130

which is split into two fields, " process" and "130", at the field separator FS=\( .

$0=$2

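then overwrites the entire record with its second field. Because the assignment is used as an awk pattern and its value is neither 0 nor an empty string (so it counts as TRUE), awk takes the default action, print. The text after the last ) has no second field, so that record becomes empty and is not printed.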
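
One consequence of using the assignment as the pattern is that a field whose value is 0 is also treated as false. A sketch with a made-up input line (line, terse and explicit are hypothetical names) compares the terse form with a more explicit one:

```shell
# Hypothetical input where the first parenthesised value is 0.
line="count(0) id(42)"

# Terse form: the assignment doubles as the pattern, so a value of 0
# (or an empty field) counts as false and that record is not printed.
terse=$(echo "$line" | awk '$0=$2' FS='(' RS=')')
echo "$terse"      # prints: 42

# Explicit form: print field 2 whenever the record contains a "(".
explicit=$(echo "$line" | awk -F'(' -v RS=')' 'NF > 1 { print $2 }')
echo "$explicit"   # prints 0 and 42 on separate lines
```

The explicit form is longer, but it does not depend on the truth value of the extracted text.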

2) The echo text in your sed command is corrupted in your post: the second left parenthesis is missing. The sed command replaces whatever matches this BRE (basic regular expression)

^               start of line
.*(             (.) any character (*) zero or more times up to a left parenthesis
\([^)]*\)       first sub-expression (delimited by escaped! parentheses): zero or more non-right-par.
).*(....        next right par. followed by any char zero + times, then left par., then any four chars
\(.*\)          second sub-expr.: any chars zero or more times
).*'            next right par. followed by any char 0+ times, then apostrophe (single quote)
\(.*\)          third sub-expr.: any chars 0+ times
'.*             rest of line starting from apostrophe

with the replacement \1 \2 \3, i.e. the first, second and third sub-expressions separated by single spaces.
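
To see the breakdown in action, here is a sketch that runs the sed expression from the question on the intact line (taken from the awk command in the first post) and on the line as it was posted with the sed command. The variable names expr, line, broken, fields and unchanged are only for illustration:

```shell
# The sed expression from the question, stored once so it can be reused.
# Double quotes are needed because the pattern itself contains single quotes.
expr="s/^.*(\([^)]*\)).*(....\(.*\)).*'\(.*\)'.*/\1 \2 \3/"

# Intact log line, copied from the awk command at the top of the thread.
line=" process(130) Deleting Text on line 11 (ESN:27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database"
fields=$(echo "$line" | sed "$expr")
echo "$fields"      # prints: 130 27723211621B01DJ68AG AVAILABLE

# Line as posted with the sed command: it has only one "(", so the
# pattern cannot match and sed passes the line through unchanged.
broken=" process(130) :27723211621B01DJ68AG) because a number wasnt 'AVAILABLE'  and is not found in the database"
unchanged=$(echo "$broken" | sed "$expr")
echo "$unchanged"
```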