awk question

hankooknara · December 27, 2006, 2:24pm

Let's say I have a program named "programA"
and you run it like below

programA uniquecallerID somefilename

where somefilename contains something like this

------------somefilename-----------

asdfasdfasdf
asdfasdf
callIDabc
asdfd

sdfd
callIDabc
sdrsdf
sdfasdfsadfdsf

asdfasdfasdfasdfasdf
callIDabd
asdfasd

casdfsdf
callIDabc
sdf

asdfasdfasdfasdf
callIDabd
sdfs
-----------------------------

If this program is run like

programA callIDabd somefilename

output would show

asdfasdfasdfasdfasdf
callIDabd
asdfasd

asdfasdfasdfasdf
callIDabd
sdfs

Below is the actual program that does this but I don't understand how it works.
Can someone kindly explain how this thing works?

Big thanks in advance.

#!/usr/bin/bash
cid=$1
cidname=$2
nawk 'BEGIN {FS="\n"; RS=""; ORS="\n\n"}
{if (NF==1) hdrvar=$0; else if ($0~/'$cid'/) print hdrvar,"\n",$0 }' $cidname

vish_indian · December 29, 2006, 12:28am

I'll try
#!/usr/bin/bash
cid=$1
cidname=$2
nawk 'BEGIN {FS="\n"; RS=""; ORS="\n\n"}
{if (NF==1) hdrvar=$0; else if ($0~/'$cid'/) print hdrvar,"\n",$0 }' $cidname

cid=$1 assigns the value you pass as argument1 to cid which is later searched for in else if ($0~/'$cid'/) .

cidname=$2 is the filename that you want to search data in

Logic is something like this,
If the record read contains only one field, assign that value to hdrvar (if (NF==1) hdrvar=$0).
Else
Check to see if the record contains the string held in variable cid, then print the value in hdrvar and the matched record. (else if ($0~/'$cid'/) print hdrvar,"\n",$0)

So, it prints each record matching the given string, along with the last record preceding it that has only one field.

Perderabo · December 29, 2006, 2:16am

That FS and RS is a little bit obscure...

FS="\n" redefines the field separator to be the newline character. So each line is a field.
RS="" is a special case that tells awk (or nawk or gawk) that one or more blank lines are the record separator.

So we are setting things up to process multi-line records.
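
To see what those two settings do, here is a minimal sketch. It assumes plain awk is available (nawk or gawk should behave the same here), and the sample text is made up for illustration. It prints how many fields each record has, so you can watch each blank-line-separated block become one record and each line inside it become one field.

```shell
#!/bin/sh
# Hypothetical sample data: three blocks separated by blank lines.
sample='first line
callIDabc
third line

only one line

alpha
callIDabd'

# RS="" makes each block one record; FS="\n" makes each line one field.
summary=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = "\n"; RS = "" }
    { printf "record %d: %d fields, field 2 = [%s]\n", NR, NF, $2 }')
printf '%s\n' "$summary"
# record 1: 3 fields, field 2 = [callIDabc]
# record 2: 1 fields, field 2 = []
# record 3: 2 fields, field 2 = [callIDabd]
```

The middle block has NF equal to 1, which is exactly the case the original script tests with if (NF==1).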

hankooknara · December 30, 2006, 6:27pm

I will need to review more as I am not understanding...

Perderabo · December 30, 2006, 9:40pm

What is it that you do not understand?

ripat · December 31, 2006, 3:12am

Not that I want to add confusion here, but much the same result (without the hdrvar header) could have been achieved with simpler code that is easier to understand.

cid=$1
cidname=$2
nawk '
  BEGIN {FS="\n"; RS=""; ORS="\n\n"}
  /'$cid'\n/ {print}
' $cidname

First line BEGIN {...}
This block is executed only once, before awk reads the first line of the file.

As mentioned above, the FS and RS variables are set so that each block of data, separated by an empty line (""), is seen as a record. Each record is made of fields separated by the newline character (\n). ORS (Output Record Separator) sets what is printed after each output record. Here it is "\n\n", so the printed blocks stay separated by an empty line.
/pattern/ {action}
Starting here awk will scan the input file $cidname, one record at a time. If it finds a record (a block of several lines, remember?) that contains the /pattern/, awk prints that record.
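
The pattern/action idea can also be tried without splicing $cid into the awk program through the shell quotes. The sketch below is only one possible variant, not the script from this thread: it assumes plain awk, uses a made-up temporary data file, passes the ID with -v under the hypothetical variable name id, and swaps the regex for index(), which is a literal substring search. Unlike /'$cid'\n/, it does not require a newline after the ID.

```shell
#!/bin/sh
# Hypothetical stand-in for "somefilename".
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
asdfasdfasdf
callIDabc
asdfd

asdfasdfasdfasdfasdf
callIDabd
asdfasd

asdfasdfasdfasdf
callIDabd
sdfs
EOF

cid=callIDabd

# -v hands the shell value to awk as the variable "id";
# index() returns non-zero when id occurs anywhere in the record.
matches=$(awk -v id="$cid" 'BEGIN { FS = "\n"; RS = ""; ORS = "\n\n" }
    index($0, id) { print }' "$datafile")
printf '%s\n' "$matches"
rm -f "$datafile"
```

With this sample it prints the two blocks containing callIDabd, separated by an empty line. Quoting "$datafile" also keeps the call working when the file name contains spaces.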

hankooknara · January 1, 2007, 10:20pm

excellent job all of you.. i actually understood..

now to the bottom part.. this part.. i am gonna try to understand it myself as well as trying ripat's script... you guys are great..

{if (NF==1) hdrvar=$0; else if ($0~/'$cid'/) print hdrvar,"\n",$0 }' $cidname
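
For anyone experimenting with that last line, here is a small sketch of the same if/else run on made-up data. The assumptions: plain awk instead of nawk, a hypothetical input in which the one-line blocks are headers, and the ID passed with -v as id rather than spliced in through the quotes. The logic is otherwise the same as in the original.

```shell
#!/bin/sh
# Hypothetical input: one-line blocks act as headers for the blocks after them.
sample='HEADER ONE

aaa
callIDabd
bbb

HEADER TWO

ccc
callIDabc
ddd

eee
callIDabd'

cid=callIDabd

# Same if/else as the original; only the way cid reaches awk differs (-v).
result=$(printf '%s\n' "$sample" | awk -v id="$cid" '
    BEGIN { FS = "\n"; RS = ""; ORS = "\n\n" }
    {
        if (NF == 1)
            hdrvar = $0                 # one-line block: remember it as the header
        else if ($0 ~ id)
            print hdrvar, "\n", $0      # commas insert OFS (a space) around "\n"
    }')
printf '%s\n' "$result"
```

Each matching block comes out preceded by the most recent one-line block, so the first match is printed under HEADER ONE and the second under HEADER TWO. Because of the commas in print, the header line ends with a space and the first line of the block starts with one.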