Array usage issue with AWK

Shahul · June 5, 2011, 3:59pm

Hi friends,
I'm trying to write a small script using awk alone.
For one of my automation tasks I tried the logic below: store the first column of the file in an array,
then loop over the array and, for each value, search for it in the same file.
For each value I should get every matching line, on which I would then do some other checks.
My script is failing somewhere, and I have never worked on reading a file twice in awk.
Quick help would be appreciated.

fileA
xxxxxx   567890  456789
kjlajl   3456789 489080
xxxxxx   klkio   467830

Output expected:
xxxxxx "array executed" (from the file, after reading the array one by one in a for loop)
xxxxxx   567890  456789
xxxxxx   klkio   467830
kjlajl "array executed" (again from the file, after reading the array)
kjlajl   3456789 489080

code I have tried:
$ nawk -F, 'NR==FNR{arr1[i++]=$1}
 {
for(i in arr1)
 {if($0~arr1[i]) print arr1[i]"executed""\n"$0}}' fileA

Thanks
Sha

agama · June 5, 2011, 4:52pm

Since you are only reading one file, NR will always equal FNR, so that test is always true and there is no reason to include it. Secondly, it's never good form to reuse a variable (i in this case) for loop control when it is also used outside the loop on the assumption that it still holds a sane value.
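To illustrate when the NR==FNR test does matter, here is a minimal sketch of reading the same file twice. It assumes the sample data from the question is saved as fileA and names that file twice on the command line, so NR==FNR is true only during the first read. The per-key count is just an illustrative check, not the asker's actual logic, and the variable name out is made up for the example.

```shell
# Recreate the sample file from the question.
printf '%s\n' 'xxxxxx   567890  456789' \
              'kjlajl   3456789 489080' \
              'xxxxxx   klkio   467830' > fileA

# Pass 1 (NR==FNR): count how often each first field occurs, then skip to the next line.
# Pass 2 (NR>FNR):  print each line prefixed with the count for its first field.
out=$(awk '
    NR == FNR { count[$1]++; next }
              { print count[$1], $0 }
' fileA fileA)

printf '%s\n' "$out"
# 2 xxxxxx   567890  456789
# 1 kjlajl   3456789 489080
# 2 xxxxxx   klkio   467830
```

With a single file argument the first rule would match every line and the second rule would never run, which is why the test is pointless in the original command.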

Here is an example that illustrates two methods: one for when the output order must match the order in which field 1 was first seen, and one for when any output order is OK.

# if order is important
awk '
    {
        if( !seen[$1]++ )               # track order that field 1 was observed
            order[oidx++] = $1;

        map[$1,idx[$1]++]= $0;          # save each line based on the first field
    }
    END {
        for( o = 0; o < oidx; o++ )      # for each field 1 in the order seen
        {
            printf( "executed: %s\n", order[o] );
            for( m = 0; m < idx[order[o]]; m++ )   # for each line associated with field 1 value
                printf( "%s\n", map[order[o],m] );

        }
    }
' testfile

# if output order isn't important
awk '
    {
        map[$1,idx[$1]++]= $0;          # save each line based on the first field
    }
    END {
        for( i in idx )              # for each field 1 seen (any order)
        {
           printf( "executed: %s\n", i );
           for( m = 0; m < idx[i]; m++ )   # for each line associated with this field 1
               printf( "%s\n", map[i,m] );
        }
    }
' testfile

I just noticed you're using nawk; you can replace awk with nawk in the examples and you should be fine.

Shahul · June 5, 2011, 5:21pm

Thanks a lot for the detailed answer, agama!!

It worked

rdcwayx · June 5, 2011, 8:41pm

sort infile |awk '!a[$1]++ {print "executed: " $1}1'
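For reference, the one-liner can be checked against the sample data from the question. The sketch below assumes the data is saved as infile and sets LC_ALL=C so the sort order is predictable (that setting is an addition for the example, not part of the original command). Because the input is sorted first, the groups come out in sorted order rather than in the order they first appear in the file.

```shell
# Recreate the sample data from the question as "infile".
printf '%s\n' 'xxxxxx   567890  456789' \
              'kjlajl   3456789 489080' \
              'xxxxxx   klkio   467830' > infile

# !a[$1]++ is true only the first time a key is seen, so the header prints once per group;
# the trailing 1 is an always-true pattern that prints every line.
out=$(LC_ALL=C sort infile | awk '!a[$1]++ {print "executed: " $1}1')

printf '%s\n' "$out"
# executed: kjlajl
# kjlajl   3456789 489080
# executed: xxxxxx
# xxxxxx   567890  456789
# xxxxxx   klkio   467830
```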