Doubt in awk

Hi,

I found the requirement below on this forum, but the solution provided was not clear to me.

Below is the requirement:

 
Input file
 
A 1 Z
A 1 ZZ
B 2 Y
B 2 AA
 
 
Required output
 
B Y|AA
A Z|ZZ

The solution provided was:

/usr/xpg4/bin/awk '{a[$1]=a[$1]?a[$1]"|"$3:$3}END{for (i in a){print i,a[i] }}' inputfile

The portion in red was not understandable.

My assumption is:

{a[$1]=a[$1]?a[$1]"|"$3:$3}

If a[$1] (the first field) is the same for the next row, then a pipe is added and field $3 is printed, but I am not sure about the : (colon).

I tried the below to understand the code:

/usr/xpg4/bin/awk -F" " '{a[$1]=a[$1]?a[$1]:$3}END{for (i in a){print i,a[i] }}' inputfile

but I am getting a syntax error.

{a[$1]=a[$1]?a[$1]"|"$3:$3} --> If array a already holds a non-empty value at index $1, concatenate that existing value with $3 of the current line, using a pipe | as the separator between the existing value and the new one.
Otherwise (meaning nothing has been stored under that index so far), a[$1] is set to $3 of the current line.
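Written out as a plain if/else, the same logic might look like the sketch below. It should behave the same as the one-liner; group_third_field is only a made-up helper name for this illustration, and the sample rows are the ones from the question.

```shell
# group_third_field is a hypothetical helper name used only for this sketch.
group_third_field() {
    awk '{
        if (a[$1]) {                  # a value is already stored under this key
            a[$1] = a[$1] "|" $3      # append a pipe, then the new third field
        } else {                      # nothing stored yet: empty string is false
            a[$1] = $3                # start the entry with the third field
        }
    }
    END { for (i in a) print i, a[i] }' "$@"
}

# Feed the four sample rows from the question through the helper.
printf 'A 1 Z\nA 1 ZZ\nB 2 Y\nB 2 AA\n' | group_third_field
```

The part before the ? is the if test, the part between ? and : is the "true" branch, and the part after : is the "else" branch.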


Hi Akshay,

Thanks for your explanation.

As you said, the pipe is the separator, so it should be in second place (as mentioned below), not in the 4th place. Also, I am not sure what role the colon : plays here.

B |Y AA
A |Z ZZ

The colon : is like the else branch in a ternary expression.

Example :
A traditional if-else construct in C, Java and JavaScript is written:

if (a > b) {
    result = x;
} else {
    result = y;
}

This can be rewritten as the following statement:

result = a > b ? x : y;

I get the correct result, see here:

$ awk '{a[$1] = a[$1] ? a[$1] "|" $3: $3}END{for(i in a) print i,a[i]}' <<EOF
A 1 Z
A 1 ZZ
B 2 Y
B 2 AA
EOF

A Z|ZZ
B Y|AA


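This is the safest way, and the one I prefer: it tests whether the key $1 exists in a instead of testing the stored value, so a first value of 0 or an empty string is not overwritten.

awk '{a[$1] = ( $1 in a ) ? a[$1] "|" $3: $3}END{for(i in a) print i,a[i]}'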

Thanks.

I understand a bit more now, but how is this

a[$1]

is printing

A Z

then

"|" $3: $3

is printing

|ZZ

My understanding is:

 
a[$1] "|"    $3 :  $3}
|        |    |      |
|        |    |      |
V        V    V      V
A        |    Z     ZZ
B        |    Y     AA
 
But why is it printing like below?
 
A Z|ZZ
B Y|AA
 
 
 

I hope this clears up your doubts. Please note that nothing is printed while the lines are being read; the array elements are printed only in the END block.

When awk reads line number  : 1 Array a[A] = Z           row : A 1 Z
When awk reads line number  : 2 Array a[A] = Z|ZZ        row : A 1 ZZ
When awk reads line number  : 3 Array a[B] = Y           row : B 2 Y
When awk reads line number  : 4 Array a[B] = Y|AA        row : B 2 AA