Pivot data using awk

Hi

My Input is like below

DELETE|MPI|AUD_UPD_AGENT|MPISYS
INSERT|MPI|AUD_UPD_AGENT|MPISYS
SELECT|MPI|AUD_UPD_AGENT|MPISYS
UPDATE|MPI|AUD_UPD_AGENT|MPISYS
DELETE|MPI|BDYMOD|MPISYS
INSERT|MPI|BDYMOD|MPISYS
SELECT|MPI|BDYMOD|MPISYS
UPDATE|MPI|BDYMOD|MPISYS
DELETE|MPI|BDYMOD_DESC|MPISYS
INSERT|MPI|BDYMOD_DESC|MPISYS
SELECT|MPI|BDYMOD_DESC|MPISYS
UPDATE|MPI|BDYMOD_DESC|MPISYS
DELETE|MPI|BUZZ_DESC|MPISYS
INSERT|MPI|BUZZ_DESC|MPISYS
SELECT|MPI|BUZZ_DESC|MPISYS
UPDATE|MPI|BUZZ_DESC|MPISYS
DELETE|temp|BUZZ_MOD|MPISYS
INSERT|temp|BUZZ_MOD|MPISYS
SELECT|temp|BUZZ_MOD|MPISYS
REFERENCES|temp|BUZZ_MOD|MPISYS

The output should be

Schema|Table|Select|Insert|Update|Delete|REFERENCES
MPI|AUD_UPD_AGENT|Yes|Yes|Yes|Yes|NO
MPI|BDYMOD|Yes|Yes|Yes...				
MPI|BDYMOD_DESC|..				
MPI|BUZZ_DESC|..				
temp|BUZZ_MOD|..				

And your awk script? What did it produce?

awk -F"|" '
{ o[$2 FS $3]=1; p[$2 FS $3 FS $1]=1; }
END {
   t="SCHEMA|TABLE|SELECT|INSERT|UPDATE|DELETE|REFERENCES"
   print t; n=split(t, a, FS);
   for (i in o) {
      l="";
      for (j=3; j<=n; j++) l=l FS ((p[i FS a[j]]) ? "Yes" : "No");
      print i l;
   }
}
' infile
1 Like

Could you please explain how it works

Hello dineshaila,

Following may help you in same.

awk -F"|" '                                                          ##### Setting up the field seprator to pipe(|) as per your Input_file.
{ o[$2 FS $3]=1; p[$2 FS $3 FS $1]=1; }                              ##### creating an array named o whose index is $2 and $3 whose value is 1 and an another array whose name is p with indexs $2, $3 and $1 and value is 1.
END {                                                                ##### END section starts here.
   t="SCHEMA|TABLE|SELECT|INSERT|UPDATE|DELETE|REFERENCES"           ##### mentioning a variable t whose value is "SCHEMA|TABLE|SELECT|INSERT|UPDATE|DELETE|REFERENCES"
   print t; n=split(t, a, FS);                                       ##### printing value of t and spliting the variable t into an array named a whose field seprator is pipe(|).
   for (i in o) {                                                    ##### starting for loop for array named o.
      l="";                                                          ##### nullyfing variable l here.
      for (j=3; j<=n; j++) l=l FS ((p[i FS a[j]]) ? "Yes" : "No");   ##### starting for loop from j's value 3 to till value of variable n, where n is the number of fields of variable named t. Now setting up value of l, value of array p is present for this then l's value should be yes or it should be no.
      print i l;                                                     ##### printing the value of variable i and then variable l.
   }
}
' Input_file                                                         ##### mentioning the Input_file name here. 
 

Thanks,
R. Singh