Format output using awk

greycells · October 16, 2013, 8:51pm

Hello all , need help with this ...

Input File

 
DEV                              %                     POOL
0CB4 FBA     2211300     81792   4  IE  RAID-5(3+1)  R5_EFD100_1                 
     -             -   1805376  82  IF  RAID-1       M2_FC300_1                  
     -             -    132360   6  IS  RAID-6(6+2)  R6_SA2000_1                 
0CB5 FBA     1883700    852732  45  IF  RAID-1       M2_FC300_1                  
     -             -    338112  18  IS  RAID-6(6+2)  R6_SA2000_1                 
0CB6 FBA     1638000     12096   1  IE  RAID-5(3+1)  R5_EFD100_1                 
     -             -    396528  24  IF  RAID-1       M2_FC300_1                  
     -             -    303024  18  IS  RAID-6(6+2)  R6_SA2000_1                 
0CB7 FBA     1605240     16380   1  IE  RAID-5(3+1)  R5_EFD100_1                 
     -             -     50664   3  IF  RAID-1       M2_FC300_1                  
     -             -    100752   6  IS  RAID-6(6+2)  R6_SA2000_1

Output needed

 
DEV,R5_EFD100_1%:M2_FC300_1%:R6_SA2000_1
0CB4,4:82:6
0CB5,0:45:18  
0CB6,1:24:18
0CB7,1:3:6

Need to format the input using the DEV ,% and POOL Column
If any pool is missing for a DEV , put 0 instead , for example in 2nd row
pool R5_EFD100_1 is not present for 0CB5 , so R5_EFD100_1% would be 0 for that column

Thanks !

Chubler_XL · October 16, 2013, 9:36pm

Here is an awk solution:

awk '
NF==8{DEV[++L]=$1}
NF==7{$0=DEV[L]" "$0}
NF==8{POOL[$8];P[$1,$8]=$5}
END {
   printf "DEV"
   for(p in POOL) printf "%s", (i++ ? "%:" : ",") p
   printf "\n"
   for(i=1;i<=L;i++) {
       printf "%s", DEV
           f=0
           for(p in POOL) printf "%s", (f++ ? ":" : ",") P[DEV,p]+0
           printf "\n"
   }
}
' infile

greycells · October 16, 2013, 9:52pm

Thanks a lot ... it works fine ... just the sequence of pools is different

for the above example .. i get the output in this format
DEV,M2_FC300_1%:R6_SA2000_1:R5_EFD100_1%

The columns below these are fine ... thats not the issue
The pools in the last column may not always come in the same sequence .
Can i "hard code" the pools ... so that the output is always in this format ..

DEV,R5_EFD100_1%:M2_FC300_1%:R6_SA2000_1

Thanks again ....
Thanks

RavinderSingh13 · October 16, 2013, 10:02pm

Hello Chubler_XL,

Could you please explain it, we will be grateful to you.

Thanks,
R. Singh

Chubler_XL · October 16, 2013, 10:48pm

Yes we can Hard code columns like this:

awk '
BEGIN{Pcnt=split("R5_EFD100_1,M2_FC300_1,R6_SA2000_1", POOL, ",")}
NF==8{DEV[++L]=$1}
NF==7{$0=DEV[L]" "$0}
NF==8{P[$1,$8]=$5}
END {
   printf "DEV"
   for(j=1;j<=Pcnt;j++) printf "%s", (i++ ? "%:" : ",") POOL
   printf "\n"
   for(i=1;i<=L;i++) {
       printf "%s", DEV
       f=0
       for(j=1;j<=Pcnt;j++) printf "%s", (f++ ? ":" : ",") P[DEV,POOL[j]]+0
       printf "\n"
   }
}
' infile

---------- Post updated at 12:48 PM ---------- Previous update was at 12:35 PM ----------

Explanation of solution in post #2 is:

NF==8{DEV[++L]=$1}
When line has 8 fields append field 1 to array DEV so with example data DEV[1]="0CB4", DEV[2]="0CB5" DEV[3]="0CB6"

NF==7{$0=DEV[L]" "$0}
When line has 7 fields append most recent DEV[] field to beginning of line giving 8 fields in line again.

NF==8{POOL[$8];P[$1,$8]=$5}
Keep list of POOLS used in POOL[] array, Store % value (field 5) in P[] array with index of DEV,POOL so P[0CB4,R5_EFD100_1]=4 P[0CB4,M2_FC300_1]=82

END {
   printf "DEV"
   for(p in POOL) printf "%s", (i++ ? "%:" : ",") p

When at the end of the input file print headings (i++ ? "%:" : ",") will print comma for first heading and percentage-sign colon for others

   for(i=1;i<=L;i++) {
       printf "%s", DEV
       f=0
       for(j=1;j<=Pcnt;j++) printf "%s", (f++ ? ":" : ",") P[DEV,POOL[j]]+0
       printf "\n"
   }

print DEV code and all values in the P[DEV,POOL] array in order. The +0 ensures missing values appear as zero.