Using awk to reformat file output

kieranfoley · August 14, 2014, 7:16am

Hi there. I need to reformat a large file. Here is a sample of the file.

 
NETIK0102_UCS_Boot_a,NETIK0102_UCS_Boot_b
5200 2438 70G
5200 2439 70G
NETIK0102_UCS_HBA0_a,NETIK0102_UCS_HBA1_b,NETIK0102_UCS_HBA2_a,NETIK0102_UCS_HBA3_b
2673 19D7 55G
2673 19C0 30G
2673 19F5 120G
SEIADWFMPRD1-a,SEIADWFMPRD1-b
2673 1992 8.43G
2673 1993 8.43G
2673 19B1 16.9G

The line separated by commas needs to be split and I need to print the strings in the first column...

Here is the way I need the data presented.

 
NETIK0102_UCS_Boot_a 5200 2438 70G
NETIK0102_UCS_Boot_a 5200 2439 70G
NETIK0102_UCS_Boot_b 5200 2438 70G
NETIK0102_UCS_Boot_b 5200 2439 70G
NETIK0102_UCS_HBA0_a 2673 19D7 55G
NETIK0102_UCS_HBA0_a 2673 19C0 30G
NETIK0102_UCS_HBA0_a 2673 19F5 120G
NETIK0102_UCS_HBA1_b 2673 19D7 55G
NETIK0102_UCS_HBA1_b 2673 19C0 30G
NETIK0102_UCS_HBA1_b 2673 19F5 120G
NETIK0102_UCS_HBA2_a 2673 19D7 55G
NETIK0102_UCS_HBA2_a 2673 19C0 30G
NETIK0102_UCS_HBA2_a 2673 19F5 120G
NETIK0102_UCS_HBA3_b 2673 19D7 55G
NETIK0102_UCS_HBA3_b 2673 19C0 30G
NETIK0102_UCS_HBA3_b 2673 19F5 120G
SEIADWFMPRD1-a 2673 1992 8.43G
SEIADWFMPRD1-a 2673 1993 8.43G
SEIADWFMPRD1-a 2673 19B1 16.9G
SEIADWFMPRD1-b 2673 1992 8.43G
SEIADWFMPRD1-b 2673 1993 8.43G
SEIADWFMPRD1-b 2673 19B1 16.9G


SriniShoo · August 14, 2014, 7:48am

awk '/,/ {rcrsv(); c=split($0, a, ","); n=0; delete b; getline} {b[++n] = $0}
func rcrsv() {
  for(i=1; i<=c; i++) {for(j=1; j<=n; j++) {print a[i], b[j]}}}
END {rcrsv()}' file

rbatte1 · August 14, 2014, 7:57am

Hello kieranfoley,
I have a few questions to pose in response first:

Is this homework/assignment? There are specific forums for these.
What have you tried so far?
What output/errors do you get?
What OS and version are you using?
What are your preferred tools? (C, shell, perl, awk, etc.)
What logical process have you considered? (to help steer us to follow what you are trying to achieve)

Most importantly, what have you tried so far?

There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.

We're all here to learn and getting the relevant information will help us all.

kieranfoley · August 14, 2014, 8:34am

Hi rbatte1 thank you for your reply.

No, this is not homework. I use awk/sed/grep scripts on occasion, and at times I get caught in a rut when more complex scripting is required. For this particular task I have to pull relevant info from a file that is produced daily from our SAN storage environment. My background is Unix; I use shell scripts more than anything else to filter out information when need be.

This is a sample of the original file.

 
/clusters/cluster-1/exports/storage-views/NETIK0102_Boot:
Name                      Value
------------------------  -----------------------------------------------------------------------------
caw-enabled               true
controller-tag            -
initiators                [NETIK0102_UCS_Boot_a, NETIK0102_UCS_Boot_b]
operational-status        ok
port-name-enabled-status  [P00000000476043B3-A0-FC01,true,ok, P00000000476046CC-A0-FC00,true,ok,
                          P00000000477043B3-B0-FC00,true,ok, P00000000477046CC-B0-FC01,true,ok]
ports                     [P00000000476043B3-A0-FC01, P00000000476046CC-A0-FC00,
                          P00000000477043B3-B0-FC00, P00000000477046CC-B0-FC01]
virtual-volumes           [(0,device_Symm5200_2438_1_vol,VPD83T3:60001440000000106046cc5f567eb9b9,70G),
                          (1,device_Symm5200_2439_1_vol,VPD83T3:60001440000000106046cc5f567eb9bf,70G)]
write-same-16-enabled     true

/clusters/cluster-1/exports/storage-views/NETIK0102_Shared:
Name                      Value
------------------------  -------------------------------------------------------------------------------
caw-enabled               true
controller-tag            -
initiators                [NETIK0102_UCS_HBA0_a, NETIK0102_UCS_HBA1_b, NETIK0102_UCS_HBA2_a,
                          NETIK0102_UCS_HBA3_b]
operational-status        ok
port-name-enabled-status  [P00000000476043B8-A0-FC02,true,ok, P0000000047604458-A0-FC03,true,ok,
                          P00000000477043B8-B0-FC03,true,ok, P0000000047704458-B0-FC02,true,ok]
ports                     [P00000000476043B8-A0-FC02, P0000000047604458-A0-FC03,
                          P00000000477043B8-B0-FC03, P0000000047704458-B0-FC02]
virtual-volumes           [(0,device_Symm2363_19D7_1_vol,VPD83T3:60001440000000106046cc5f567eb7dc,55G),
                          (1,device_Symm2363_19C0_1_vol,VPD83T3:60001440000000106046cc5f567eb769,30G),
                          (2,device_Symm2363_19F5_1_vol,VPD83T3:60001440000000106046cc5f567eb872,120G),
                          (3,device_Symm2363_1A03_1_vol,VPD83T3:60001440000000106046cc5f567eb8ae,280G),
                          (4,device_Symm2363_19DC_1_vol,VPD83T3:60001440000000106046cc5f567eb7f5,66G),
                          (5,device_Symm2363_1A6C_1_vol,VPD83T3:60001440000000106046cc5f567eb93f,600G),
                          (6,device_Symm2363_1998_1_vol,VPD83T3:60001440000000106046cc5f567eb9ad,10G),
                          (7,device_Symm2363_19CE_1_vol,VPD83T3:60001440000000106046cc5f567eb9b3,40G),
                          (8,device_Symm2363_1A7E_1_vol,VPD83T3:60001440000000106046cc5f567eb9c5,1000G),
                          (9,device_Symm2363_19B9_1_vol,VPD83T3:60001440000000106046cc5f567eb9cb,20G),
                          (10,device_Symm2363_19BA_1_vol,VPD83T3:60001440000000106046cc5f567eb9d1,20G),
                          (11,device_Symm2363_19BB_1_vol,VPD83T3:60001440000000106046cc5f567eb9d7,20G),
                          (12,device_Symm2363_19BC_1_vol,VPD83T3:60001440000000106046cc5f567eb9dd,20G),
                          (13,device_Symm2363_1AA3_1_vol,VPD83T3:60001440000000106046cc5f567eb9e3,2.93T),
                          (14,device_Symm2363_1A94_1_vol,VPD83T3:60001440000000106046cc5f567eb9e9,750G),
                          (15,device_Symm2363_1A88_1_vol,VPD83T3:60001440000000106046cc5f567eb9ef,300G)]
write-same-16-enabled     true

From this file I have been able to use the following commands to pull out the relevant information.

cat file | awk '/initiators/,/write-same-16-enabled/ {print $0}' \
 | grep -v operational-status \
 | grep -v P00000000 \
 | grep -v write-same-16-enabled \
 | grep -v '\[\]' \
 | sed 's/ //g' \
 | sed 's/initiators/initiators    /g' \
 | nawk '/a,$|b,$|A,$|B,$/ {printf "%s",substr($0, 1, length-1)",";next} 1' \
 | sed 's/initiators    \[//g;s/\]//g' | sed 's/virtual-volumes\[//g' \
 | nawk '{if(/^\(/) {FS=","; print$2,$4 }else {print$0}}' \
 | sed 's/device_Symm2363_/2673 /g' \
 | sed 's/device_Symm5200_/5200 /g' \
 | sed 's/_1_vol//g' \
 | sed 's/)//g'

The code above is not very efficient, I know, but it works. I will go back to cleaning it up once I get it all working the way I want it. But for the past few days I have struggled with the last step, the reformatting I posted about earlier.

I thank you for your help on this.

---------- Post updated at 01:34 PM ---------- Previous update was at 01:25 PM ----------

Hi SriniShoo,

I have tried your code but I am getting the following error. I tried using awk and nawk (I am running this on a Solaris box).

nawk: you can only delete array[element] at source line 1
 context is
        /,/ {rcrsv(); c=split($0, a, ","); n=0; delete >>>  b; <<<
nawk: syntax error at source line 1
nawk: illegal statement at source line 1

rbatte1 · August 14, 2014, 8:40am

I've taken the liberty to split your huge single line of code onto several to make it more readable.

There are lots of steps in this processing. Some grep commands in sequence could be combined, as could the sed commands.

Could you explain a bit more about the logic you are using to generate so many steps?

You could re-write the section:-

 | grep -v operational-status \
 | grep -v P00000000 \
 | grep -v write-same-16-enabled \
 | grep -v '\[\]' \

.... as

 | egrep -v "operational-status|P00000000|write-same-16-enabled|\[\]" \
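
A quick way to convince yourself that the combined pattern drops the same lines as the four separate greps. This is only a sketch: it uses `grep -E` (the POSIX spelling of `egrep`), the sample lines are shortened from the file excerpt above, and the final `initiators []` line is an assumed example of the empty list that the `'\[\]'` filter targets.

```shell
sample='initiators                [NETIK0102_UCS_Boot_a, NETIK0102_UCS_Boot_b]
operational-status        ok
ports                     [P00000000476043B3-A0-FC01, P00000000476046CC-A0-FC00,
virtual-volumes           [(0,device_Symm5200_2438_1_vol,VPD83T3:60001440000000106046cc5f567eb9b9,70G),
write-same-16-enabled     true
initiators                []'

# Four filters chained, as in the original pipeline.
chained=$(printf '%s\n' "$sample" \
  | grep -v operational-status \
  | grep -v P00000000 \
  | grep -v write-same-16-enabled \
  | grep -v '\[\]')

# One filter with the alternatives joined by "|".
combined=$(printf '%s\n' "$sample" \
  | grep -E -v 'operational-status|P00000000|write-same-16-enabled|\[\]')

[ "$chained" = "$combined" ] && echo "same result"
printf '%s\n' "$combined"
```

Only the first `initiators` line and the `virtual-volumes` line survive either way.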

kieranfoley · August 14, 2014, 9:41am

Yes, thanks for cleaning up the greps. I have also cleaned up the sed substitutions somewhat. The reason I have ended up with so many steps is that I have been doing this bit by bit: I run an awk or sed command to filter out one piece at a time, and I have been piping the commands together. Not very efficient, I know. This is a little bit cleaner, but not to your liking, I'm sure. Thanks!

 
cat file | awk '/initiators/,/write-same-16-enabled/ {print $0}' \
| egrep -v "operational-status|P00000000|write-same-16-enabled|\[\]" \
| sed 's/ //g;s/initiators/initiators    /g' \
| nawk '/a,$|b,$|A,$|B,$/ {printf "%s",substr($0, 1, length-1)",";next} 1' \
| sed 's/initiators    \[//g;s/\]//g;s/virtual-volumes\[//g' \
| nawk '{if(/^\(/) {FS=","; print$2,$4 }else {print$0}}' \
| sed 's/device_Symm2363_/2673 /g;s/device_Symm5200_/5200 /g;s/_1_vol//g;s/)//g'

---------- Post updated at 02:41 PM ---------- Previous update was at 01:58 PM ----------

SriniShoo,

When I take out the "delete" it works the way I want it. Thanks!!

 
awk '/,/ {rcrsv(); c=split($0, a, ","); n=0; getline} {b[++n] = $0}
func rcrsv() {
  for(i=1; i<=c; i++) {for(j=1; j<=n; j++) {print a[i], b[j]}}}
END {rcrsv()}' file
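
For comparison, the extraction pipeline and the reformatting step could in principle be folded into a single awk pass over the original file. The following is only a sketch under stated assumptions: every storage view starts with a line beginning with `/`, continuation lines are indented, and every volume tuple looks like `(index,device_SymmNNNN_DDDD_1_vol,VPD...,size)` as in the sample. The names `storage_view_report`, `emit`, `alias` and `sample_view` are made up for illustration. The `alias` entry mirrors the `s/device_Symm2363_/2673 /g` substitution in the pipeline above.

```shell
storage_view_report() {
  awk '
    # Print every initiator of the finished view against every volume.
    function emit(   i, j, ni, nv, names, tuples, f, d, id, row) {
      sub(/^initiators/, "", inits)
      sub(/^virtual-volumes/, "", vols)
      gsub(/[ \t]|\[|\]/, "", inits)        # drop blanks and brackets
      gsub(/[ \t]|\[|\]/, "", vols)
      ni = split(inits, names, ",")
      nv = split(vols, tuples, /\),/)       # one element per (...) tuple
      for (j = 1; j <= nv; j++) {
        split(tuples[j], f, ",")            # f[2] = device name, f[4] = size
        sub(/\)$/, "", f[4])
        split(f[2], d, "_")                 # device, SymmNNNN, DDDD, 1, vol
        id = substr(d[2], 5)                # text after "Symm"
        if (id in alias) id = alias[id]
        row[j] = id " " d[3] " " f[4]
      }
      for (i = 1; i <= ni; i++)
        for (j = 1; j <= nv; j++)
          print names[i], row[j]
      inits = vols = ""
    }
    BEGIN { alias["2363"] = "2673" }
    /^\//     { emit() }                    # a new view: flush the previous one
    /^[^ \t]/ { key = $1 }                  # unindented line names the attribute
    key == "initiators"      { inits = inits $0 }
    key == "virtual-volumes" { vols  = vols $0 }
    END { emit() }
  '
}

sample_view='/clusters/cluster-1/exports/storage-views/NETIK0102_Boot:
initiators                [NETIK0102_UCS_Boot_a, NETIK0102_UCS_Boot_b]
operational-status        ok
virtual-volumes           [(0,device_Symm5200_2438_1_vol,VPD83T3:60001440000000106046cc5f567eb9b9,70G),
                          (1,device_Symm5200_2439_1_vol,VPD83T3:60001440000000106046cc5f567eb9bf,70G)]
write-same-16-enabled     true

/clusters/cluster-1/exports/storage-views/NETIK0102_Shared:
initiators                [NETIK0102_UCS_HBA0_a, NETIK0102_UCS_HBA1_b, NETIK0102_UCS_HBA2_a,
                          NETIK0102_UCS_HBA3_b]
virtual-volumes           [(0,device_Symm2363_19D7_1_vol,VPD83T3:60001440000000106046cc5f567eb7dc,55G)]
write-same-16-enabled     true'

printf '%s\n' "$sample_view" | storage_view_report
```

Views with an empty initiator or volume list produce no output, which matches what the `'\[\]'` filter did in the pipeline.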