AWK Merge Fields for Print Output

I've got a file with each record on a separate line and each record contains 34 fields separated by a colon and i'm trying to re-arrange the order of the fields and merge together certain fields separated by a slash (like field7/field28). I tried using an awk print statement like

awk -F: 'BEGIN {OFS=":"}{ print $1, $2, $24, $23, $25, $17, $26, $4, $27, $5, $6, $7"/"$28, $8"/"$29, $9"/"$30, $10"/"$31, $11"/"$32, $33, $12, $13, $21, $14, $3, $34, $19, $22, $18, $20, $15, $16 }' ~/Desktop/fake_data2.txt > ~/Desktop/Output.txt

but if one or both of the fields are empty it will print the slash; which is not needed if they are both empty or if the second field is empty.

Being an AWK newbie i need help in writing the AWK statement to do what i meant it to do, not what i told it to do. Can i put an IF statement inside a print statement?

GIVEN INPUT:

---$1----:--$2-:-----$3----:-$4-:-$5-:--$6-:--$7-:-$8-:-$9-:-$10-:-$11-:-$12:-$13--:--$14--:$15:$16:-$17--:-$18:-$19-:$20:$21:$22:-------$23-----:24:-$25-:-$26:-$27:-----$28----:-$29-:-$30-:-$31-:$32:-$33:-$34

SEP 19-98:SSSSS:       $650:5555:2222:9    :4    :4   :2   :2    :2    :2000:   375:FAKETF :   :   :LAPS  :  2 :  76 :   :   :  1:      $1,550   :FT:NW1LC:0283:1284:            :     :     :3    :3  :    :AAAA
SEP 26-98:TTTTT:           :1111:3333:9    :6    :6   :6   :7    :6    :1000:   380:ROTHE  :   :   :LAPS  :  6 :  86 :   :   :  3:      $2,620   :FT:NW1LC:0300:1330:            :     :     :9Z   :7  :    :BBBB
SEP 28-98:UUUUU:           :7777:6666:7    :4    :3   :4   :5    :9    :2000:  4080:ROTHE  :   :   :LAPS  :  0 :  56 :   :   :  4:      $2,600   :FT:FNWTC:0284:1293:            :     :     :4    :5H :    :CCCC
OCT 12-98:VVVVV:     $3,000:4444:0000:5    :5    :5   :4   :4    :1    :3000:  2490:ROTHE  :   :   :LAPS  :  5 :  50 :   :   :  6:      $6,600   :FX:CCQ3Y:0282:1301:            :     :     :3H   :H  :    :DDDD
OCT 19-98:WWWWW:     $3,000:0101:1010:7    :7    :4   :1   :3    :1    :5000:   800:ROTHE  :   :   :LAPS  :  1 :  59 :   :   : 10:      $6,500   :FT:CCQ3Y:0273:1293:            :     :     :2    :XX :3333:EEEE
OCT 26-98:XXXXX:       $300:0011:0010:2    :5    :5   :3   :3    :5    :2000:   360:ROTHE  :   :   :LAPS  :  3 :  41 :   :   :  5:      $6,250   :FT:CCQ3Y:0291:1294:            :     :     :5H   :7T :    :FFFF
JAN  6-98:YYYYY:     $1,500:8888:3333:2    :4    :4   :2   :1    :1    :1000:   130:VIKING :   :   :LAPS  :  9 :  25 :   :   :  2:      $3,120   :ZZ:CHGNW:0292:1313:       5    :7    :H    :1H   :T  :9999:GGGG

EXPECTED OUTPUT:

---$1----:--$2-:24:-------$23-----:--$25:-$17--:-$26:-$4-:-$27:-$5-:--$6-:-$7/$28:---$8/$29---:--$9/$30-:$10/$31:$11/$32:-$33:-$12:-$13--:$21:--$14--:----$3--:-$34:-$19-:$22:-$18:$20:$15:$16
		
SEP 19-98:SSSSS:FT:      $1,550   :NW1LC:LAPS  :0283:5555:1284:2222:9    :4      :4           :2        :2/3    :2/3    :    :2000:   375:   :FAKETF :    $650:AAAA:  76 :  1:  2 :   :   :   
SEP 26-98:TTTTT:FT:      $2,620   :NW1LC:LAPS  :0300:1111:1330:3333:9    :6      :6           :6        :7/9Z   :6/7    :    :1000:   380:   :ROTHE  :        :BBBB:  86 :  3:  6 :   :   :   
SEP 28-98:UUUUU:FT:      $2,600   :FNWTC:LAPS  :0284:7777:1293:6666:7    :4      :3           :4        :5/4    :9/5H   :    :2000:  4080:   :ROTHE  :        :CCCC:  56 :  4:  0 :   :   :   
OCT 12-98:VVVVV:FX:      $6,600   :CCQ3Y:LAPS  :0282:4444:1301:0000:5    :5      :5           :4        :4/3H   :1/H    :    :3000:  2490:   :ROTHE  :  $3,000:DDDD:  50 :  6:  5 :   :   :   
OCT 19-98:WWWWW:FT:      $6,500   :CCQ3Y:LAPS  :0273:0101:1293:1010:7    :7      :4           :1        :3/2    :1/XX   :3333:5000:   800:   :ROTHE  :  $3,000:EEEE:  59 : 10:  1 :   :   :   
OCT 26-98:XXXXX:FT:      $6,250   :CCQ3Y:LAPS  :0291:0011:1294:0010:2    :5      :5           :3        :3/5H   :5/7T   :    :2000:   360:   :ROTHE  :    $300:FFFF:  41 :  5:  3 :   :   :   
JAN  6-98:YYYYY:ZZ:      $3,120   :CHGNW:LAPS  :0292:8888:1313:3333:2    :4/5    :4/7         :2/H      :1/1H   :1/T    :9999:1000:   130:   :VIKING :        :GGGG:  25 :  2:  9 :   :   :   

WHAT I GOT WITH MY BAD CODE:

SEP 19-98:SSSSS:FT:      $1,550   :NW1LC:LAPS  :0283:5555:1284:2222:9    :4    /            :4   /     :2   /     :2    /3    :2    /3  :    :2000:   375:   :FAKETF :       $650:AAAA:  76 :  1:  2 :   :   :   
SEP 26-98:TTTTT:FT:      $2,620   :NW1LC:LAPS  :0300:1111:1330:3333:9    :6    /            :6   /     :6   /     :7    /9Z   :6    /7  :    :1000:   380:   :ROTHE  :           :BBBB:  86 :  3:  6 :   :   :   
SEP 28-98:UUUUU:FT:      $2,600   :FNWTC:LAPS  :0284:7777:1293:6666:7    :4    /            :3   /     :4   /     :5    /4    :9    /5H :    :2000:  4080:   :ROTHE  :           :CCCC:  56 :  4:  0 :   :   :   
OCT 12-98:VVVVV:FX:      $6,600   :CCQ3Y:LAPS  :0282:4444:1301:0000:5    :5    /            :5   /     :4   /     :4    /3H   :1    /H  :    :3000:  2490:   :ROTHE  :     $3,000:DDDD:  50 :  6:  5 :   :   :   
OCT 19-98:WWWWW:FT:      $6,500   :CCQ3Y:LAPS  :0273:0101:1293:1010:7    :7    /            :4   /     :1   /     :3    /2    :1    /XX :3333:5000:   800:   :ROTHE  :     $3,000:EEEE:  59 : 10:  1 :   :   :   
OCT 26-98:XXXXX:FT:      $6,250   :CCQ3Y:LAPS  :0291:0011:1294:0010:2    :5    /            :5   /     :3   /     :3    /5H   :5    /7T :    :2000:   360:   :ROTHE  :       $300:FFFF:  41 :  5:  3 :   :   :   
JAN  6-98:YYYYY:ZZ:      $3,120   :CHGNW:LAPS  :0292:8888:1313:3333:2    :4    /       5    :4   /7    :2   /H    :1    /1H   :1    /T  :9999:1000:   130:   :VIKING :     $1,500:GGGG:  25 :  2:  9 :   :   :

If anybody could give me some direction, i would appreciate the help!

You where almost there already. Simply put together the output string step by step, using if(....)s to determine if you should put in the slashes or not. You will have to provide the field separators yourself, but that is easy, yes?:

/* somewhere in the middle of the code... */
output=""                                     // empty the output buffer

output=sprintf("%10s:", field1);              // first field, exactly 10 chars wide

if( field2 == "" && field3 == "" ) {
     output=sprintf("%s          :", output); // 10 blanks after the field1
} else {
     output=sprintf("%s %s / %s:", output, field2, field3); // otherwise "f2 / f3"
}
output=sprintf("%s%10s:", output, field4);    // add field 4 to output

...                                           // add many more fields the same way

printf("%s\n", output)                        // at last  put a newline char in and print it

bakunin

Thanks for the reply! I tried your suggestions by creating an awk program called blockoutput.awk using the following code to try merging just the first set of fields i want to merge ($7 and $28):

#TERMINAL CMD: awk -f ~/Desktop/blockoutput.awk ~/Desktop/input_data.txt
#TEST TO MERGE OUTPUT OF SPECIFIC FIELDS WITHIN RECORDS
#$7/$28, $8/$29, $9/$30, $10/$31, $11/$32

BEGIN {FS=":"; OFS=":"}
{ print $1, $2, $24, $23, $25, $17, $26, $4, $27, $5, $6
output=""									  				# empty the output buffer
field1=$7
field2=$28
output=sprintf("%10s:", field1)             				# first field, exactly 10 chars wide

if( field2 == "" && field3 == "" )
     output=sprintf("%s          :", output)  				# 10 blanks after the field1
 else 
     output=sprintf("%s %s / %s:", output, field1, field2) 	        # otherwise "f1 / f2"

output=sprintf("%s%10s:", output, field4);    				# add field 4 to output

#...                                           				# add many more fields the same way

printf("%s", output)                        				

print $33, $12, $13, $21, $14, $3, $34, $19, $22, $18, $20, $15, $16 }

Unfortunately, it did not work as i intended and returned the following with the records split and double entries for field7 and slashes on every line:

SEP 19-98:SSSSS:FT:      $1,550   :NW1LC:LAPS  :0283:5555:1284:2222:9    
     4    : 4     /             :          :    :2000:   375:   :FAKETF :       $650:AAAA:  76 :  1:  2 :   :   :   
SEP 26-98:TTTTT:FT:      $2,620   :NW1LC:LAPS  :0300:1111:1330:3333:9    
     6    : 6     /             :          :    :1000:   380:   :ROTHE  :           :BBBB:  86 :  3:  6 :   :   :   
SEP 28-98:UUUUU:FT:      $2,600   :FNWTC:LAPS  :0284:7777:1293:6666:7    
     4    : 4     /             :          :    :2000:  4080:   :ROTHE  :           :CCCC:  56 :  4:  0 :   :   :   
OCT 12-98:VVVVV:FX:      $6,600   :CCQ3Y:LAPS  :0282:4444:1301:0000:5    
     5    : 5     /             :          :    :3000:  2490:   :ROTHE  :     $3,000:DDDD:  50 :  6:  5 :   :   :   
OCT 19-98:WWWWW:FT:      $6,500   :CCQ3Y:LAPS  :0273:0101:1293:1010:7    
     7    : 7     /             :          :3333:5000:   800:   :ROTHE  :     $3,000:EEEE:  59 : 10:  1 :   :   :   
OCT 26-98:XXXXX:FT:      $6,250   :CCQ3Y:LAPS  :0291:0011:1294:0010:2    
     5    : 5     /             :          :    :2000:   360:   :ROTHE  :       $300:FFFF:  41 :  5:  3 :   :   :   
JAN  6-98:YYYYY:ZZ:      $3,120   :CHGNW:LAPS  :0292:8888:1313:3333:2    
     4    : 4     /        5    :          :9999:1000:   130:   :VIKING :     $1,500:GGGG:  25 :  2:  9 :   :   :   

Where did i go wrong?

You took what was meant as a code fragment for demonstration purposes and put it to work without trying to understand it. First off, you have the variables "field3" and "field4" nowhere assigned, so chances are they don't hold what they are supposed to hold.

Second, i just found a typo in the code, which you also haven't noticed:

output=sprintf("%s %s / %s:", output, field1, field2)

should of course be

output=sprintf("%s %s / %s:", output, field2, field3)

but as it was just to show you the mechanism you could write that completely different anyways. I didn't care if the 10 blanks i put in if the two fields were empty are really correct, maybe it's more or less. The whole purpose of the if()-statement is: "if the one field AND the other field is empty, we put blanks at the end of the output string, otherwise the content of the first field, then a slash, then the second field". THIS was, what the code tried to show you. The blanks are just there to maintain the column format of the output. Possible output should like:

field1="A" or empty
field2="B" or empty

... A/B ...  // fields non-empty - print them with a "/" in between
...    ...  // fields empty - print an equal amount of spaces

So, go over your code again, remove any output which is just "passed through" (all the fields which are just read and written without any modification) and analyze the output of the stripped-down program. Find out, where the program produces the desired results and where it doesn't. Change it accordingly. Only when you have solved the tricky parts put in the other fields again, one by one. Let the program run in the different stages of development to see if it still works the way you want it to work.

By making little steps instead of giant leaps between runs you can always isolate occurring problems as they are introduced. If you write 10 lines and they do what the should, then add another 5 lines and the program doesn't do what it should any more you will *know* already that the problem is within the 5 lines last written.

There is an old roman proverb, "divide et impera" (set apart and rule). The same is true for programming. Nobody is able to analyze, write or conceive huge amounts of code at once, make them into small portions and study one after the other.

It is like solving the problem "build a house": don't try to build a house, try to build a wall first, as a house hast four walls and a roof. Don't even try to build a wall, try to reduce building the wall to solving the problem "lay one brick onto the other" - now, *this* is a manageable problem, which is easily solved, so "build a wall" is a long succession of "lay one brick onto the other"-problems. From there, develop the next step, say "how to combine 2 walls to form an edge", etc., etc., until you have a "build a house"-program, which is now a large array of little solveable problems, which are solved separately.

bakunin

There's alot i do not understand about the AWK language, or computer programming for that matter :slight_smile: Thanks for the schooling.

The typo you found in my code, was what i thought was a typo in your original demonstration framework code, since as you pointed out field3 was never defined, i thought you had originally made a typo.

I'm going to trash the program and start over as i do not understand most of how it works.

For my next attempt at this problem i tried:


#TERMINAL CMD: awk -f ~/Desktop/blockoutput.awk ~/Desktop/data_file.txt
#TEST TO MERGE OUTPUT OF SPECIFIC FIELDS
#$7/$28, $8/$29, $9/$30, $10/$31, $11/$32

BEGIN {FS=":"}

{ printf("%-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s:", $1, $2, $24, $23, $25, $17, $26, $4, $27, $5, $6)
col7=$7
col28=$28
if( col28 !="            ")
     printf("%-s/%-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s\n", col7,col28, $33, $12, $13, $21, $14, $3, $34, $19, $22, $18, $20, $15, $16)
 else 
     printf("%-s%-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s: %-s\n", col7,col28, $33, $12, $13, $21, $14, $3, $34, $19, $22, $18, $20, $15, $16) }

Which Returns:

SEP 19-98: SSSSS: FT:       $1,550   : NW1LC: LAPS  : 0283: 5555: 1284: 2222: 9    :4                :     : 2000:    375:    : FAKETF :        $650: AAAA:   76 :   1:   2 :    :    
SEP 26-98: TTTTT: FT:       $2,620   : NW1LC: LAPS  : 0300: 1111: 1330: 3333: 9    :6                :     : 1000:    380:    : ROTHE  :            : BBBB:   86 :   3:   6 :    :    
SEP 28-98: UUUUU: FT:       $2,600   : FNWTC: LAPS  : 0284: 7777: 1293: 6666: 7    :4                :     : 2000:   4080:    : ROTHE  :            : CCCC:   56 :   4:   0 :    :    
OCT 12-98: VVVVV: FX:       $6,600   : CCQ3Y: LAPS  : 0282: 4444: 1301: 0000: 5    :5                :     : 3000:   2490:    : ROTHE  :      $3,000: DDDD:   50 :   6:   5 :    :    
OCT 19-98: WWWWW: FT:       $6,500   : CCQ3Y: LAPS  : 0273: 0101: 1293: 1010: 7    :7                : 3333: 5000:    800:    : ROTHE  :      $3,000: EEEE:   59 :  10:   1 :    :    
OCT 26-98: XXXXX: FT:       $6,250   : CCQ3Y: LAPS  : 0291: 0011: 1294: 0010: 2    :5                :     : 2000:    360:    : ROTHE  :        $300: FFFF:   41 :   5:   3 :    :    
JAN  6-98: YYYYY: ZZ:       $3,120   : CHGNW: LAPS  : 0292: 8888: 1313: 3333: 2    :4    /       5    : 9999: 1000:    130:    : VIKING :      $1,500: GGGG:   25 :   2:   9 :    :    

EXPECTED OUTPUT:

---$1----:--$2-:24:-------$23-----:--$25:-$17--:-$26:-$4-:-$27:-$5-:--$6-:-$7/$28:---$8/$29---:--$9/$30-:$10/$31:$11/$32:-$33:-$12:-$13--:$21:--$14--:----$3--:-$34:-$19-:$22:-$18:$20:$15:$16
		
SEP 19-98:SSSSS:FT:      $1,550   :NW1LC:LAPS  :0283:5555:1284:2222:9    :4      :4           :2        :2/3    :2/3    :    :2000:   375:   :FAKETF :    $650:AAAA:  76 :  1:  2 :   :   :   
SEP 26-98:TTTTT:FT:      $2,620   :NW1LC:LAPS  :0300:1111:1330:3333:9    :6      :6           :6        :7/9Z   :6/7    :    :1000:   380:   :ROTHE  :        :BBBB:  86 :  3:  6 :   :   :   
SEP 28-98:UUUUU:FT:      $2,600   :FNWTC:LAPS  :0284:7777:1293:6666:7    :4      :3           :4        :5/4    :9/5H   :    :2000:  4080:   :ROTHE  :        :CCCC:  56 :  4:  0 :   :   :   
OCT 12-98:VVVVV:FX:      $6,600   :CCQ3Y:LAPS  :0282:4444:1301:0000:5    :5      :5           :4        :4/3H   :1/H    :    :3000:  2490:   :ROTHE  :  $3,000:DDDD:  50 :  6:  5 :   :   :   
OCT 19-98:WWWWW:FT:      $6,500   :CCQ3Y:LAPS  :0273:0101:1293:1010:7    :7      :4           :1        :3/2    :1/XX   :3333:5000:   800:   :ROTHE  :  $3,000:EEEE:  59 : 10:  1 :   :   :   
OCT 26-98:XXXXX:FT:      $6,250   :CCQ3Y:LAPS  :0291:0011:1294:0010:2    :5      :5           :3        :3/5H   :5/7T   :    :2000:   360:   :ROTHE  :    $300:FFFF:  41 :  5:  3 :   :   :   
JAN  6-98:YYYYY:ZZ:      $3,120   :CHGNW:LAPS  :0292:8888:1313:3333:2    :4/5    :4/7         :2/H      :1/1H   :1/T    :9999:1000:   130:   :VIKING :        :GGGG:  25 :  2:  9 :   :   :   

My code only merges the first field set $7/$28 right now. Do i have to continue doing If-Else statements to merge the next set of fields i want to merge for output like: $8/$29 $9/$30 $10/$31 $11/$32?

Is there a better way to do what i am trying to do?