Add new columns based on existing columns

sam_2921 · April 16, 2015, 5:52am

Hi all,

I am kind of stuck with printing my desired output. Please help me if you know how it can work.

My input file (tab separated):

NW_0068.1  41,16   100,900
NW_0699.1  4,2,19  200,700,800

My Output file (desired):

NW_0068.1  41,16   100,900  100 - 141
NW_0068.1  41,16   100,900  900 - 916
NW_0699.1  4,2,19  200,700,800  200 - 204
NW_0699.1  4,2,19  200,700,800  700 - 702
NW_0699.1  4,2,19  200,700,800  800 - 819

My current code :

awk 'BEGIN {FS = "\t" }
           {
            block_size = $2 ; split(block_size, b, ","); 
            start_pos = $3  ; split(start_pos, s, ",") ; 
                  {for (i in b)
                       ex_size = s[i] + b[i]
                       print $0, s[i],"-",ex_size
                  } 
           }' test_input.txt

My current output:

NW_0068.1  41,16   100,900 900 - 916
NW_0699.1  4,2,19  200,700,800 800 - 819

When I print the value of "i" within the loop, it does show 1,2 for the first record and 1,2,3 for the second, but only the last value is printed in the final output. Any hints?

Also, it doesn't make any difference whether I print the output inside or outside of the for loop. Why is that?

Thanks!

RudiC · April 16, 2015, 7:04am

Please use code tags for data as well! In the post above, the tabs in the data are lost.

Your braces are misplaced. Try moving the opening brace that precedes for (i in b) to just after it, so that both the assignment and the print belong to the loop body.
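To illustrate the brace point: without braces, only the single statement that follows for (...) is the loop body, so print runs once after the loop has finished, with the values from the last iteration. That is presumably also why moving the print "inside or outside" made no difference. A minimal sketch, assuming a made-up record a,b,c and a counting loop so that the order is predictable:

```shell
# Without braces: only the assignment is the loop body; print runs once.
no_braces=$(printf 'a,b,c\n' | awk '{
    n = split($0, x, ",")
    for (i = 1; i <= n; i++)
        last = x[i]
    print last
}')

# With braces: both statements run on every iteration.
with_braces=$(printf 'a,b,c\n' | awk '{
    n = split($0, x, ",")
    for (i = 1; i <= n; i++) {
        last = x[i]
        print last
    }
}')

echo "$no_braces"
echo "$with_braces"
```

The first command prints only the final element; the second prints one line per element.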

Scrutinizer · April 16, 2015, 7:59am

In addition to what RudiC wrote, the iteration order of a for (variable in array) loop is not defined. Instead, you could use a counting loop over the number of elements returned by split:

awk '
       BEGIN {
           FS = OFS= "\t" 
       }
       {
           block_size = $2 ; m=split(block_size, b, ","); 
           start_pos = $3  ; split(start_pos, s, ",") ; 
           for (i=1; i<=m; i++) {
               ex_size = s[i] + b[i]
               print $0, s[i],"-",ex_size
           } 
       }
' file
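For reference, a sketch of the same approach run on the first sample record. Two things here are assumptions rather than part of the code above: the input is produced with printf instead of being read from a file, and the range is built as a single field ("100 - 141") so that it matches the desired output, whereas separating the pieces with commas in print would put a tab between each of them.

```shell
result=$(printf 'NW_0068.1\t41,16\t100,900\n' | awk '
    BEGIN { FS = OFS = "\t" }
    {
        m = split($2, b, ",")      # block sizes
        split($3, s, ",")          # start positions
        for (i = 1; i <= m; i++)
            print $0, s[i] " - " (s[i] + b[i])   # range as one field
    }
')
echo "$result"
```

This emits one output line per block, each consisting of the original record followed by a tab and the start - end range.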

sam_2921 · April 20, 2015, 4:28am

Many thanks for your feedback... RudiC and Scrutinizer!