Awk formatting of a data file - nested for loops?

catwoman · July 4, 2008, 12:17am

Hello - is there any way in awk I can do...

4861 x(1) y(1) z(1)
4959 x(1) y(1) z(1)
5007 x(1) y(1) z(1)
4861 x(2) y(2) z(2)
4959 x(2) y(2) z(2)
5007 x(2) y(2) z(2)
4861 x(3) y(3) z(3)
4959 x(3) y(3) z(3)
5007 x(3) y(3) z(3)

to become...
4861 x(1) y(1) z(1) 4861 x(2) y(2) z(2) 4861 x(3) y(3) z(3)
4959 x(1) y(1) z(1) 4959 x(2) y(2) z(2) 4959 x(3) y(3) z(3)
5007 x(1) y(1) z(1) 5007 x(2) y(2) z(2) 5007 x(3) y(3) z(3)

In order to do this I've tried printing all lines associated with the 4861 measurement first, and then the same with 4959 and 5007, and was then going to merge them. For the first part, this is what I've got so far...

{for (k=1;k<4;k++) {
for (j=1+k;j<11;j=j+3) {
if (NR == j)
printf("%s\n", $0)
}
}}

...but this just prints out the lines in their original order. Is that because awk processes its input line by line, so I can't dictate the order in which lines get printed?

The only other thing I can think to do is merge all lines in this array, and then print the various fields I want to be on the same line.

Cheers

Annihilannic · July 4, 2008, 12:38am

You could accumulate the data in an array indexed by the first field, then print them out at the end, e.g.

awk '
        { a[$1]=a[$1]$0" " }
        END { for (i in a) { print a[i] } }
' inputfile > outputfile

catwoman · July 4, 2008, 2:21am

Ok that's brilliant thanks.

But what if the initial wavelengths sometimes varied, so you might get:

4862 x(1) y(1) z(1)
4958 x(1) y(1) z(1)
5007 x(1) y(1) z(1)
4860 x(2) y(2) z(2)
4959 x(2) y(2) z(2)
5007 x(2) y(2) z(2)
4861 x(3) y(3) z(3)
4959 x(3) y(3) z(3)
5008 x(3) y(3) z(3)

for example. So you couldn't match the lines on the value in $1 because it isn't the same each time, and would have to match on the line number instead. Could you do it this way?

Annihilannic · July 4, 2008, 2:27am

In that case index the array by NR%3, or by pulling out the first value in brackets.
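
A rough sketch of the line-number approach, assuming every block has exactly the same number of lines. The variable `n` (lines per block) is a name I've introduced, and I've used `(NR - 1) % n` rather than plain `NR % 3` so the keys run 0, 1, 2 and can be printed in order with an ordinary loop:

```shell
# Recreate the varying-wavelength sample from the question
cat > inputfile <<'EOF'
4862 x(1) y(1) z(1)
4958 x(1) y(1) z(1)
5007 x(1) y(1) z(1)
4860 x(2) y(2) z(2)
4959 x(2) y(2) z(2)
5007 x(2) y(2) z(2)
4861 x(3) y(3) z(3)
4959 x(3) y(3) z(3)
5008 x(3) y(3) z(3)
EOF

awk -v n=3 '
        # k is 0 for lines 1, 4, 7; 1 for lines 2, 5, 8; 2 for lines 3, 6, 9
        { k = (NR - 1) % n; a[k] = ((NR > n) ? a[k] " " : "") $0 }
        # Print one merged row per position within a block, in order
        END { for (k = 0; k < n; k++) print a[k] }
' inputfile > outputfile
```

If the blocks can differ in length, this grouping no longer lines up, and keying on the value in brackets would be the safer route.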