Convert two column data into 8 columns

NickC · December 5, 2005, 8:46am

Apologies if this has been covered - I did search but couldn't find what I was looking for.

I have a simple X-Y input file. I want to convert it from two columns into 8 columns - 4 pairs of X-Y data. So my input file looks like

X1 Y1
X2 Y2
X3 Y3
X4 Y4
X5 Y5
etc

And I want it to look like this

X1 Y1 X2 Y2 X3 Y3 X4 Y4
X5 Y5 X6 Y6 etc.

I would prefer this to be in a specific format - 8 characters per column. But if that is not possible, each field can be separated with commas.

I thought awk with a printf would be the best way to make this happen, but I can't get it to work.

Unbeliever · December 5, 2005, 10:10am

awk '{printf("%8s%8s",$1,$2);a+=1;if ( a%4 == 0) print "";}' inputfile.txt

NickC · December 5, 2005, 10:50am

Worked like a charm - thanks!! Could you briefly explain what the stuff following the printf means? With printf, it seems you're printing columns 1 and 2 side by side, 8 characters per column. Then it seems you're repeating this 4 times to get the next 3 sets of data. But how does it create a new line once 4 sets of data have been printed across?

Unbeliever · December 5, 2005, 11:07am

Here is the code spread out a bit more for readability, with a line number preceding each line so I can refer to them.

1: printf("%8s%8s",$1,$2);
2: a+=1;
3: if ( a%4 == 0)
4:   print "";

In this case the whole piece of code is run for each line of input.

Line 1:
This prints out columns 1 and 2 ($1 and $2) of the input line ... the '%8s' means each one gets printed in an 8 character wide column, padded on the left with spaces (i.e. right-justified).

Line 2:
Once the line has been read and the first 2 columns have been printed we increment a counter by one.

Line 3:
This tests whether our counter variable (the one we just incremented) is a multiple of 4 or not. The '%' operator returns the remainder after dividing the 1st argument (a) by the second (4).

Line 4:
So if the remainder of "a/4" is zero (ie. it is divisible by 4) then we print a new line. The reason this works is that the 'print' command always prints a new line character as opposed to the printf command which doesn't. This has the effect of printing a new line every 4 lines of input.
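To see the effect concretely, here is a small sketch that runs the same logic on six made-up lines. The sample data and the END block are additions for illustration, not part of the original one-liner: since 6 is not a multiple of 4, the last output row would otherwise be left without a terminating newline, so the END block adds one.

```shell
# Sample input: six X-Y pairs (made-up values for illustration).
input='1 10
2 20
3 30
4 40
5 50
6 60'

# Same logic as the four numbered lines, plus an END block (an addition)
# that terminates the last row when the line count is not a multiple of 4.
result=$(printf '%s\n' "$input" | awk '
{
    printf("%8s%8s", $1, $2)   # right-justify each field in 8 characters
    a += 1                     # count input lines seen so far
    if (a % 4 == 0)            # every 4th line, end the output row
        print ""
}
END { if (a % 4 != 0) print "" }')

printf '%s\n' "$result"
```

The first output row holds the first four pairs (8 fields of 8 characters each) and the second row holds the remaining two pairs.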

Hope that helps ...

NickC · December 5, 2005, 12:27pm

Thanks for the clear explanation!

futurelet · December 5, 2005, 8:16pm

Why "a+=1"?
Awk already has NR.

Also, Awk isn't C. No ";" is needed after 'print ""'.

awk '{printf "%8s%8s",$1,$2;if ( NR%4 == 0) print ""}'
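The original question also mentioned comma-separated output as a fallback. A minimal sketch of that variant, again keyed on NR, might look like the following; the sample data is made up, and the END block (which finishes a partial last row) is an addition rather than something from the posts above.

```shell
# Hypothetical sample data: five X-Y pairs.
csv=$(printf '1 10\n2 20\n3 30\n4 40\n5 50\n' | awk '
{
    # Put a comma before every pair except the first one in a row.
    printf "%s%s,%s", (NR % 4 == 1 ? "" : ","), $1, $2
    if (NR % 4 == 0) print ""      # row complete after 4 input lines
}
END { if (NR % 4 != 0) print "" }  # finish a partial last row
')

printf '%s\n' "$csv"
```

Printing the separator before each pair, rather than after it, avoids a trailing comma when the last row is incomplete.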

Unbeliever · December 6, 2005, 10:13am

I normally do stuff in Perl, not awk ... so I always forget about awk's built-in variables.

And yes ... awk is not C ... the final command termination is implicit. It doesn't do any harm to put it in, though. You also avoid questions like: "You terminated all your commands with ';' except the last one ... why?"

I replied with an awk script since that was what the original poster had tried.

ch.siva · June 25, 2008, 5:16am

Hey, we can write a simple script for that. We just need the maximum number of columns in the file,
then run the command below for each column from 1 to n (n is the maximum number of columns in the file, and $file is your filename):
for i in $(seq 1 "$n"); do
  cut -d' ' -f"$i" "$file" | paste -s - >> tmp
done
[ paste -s joins its input lines serially, so we take column 1, column 2, column 3, ... column n in turn. ]
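As a hedged illustration of what this loop produces, here is a sketch on a three-line sample file; the scratch directory, file names, and data are made up for the example. Note that the result is a transpose (all X values on one row, all Y values on the next), which is a different layout from the interleaved X-Y pairs asked for at the top of the thread.

```shell
# Work in a scratch directory so nothing existing is overwritten.
dir=$(mktemp -d)
printf 'X1 Y1\nX2 Y2\nX3 Y3\n' > "$dir/in.txt"

n=2                                  # number of columns in the sample file
i=1
while [ "$i" -le "$n" ]; do
    # Column i of every line, joined into a single tab-separated row.
    cut -d' ' -f"$i" "$dir/in.txt" | paste -s - >> "$dir/tmp"
    i=$((i + 1))
done

transposed=$(cat "$dir/tmp")
printf '%s\n' "$transposed"
rm -r "$dir"
```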

summer_cherry · June 28, 2008, 11:19am

cat a | paste - - - - | sed 's/\t/ /g'
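Since paste joins with tabs by default, one possible shortcut is to ask it for a space delimiter directly with -d. The sketch below uses made-up sample data with eight lines so that both rows are complete; it does not pad fields to 8 characters, so it matches the plain space-separated layout rather than the fixed-width one.

```shell
# Each "-" reads the next line from standard input, so four of them
# merge every four consecutive input lines into one output row.
pasted=$(printf 'X1 Y1\nX2 Y2\nX3 Y3\nX4 Y4\nX5 Y5\nX6 Y6\nX7 Y7\nX8 Y8\n' |
    paste -d' ' - - - -)

printf '%s\n' "$pasted"
```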