How to split one long column into multiple rows of 3 values each?

nengcheng · November 21, 2019, 1:09pm

I have a large CSV dataset like this:

A	value1
A	value2
A	value3
B	value1
B	value2
B	value3
C	value1
C	value2
C	value3

The output I expect is:

A	value1	value2	value3
B	value1	value2	value3
C	value1	value2	value3

I'm thinking of using something like awk or column, but haven't found a proper way to do that.
Could anyone give me some clues?

RudiC · November 21, 2019, 2:23pm

Try

awk 'LAST != $1 {printf "%s%s", DL, $0; LAST = $1; DL = RS; next}; {printf "\t%s", $2} END {printf RS}' file
A    value1    value2    value3
B    value1    value2    value3
C    value1    value2    value3

nengcheng · November 22, 2019, 10:49am

Thank you, RudiC. That's a cool solution, and it also works for more complex situations.

MadeInGermany · February 26, 2020, 4:37pm

Nice solution.
For symmetry (and to handle the edge case of an empty input file), the last part should be END {printf DL}.
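
A commented sketch of the one-liner with that END change applied, wrapped in a shell function. The name group_rows is only picked for this example, and the sketch assumes that rows sharing a key are adjacent in the input. It uses printf "%s", DL rather than printf DL so that the separator is never interpreted as a format string:

```shell
# group_rows: hypothetical helper name; reads "key value" lines on stdin.
# Assumption: all rows with the same key are adjacent in the input.
group_rows() {
  awk '
    # New key: close the previous row (DL is empty before the first row),
    # then start a new row with the whole input line.
    $1 != LAST { printf "%s%s", DL, $0; LAST = $1; DL = RS; next }
    # Same key as the previous line: append only the value column.
    { printf "\t%s", $2 }
    # Terminate the last row; DL is still empty if there was no input.
    END { printf "%s", DL }
  '
}

# Sample data modelled on the question (tab-separated).
printf 'A\tvalue1\nA\tvalue2\nA\tvalue3\nB\tvalue1\nB\tvalue2\nB\tvalue3\n' | group_rows
```

With an empty input this prints nothing at all, whereas END {printf RS} would still emit a lone newline.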

drl · February 26, 2020, 7:29pm

Hi.

I've often wondered why there hasn't been a utility (or a mode in join) to do this kind of operation -- a self-join, as it were ... cheers, drl