Awk: Combine multiple lines based on number of fields

Suppose a file contains comma-delimited data like the following:

1,2,3,4
1
1
1,2,3,4
1,2
2
2,3,4

The required output must have exactly 4 comma-delimited columns per line:

1,2,3,4
111,2,3,4
1,222,3,4

I have tried many awk commands using ORS="" but couldn't make progress.

Here is a solution

awk '{buf=buf $0} (split(buf,a,",")>=4) {print buf; buf=""}' file

or, using awk's automatic field splitting:

awk -F, '{buf=buf $0; c+=NF} (c>=4) {print buf; buf=""; c=0}' file

Or

awk -F, '{while (NF < 4 && (getline X) > 0) $0 = $0 X} 1' file
1,2,3,4
111,2,3,4
1,222,3,4
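
For readers who want to see what the buffer-based one-liners do step by step, here is a minimal sketch of the split() variant written out with comments. The function name join_to_four and the END block that flushes a trailing incomplete record are additions for illustration, not part of the answers above.

```shell
# Hypothetical helper: reads broken records on stdin, prints 4-field records.
join_to_four() {
    awk '
    {
        buf = buf $0                   # glue the current line onto the buffer
    }
    split(buf, parts, ",") >= 4 {      # the buffer now holds at least 4 fields
        print buf
        buf = ""                       # start collecting the next record
    }
    END {
        if (buf != "") print buf       # assumption: keep an incomplete last record
    }
    '
}

# Sample data from the question.
printf '%s\n' '1,2,3,4' '1' '1' '1,2,3,4' '1,2' '2' '2,3,4' | join_to_four
```

Because split() is re-run on the whole buffer each time, the field count is always that of the joined text rather than a running sum over the fragments.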

With sed

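sed '
:loop
# a third comma means the line already has 4 fields: print it and start over
s/,/,/3
t
# otherwise append the next input line, unless this is the last one
$!N
# remove the embedded newline and test the joined line again
s/\n//
t loop
' file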

Hello mdkm,

The following may help you as well.
1st code:

awk -F, '{ORS=NF<4?"":"\n";} 1;END{if(NF<4){print "\n"}}' Input_file

2nd code:

awk -F, 'NR==1&&NF==4{print;next}NF<4{A=A?A $0:$0} NF==4{if(A){print A $0;A=""}} END{if(A){print A}}' OFS=,  Input_file

Thanks,
R. Singh

@Ravinder: Both of these methods will fail if the last line contains 4 fields...
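
One way to see this kind of failure is to feed a record that is split over three lines, followed by a complete 4-field line, to the ORS-based command and to a field-counting command side by side. This is only a sketch: the function names ors_join, count_join and sample are made up for the comparison, and the ORS command is a reformatted copy of the one posted above.

```shell
# Reformatted copy of the ORS-based one-liner.
ors_join() {
    awk -F, '
    { ORS = (NF < 4) ? "" : "\n" }     # suppress the newline after short lines
    1
    END { if (NF < 4) print "\n" }
    '
}

# Field-counting approach: append until the buffer splits into 4 fields.
count_join() {
    awk '{ buf = buf $0 } split(buf, a, ",") >= 4 { print buf; buf = "" }'
}

# A record broken over three lines, then a complete 4-field last line.
sample() { printf '%s\n' '1,2' '2' '2,3,4' '1,2,3,4'; }

sample | ors_join      # the complete last line is glued onto the previous record
sample | count_join    # two separate 4-field records
```

The ORS approach only ends a record when it sees a line that has 4 fields by itself, so it cannot tell where a record assembled from fragments stops.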

Hello Scrutinizer,

When I use the following input file:

cat Input_file
2,3,4
1
1
1,2,3,4
1,2
2
2,3,4
5,6,7,8
1
2
3
4
5
6,7,8
9,10,11,12
 

The following commands give the results.

 awk -F, '{ORS=NF<4?"":"\n";} 1;END{if(NF<4){print "\n"}}'  Input_file
 

Output will be as follows.

 awk -F, 'NR==1&&NF==4{print;next}NF<4{A=A?A $0:$0} NF==4{if(A){print A $0;A=""}} END{if(A){print A}}' OFS=,  Input_file
 

It follows the same pattern as the user's requested output, but admittedly it does not give four fields in each output line.

EDIT: As Scrutinizer mentioned in his next post (#8), the above solutions do not properly handle the OP's case: they append text following the same pattern, but they do not enforce the 4-field condition. I missed this because I used a different Input_file from the OP's.

Thanks,
R. Singh

Indeed they happen to work with the OP's input, but as is, they are not proper solutions...


With the following input file:

1,2,3,4
1
1
1,2,3,4
1,2
2
2,3,4
1,2,3,4
1
1
1
1
1
1,2,3,4
1,2,3,4

The second suggestion does not seem to work properly:

$ awk -F, '{buf=buf $0; c+=NF} (c>=4) {print buf; buf=""; c=0}' file
1,2,3,4
111,2,3,4
1,222,3,4
1,2,3,4
1111
11,2,3,4
1,2,3,4

The first suggestion works as expected:

$ awk '{buf=buf $0} (split(buf,a,",")>=4) {print buf; buf=""}' file 
1,2,3,4
111,2,3,4
1,222,3,4
1,2,3,4
111111,2,3,4
1,2,3,4

Correction

awk -F, '(NF>0) {buf=buf $0; c+=(NF-1)} (c>=3) {print buf; buf=""; c=0}' file
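
The corrected command counts separators instead of fields: a fragment with NF fields contributes NF-1 commas, and a full record has 3 commas. A commented sketch of the same logic follows; the function name join_by_commas and the variable names are chosen here for readability and are not from the post.

```shell
# Hypothetical helper: same logic as the corrected one-liner, spelled out.
join_by_commas() {
    awk -F, '
    NF > 0 {                  # ignore empty lines
        buf = buf $0          # append the fragment to the pending record
        commas += NF - 1      # a line with NF fields holds NF-1 separators
    }
    commas >= 3 {             # 3 separators means 4 fields
        print buf
        buf = ""
        commas = 0
    }
    '
}

# The longer test input used earlier in the thread.
printf '%s\n' '1,2,3,4' '1' '1' '1,2,3,4' '1,2' '2' '2,3,4' '1,2,3,4' \
    '1' '1' '1' '1' '1' '1,2,3,4' '1,2,3,4' | join_by_commas
```

Summing NF itself overcounts, because every comma-free fragment still counts as one field; four such fragments in a row reach 4 without containing a single separator, which is where the stray 1111 line came from.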

A bit of crypto-golf for fun :)

awk -F, '{p=$0=p $0} NF>3 && !(p=x)' file
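
The condensed one-liner relies on two awk behaviours: assigning to $0 makes awk re-split the record and recompute NF, and a pattern without an action prints the record. Written out longhand it might look like the sketch below, which assumes that a complete record means at least four fields; the function name join_longhand and the variable carry are illustrative only.

```shell
# Hypothetical longhand version of the golfed command.
join_longhand() {
    awk -F, '
    {
        $0 = carry $0     # prepend the carried-over text; awk re-splits $0
        carry = $0        # remember the joined text in case it is still short
    }
    NF >= 4 {             # complete record: explicit four-field test
        carry = ""        # nothing left to carry over
        print
    }
    '
}

printf '%s\n' '1,2,3,4' '1' '1' '1,2,3,4' '1,2' '2' '2,3,4' | join_longhand
```

In the golfed form, p=x assigns the never-set (hence empty) variable x to p, so the expression is false and the negation makes the pattern true while clearing the buffer in the same step.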