Append in one line

Hello,

I have a very large file containing the data below:

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,
,,,,23,
,,,,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,
,,,,47,
,,,,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,
,,,,23,
,,,,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,
,,,,23,
,,,,21,

I want the output to look like this:

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,23,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,47,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,23,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,23,21

That is, every line starting with ',,,,' should be appended to the previous line.

I tried using sed to replace ',,,,' with two backspaces, but it's not working.
Please help!

awk 'NR>1&&/^sub/{printf "\n"}{printf "%s",$0}END{print ""}' infile

Use nawk instead of awk if you run Solaris/SunOS:

nawk 'NR>1&&/^sub/{printf "\n"}{printf "%s",$0}END{print ""}' infile

The above commands give as output:

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,,,,,23,,,,,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,,,,,47,,,,,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,,,,,23,,,,,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,,,,,23,,,,,21,

Try this one:

awk -F, '!$1{$0=$(NF-1) FS}{ORS=NR%3?"":RS}1' file

I don't see why this part is needed:

!$1{$0=$(NF-1) FS}

In that case, this should be sufficient:

awk '{ORS=NR%3?z:RS}1' infile

but it cannot handle files where the number of lines between the ^sub occurrences varies (sometimes 2 lines in between, sometimes more, sometimes fewer).

Still, in the given example the interval appears to be constant.
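
To illustrate the limitation, here is a small sketch with made-up input in which each record spans two lines instead of three (the field values are hypothetical, not from the original file). The NR%3 rule then joins lines that belong to different records:

```shell
# Hypothetical input: each record spans 2 lines, not 3.
input='a,1,
,,2,
b,3,
,,4,'

# Same NR%3 rule as above, with "" written out instead of the unset variable z.
result=$(printf '%s\n' "$input" | awk '{ORS=NR%3?"":RS}1')
printf '%s\n' "$result"
```

This prints a,1,,,2,b,3, on the first line and ,,4, on the second, so records a and b end up merged.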

This removes the 4 leading field separators of a record if the first field is empty:

!$1{$0=$(NF-1) FS}
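
A quick way to check this is to feed a single continuation line through the same rule and print NF first. This is only a sketch; the extra print statements are added for illustration:

```shell
# With -F, the record ",,,,23," splits into six fields: four empty ones,
# "23", and an empty last field caused by the trailing comma.
# $(NF-1) is therefore "23", and the record is rebuilt as "23,".
result=$(printf '%s\n' ',,,,23,' | awk -F, '!$1 { print NF; $0 = $(NF-1) FS; print }')
printf '%s\n' "$result"
```

It prints 6 and then 23, which shows that the four leading separators are gone.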

This:

awk '{ORS=NR%3?z:RS}1' infile

gives the same output as your first solution.

I noticed you didn't seem to get a response that addressed the case of the logical CSV line being broken over some number other than 3 lines.
The code I provide works with the logical CSV broken across any number of lines as long as the continued lines start with a comma and have the same format.

Here is a sed version:

sed -ne '1{h;d};/^,/!{x;p};/^,/{s/,*\([^,]*,\)$/\1/;H;x;s/\n//g;x};${x;p}' append_csvlines.input

Running the awk script (source provided below):

./append_csvlines.awk append_csvlines.input

Output of the awk script:

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,23,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,47,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,23,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,23,21,

Example input you supplied.

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,
,,,,23,
,,,,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,
,,,,47,
,,,,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,
,,,,23,
,,,,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,
,,,,23,
,,,,21,

Example output that you requested.

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,23,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,47,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,23,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,23,21

Single line awk command.

awk 'BEGIN{FS=","} $1&&NR>1{$0=RS$0} !$1{$0=$(NF-1)FS} {printf "%s",$0} END{printf RS}' append_csvlines.input

The output of this script differs from your example output by a comma on the final output line. I assumed that was a typo on your part.

subsD,00 05 02 70 DB 3D 4A B8 47 5B 38 00,919030055007,,22,23,21,
subsD,00 05 02 FD DE 3D 4A B8 47 5D 38 00,919030055007,,23,47,49,
subsD,00 05 02 BA 01 3E 4A B8 47 67 38 00,919030055007,,22,23,21,
subsD,00 05 02 F8 0F 3E 4A B8 47 6C 38 00,919030055007,,22,23,21,

Here is the awk script append_csvlines.awk. You have to make it executable with chmod +x before you can run it directly.

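#!/usr/bin/awk -f
# Join continuation lines (empty first field) onto the preceding record.
BEGIN{FS=","}
# A new record after the first one: start it on a new line.
$1&&NR>1{$0=RS$0}
# A continuation line: keep only its value and the trailing comma.
!$1{$0=$(NF-1)FS}
# Print without a newline; "%s" keeps any % in the data literal.
{printf "%s",$0}
# Terminate the last output line.
END{printf RS}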

You can turn the sed command into a shell script as well:

#!/bin/sed -nf
1{h;d}
/^,/!{x;p}
/^,/{s/,*\([^,]*,\)$/\1/;H;x;s/\n//g;x}
${x;p}
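
For anyone who finds the sed commands hard to read, here is the same logic spread over several lines with comments. It is a sketch run on made-up input (the records a, b and c are hypothetical) where the number of continuation lines varies between 2, 0 and 1:

```shell
# Hypothetical sample: records with 2, 0 and 1 continuation lines.
input='a,x,1,,22,
,,,,23,
,,,,21,
b,y,2,,30,
c,z,3,,40,
,,,,41,'

result=$(printf '%s\n' "$input" | sed -n '
# First line: store it in the hold space and read the next line.
1 {
    h
    d
}
# A line that starts a new record: swap so the finished record is in the
# pattern space, print it, and leave the new line in the hold space.
/^,/! {
    x
    p
}
# A continuation line: strip the leading commas, append it to the held
# record, then delete the newline that H inserted.
/^,/ {
    s/,*\([^,]*,\)$/\1/
    H
    x
    s/\n//g
    x
}
# Last line: print whatever record is still held.
$ {
    x
    p
}
')
printf '%s\n' "$result"
```

Note that, as in the one-liner, an input consisting of a single line prints nothing, because d ends the cycle before the last-line rule is reached.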

Both the sed and awk versions follow a similar pattern: the first and last lines are special cases, and you have to output the newlines manually.
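
One possible way to avoid placing the newlines by hand is to collect each record in a variable and print it only when the next record starts. This is a sketch of an alternative, not the approach posted above; the variable name buf and the sample records are made up for illustration:

```shell
# Hypothetical sample: records with 2, 0 and 1 continuation lines.
input='a,x,1,,22,
,,,,23,
,,,,21,
b,y,2,,30,
c,z,3,,40,
,,,,41,'

result=$(printf '%s\n' "$input" | awk -F, '
    # Continuation line: first field empty. Append its value plus a comma.
    $1 == "" { buf = buf $(NF-1) FS; next }
    # New record: flush the previous one, if any, then start a new buffer.
    NR > 1   { print buf }
             { buf = $0 }
    # Flush the last record, unless the input was empty.
    END      { if (NR) print buf }
')
printf '%s\n' "$result"
```

Comparing $1 with "" instead of testing !$1 also keeps a record whose first field is 0 from being treated as a continuation line.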