Joining broken lines

I have a plain text file with the delimiter '[|]'. In this file some lines are broken in two. The first part of a broken line has 6 columns and the second part has 4. The two parts are always consecutive.
I want to join the two consecutive lines that have 6 fields and 4 fields respectively. A complete line has 12 fields and ends with '{}'; the second half of a broken line also ends with '{}'.

Example of a broken line:

1[|]2[|]3[|]4[|]5[|]6[|]7
[|]a[|]b[|]c[|]d{}

A correct line:

1[|]2[|]3[|]4[|]5[|]6[|]7[|]a[|]b[|]c[|]d{}

Can anyone please help me sort this out?

If {} marks the end of each line, try this:

 
perl -0lne 's/\n//g;print "$1\n" while /(.*?{})/g' inputfile

Thanks a lot. It worked. Could you please explain each part of the code, if you are not too busy?

---------- Post updated at 06:05 AM ---------- Previous update was at 06:04 AM ----------

Also, it would be great if there were a way to join the lines by checking the delimiter count.

tr -d '\n' < /yourfile
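
On its own, tr -d '\n' leaves the whole file as one long line, so the record boundaries still have to be put back. A minimal sketch of one way to do that, assuming {} appears only at the end of a record (as the question states). The temporary sample file and the variable names are made up for illustration:

```shell
# Hypothetical sample: one broken record followed by one intact record.
sample=$(mktemp)
printf '%s\n' \
  '1[|]2[|]3[|]4[|]5[|]6[|]7' \
  '[|]a[|]b[|]c[|]d{}' \
  '1[|]2[|]3[|]4[|]5[|]6[|]7[|]a[|]b[|]c[|]d{}' > "$sample"

# Delete every newline, then re-insert one after each {} end-of-record marker.
rejoined=$(tr -d '\n' < "$sample" | awk '{ gsub(/\{\}/, "{}\n"); printf "%s", $0 }')
printf '%s\n' "$rejoined"
rm -f "$sample"
```

Because awk sees the whole file as a single record here, this is probably best kept for files of modest size.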

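 -0lne 

-0 sets the input record separator to the NUL character, so a text file that contains no NUL bytes is read as a single record. -l strips that separator from the record, -n loops over the input, and -e supplies the code.

s/\n//g 

removes all newline characters, turning the whole file into a single line.

print "$1\n" while /(.*?{})/g

does a non-greedy match for the shortest run of characters ending in {} and prints it.
The while loop repeats the match until every such record has been printed.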

---------- Post updated at 05:17 PM ---------- Previous update was at 05:00 PM ----------

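Another approach, which also makes sure the number of columns matches.
The pattern below expects 10 delimiter-terminated fields followed by a final field ending in {}. Note that \w+ only matches non-empty fields made of word characters. Change the count as per your requirement.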

 
perl -0lne 's/\n//g;print "$1\n" while /((\w+\[\|\]){10}\w+{})/g' input
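
If the fields can be empty or contain characters other than word characters, counting the delimiters directly may be more forgiving. A rough sketch in awk, assuming a complete record is recognised purely by reaching a given number of [|] delimiters; join_by_delims is a hypothetical helper name and the sample file is made up:

```shell
# join_by_delims WANT FILE  (hypothetical helper, not from the thread)
# Joins physical lines until the pending record holds WANT delimiters.
join_by_delims() {
  awk -v want="$1" '
    {
      buf = buf $0                   # append this physical line to the pending record
      tmp = buf
      n = gsub(/\[\|]/, "", tmp)     # gsub returns how many [|] delimiters it removed
      if (n >= want) {               # enough delimiters: the record is complete
        print buf
        buf = ""
      }
    }
    END { if (buf != "") print buf } # flush an incomplete trailing record as-is
  ' "$2"
}

# Hypothetical sample: one broken record followed by one intact record.
sample=$(mktemp)
printf '%s\n' \
  '1[|]2[|]3[|]4[|]5[|]6[|]7' \
  '[|]a[|]b[|]c[|]d{}' \
  '1[|]2[|]3[|]4[|]5[|]6[|]7[|]a[|]b[|]c[|]d{}' > "$sample"

joined=$(join_by_delims 10 "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

The count of 10 matches the sample lines shown in the question; adjust it to the real number of delimiters per record.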

Also, when I redirect the output to a file with the command below:
perl -0lne 's/\n//g;print "$1\n" while /(.*?{})/g' inputfile > abc.txt

it creates a binary file. When I run head on this file it displays the content correctly, but in vi I can see a '^@' character on each line from the second line onward. Is there any reason for this?

Not sure why. For me it looks fine.
The output file is also ASCII text, not binary.

What's your OS, by the way?

This is the output of uname -a
Linux dwapp1w29m3 2.6.9-67.0.22.ELhugemem #1 SMP Fri Jul 11 10:55:23 EDT 2008 i686 athlon i386 GNU/Linux

---------- Post updated at 09:03 AM ---------- Previous update was at 07:10 AM ----------

I found a way to eliminate the non-printable characters through another forum:
strings sample.txt > sampl1.txt

Thanks a lot. Your help made my work easy.

An awk alternative which simply does not print the record separator (usually a newline) after lines with exactly 7 fields (6 delimiters):

awk -F '\[\|]' '{printf("%s%s", $0, (NF==7?"":RS))}'

Regards,
Alister
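
For anyone who wants to try the awk command above, here is a small sketch that runs it unchanged on a made-up sample file holding one broken record and one intact record. The file and the variable name are hypothetical:

```shell
# Hypothetical sample: one broken record followed by one intact record.
sample=$(mktemp)
printf '%s\n' \
  '1[|]2[|]3[|]4[|]5[|]6[|]7' \
  '[|]a[|]b[|]c[|]d{}' \
  '1[|]2[|]3[|]4[|]5[|]6[|]7[|]a[|]b[|]c[|]d{}' > "$sample"

# A line with exactly 7 fields is printed without the record separator,
# so the line that follows it is appended to it.
joined=$(awk -F '\[\|]' '{printf("%s%s", $0, (NF==7?"":RS))}' "$sample")
printf '%s\n' "$joined"
rm -f "$sample"
```

The test on NF==7 fits this sample; a file whose broken lines split at a different field would need a different count.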