File formatting with newlines

Hi All -

I need some help formatting the file below.

Requirement -
1) replace newlines with space
2) replace '#~# ' with newline

-----------------------
sample inputfile a

I|abc|abc|aaa#~#
I|sddddd|tya|dfg

sfd

ssss#~#
I|tya1|tya2|dfg|sfd|aaa#~#
I|t#ya|a~aya4#~#

-------------------------------------------------------------------
required output file c

I|abc|abc|aaa
I|sddddd|tya|dfg sfd ssss
I|tya1|tya2|dfg|sfd|aaa
I|t#ya|a~aya4

---------------------------------------------------------------------------

I have tried regular sed commands, but they are not able to create the newlines for step 2 above:

sed 's/#~# /\\\n/g' b>c

or

sed 's/#~# /'\\\n'/g' b>c

Any advice will be very much appreciated. Thanks in advance.

Try:

awk '{while (!/#~#/ && getline n>0) if(n!="")$0=$0 FS n; sub(/#~#/,x)}1' file

With GNU awk or mawk, if you do not mind an extra trailing newline at the end of the file, you can try this:

gawk '{$1=$1}1' RS='#~#' file

Hi Scrutinizer -

Many thanks for your reply. I don't have access to a Unix box over the weekend, so I haven't been able to check this on the actual file. However, I did run it on a small sample of data using an emulator, and the awk command is absolutely spot on! Thanks for suggesting this.

Would you by any chance be able to explain the options used in the awk command?

thanks again
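
For what it's worth, the first awk one-liner can be unpacked roughly as below. This is only a sketch of how I read it, not the original poster's explanation: the function name join_records and the variable name extra are made up for illustration, and the unset variable x from the one-liner is written as "" because an unset awk variable evaluates to the empty string.

```shell
# Hypothetical wrapper around the awk one-liner, written out long-hand.
join_records() {
  awk '
  {
    # As long as the current record has no #~# marker, pull in the next line.
    while (!/#~#/ && (getline extra) > 0)
      if (extra != "")          # ignore empty lines
        $0 = $0 FS extra        # append the line, separated by FS (a space)
    sub(/#~#/, "")              # remove the end-of-record marker
  }
  1                             # always-true pattern: print the rebuilt record
  ' "$1"
}

# Sample input taken from the first post.
sample=$(mktemp)
printf '%s\n' 'I|abc|abc|aaa#~#' 'I|sddddd|tya|dfg' '' 'sfd' '' 'ssss#~#' \
  'I|tya1|tya2|dfg|sfd|aaa#~#' 'I|t#ya|a~aya4#~#' > "$sample"

join_records "$sample"
```

The trailing 1 is the usual awk shorthand for "print every record". Note that sub() only removes the first #~# in a record, which is enough here because the loop stops as soon as one marker has been read.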

Depending on your sed version, this might work for you:

sed -n ':L; /#~#/ !{N;bL;}; /#~#/ {s///;s/\n\+/ /g; P;}' file
I|abc|abc|aaa
I|sddddd|tya|dfg sfd ssss
I|tya1|tya2|dfg|sfd|aaa
I|t#ya|a~aya4

Hi RudiC - Thanks for your reply. Would this solution be efficient enough to run daily on files of around 20 MB? I have a window of 5 minutes to complete this formatting.

I'm afraid there's no generic answer to that question. It depends on hardware performance, system load, process priority, and mayhap other factors. You'll need to test under several different conditions to come to a satisfying conclusion.
From a gut feeling, I'd say, yes, 5 mins should suffice to get 20MB done.
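
One way to try this out is sketched below. It is only a rough test under assumptions: the repetition count, the temporary files and the repeated sample records are illustrative, and it times the awk command suggested earlier in the thread rather than the sed one. Real data with longer or more irregular records may behave differently.

```shell
big=$(mktemp)
out=$(mktemp)

# Generate roughly 20 MB of test data: each repetition is 48 bytes
# and holds two records, so 440000 repetitions give about 21 MB.
awk 'BEGIN {
  for (i = 0; i < 440000; i++)
    printf "I|abc|abc|aaa#~#\nI|sddddd|tya|dfg\n\nsfd\n\nssss#~#\n"
}' > "$big"

# Time the conversion with whole-second resolution (portable, no "time" keyword).
start=$(date +%s)
awk '{while (!/#~#/ && (getline n) > 0) if (n != "") $0 = $0 FS n; sub(/#~#/, "")} 1' "$big" > "$out"
end=$(date +%s)

echo "input bytes:  $(wc -c < "$big")"
echo "output lines: $(wc -l < "$out")"
echo "elapsed:      $((end - start)) s"
```

Running it a few times, ideally while the machine is under its usual daily load, gives a better picture than a single run.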