Removing carriage return/line feeds on multiple lines

tomr2012 · July 6, 2012, 10:05am

I would like to remove carriage returns/line feeds in a text file, but in a specific cadence:

Read first line (Header Line 1), remove cr/lf at the end (replace it with a space ideally);
Read the next line (Line of Text 1), leave the cr/lf intact;
Read the next line, remove the cr/lf;
Read the next line (Header Line 2), remove cr/lf at the end (replace it with a space);
Read the next line (Line of Text 2), leave the cr/lf intact...

Here is my input data, with the cr/lf's I would like deleted marked in red:

Header Line 1 <cr/lf> (replace with a space)
Line of Text 1<cr/lf>
<cr/lf> 
Header Line 2 <cr/lf> (replace with a space)
Line of Text 2<cr/lf>
<cr/lf>

What I would like to output

Header Line 1 Line of Text 1
Header Line 2 Line of Text 2

Being new to sed, I have managed to remove all the cr/lf's, but I just end up with one long string of text. I would like to preserve the cr/lf's that separate Header Line 1 Line of Text 1 from Header Line 2 Line of Text 2.

Thanks for your help!

elixir_sinari · July 6, 2012, 10:22am

Something along these lines?

sed '/^Header/{N;N;s/\r//g;s/\n$//;s/\n/ /}' inputfile

Scrutinizer · July 6, 2012, 10:26am

Try:

tr -d '\r' < infile | awk '{$1=$1}1' RS=

---
note: \r in sed regex is GNU sed only
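For anyone reading along, here is a minimal sketch of what the tr/awk pipeline does, assuming the input really has DOS line endings and an empty line after every pair, as in the sample from the first post. An empty RS puts awk in paragraph mode, so each blank-line-separated block is one record, and `$1=$1` rebuilds it with single spaces. The variable name `joined` is just for illustration.

```shell
# Sample with DOS line endings and a blank line after every pair,
# as in the first post.
joined=$(printf 'Header Line 1\r\nLine of Text 1\r\n\r\nHeader Line 2\r\nLine of Text 2\r\n\r\n' |
  tr -d '\r' |
  awk '{ $1 = $1 } 1' RS=)

# One output line per blank-line-separated block.
printf '%s\n' "$joined"
```

If the file has no empty lines between the pairs, awk sees the whole file as a single record, so this approach would not separate the pairs.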

tomr2012 · July 6, 2012, 12:09pm

Elixir - thanks, but my 'Header' line is actually a date and time field, does that make a difference?


Scrutinizer, thanks - tried the script but the outfile came back with no changes.

Scrutinizer · July 6, 2012, 12:12pm

What is your OS and version?

tomr2012 · July 6, 2012, 12:20pm

Fedora release 16 (Verne), bash shell

elixir_sinari · July 6, 2012, 12:23pm

Of course it matters. You'll have to replace ^Header using a regex/string specific to your case.
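As a rough sketch of what such a regex could look like: the version below matches any line that starts with a timestamp of the form YYYY/MM/DD followed by a space, so it does not depend on one particular date. The timestamp format and the sample text are assumptions, and the carriage returns are stripped with tr first so that no GNU-only `\r` is needed in sed.

```shell
# Sample input in the layout from the first post (header, text, blank line),
# with DOS line endings. The timestamp format is an assumption.
sample=$(mktemp)
printf '2010/08/19 03:53\r\nfirst line of text\r\n\r\n2010/08/19 01:50\r\nsecond line of text\r\n\r\n' > "$sample"

# Strip carriage returns, then for every line that starts with a
# YYYY/MM/DD timestamp: pull in the next two lines, drop the trailing
# empty one, and turn the remaining newline into a space.
# \%...% is an address using % as the delimiter, so the slashes in the
# date need no escaping.
merged=$(tr -d '\r' < "$sample" |
  sed '\%^[0-9]\{4\}/[0-9]\{2\}/[0-9]\{2\} %{
N
N
s/\n$//
s/\n/ /
}')

printf '%s\n' "$merged"
rm -f "$sample"
```

This still assumes every timestamp line is followed by exactly one text line and one empty line.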

tomr2012 · July 6, 2012, 12:28pm

Got it, but suppose that changes - I will have many different time/date stamps in my input file.

Scrutinizer · July 6, 2012, 12:37pm

In your sample, there is always an empty line after the two lines that need to be merged. Is that always the case in the real file?

Here is an alternative to try:

tr -d '\r' < infile | awk '1;{print NF?FS:RS}' ORS=

tomr2012 · July 6, 2012, 12:42pm

Hi Scrutinizer, yes - always an empty line. Tried the alternative but no changes.

Scrutinizer · July 6, 2012, 12:54pm

Strange. Could you take a couple of (anonymized) lines of your input and run it through od -c and post the result?

head -7 infile | od -c

tomr2012 · July 6, 2012, 12:58pm

does this make sense?

0000000 2 0 1 0 / 0 8 / 1 9 0 3 : 5 3
0000020 \n K O A K 1 9 0 3 5 3 Z 2 9
0000040 0 1 3 K T 1 0 S M B K N 0 0
0000060 8 1 4 / 1 2 A 2 9 9 6 R M
0000100 K A O 2 S L P 1 4 4 T 0 1
0000120 4 4 0 1 2 2 \n 2 0 1 0 / 0 8 / 1
0000140 9 0 1 : 5 0 \n A Z U H 1 9 0
0000160 1 5 0 Z 0 0 0 0 0 K T C A V
0000200 O K 2 1 / 1 2 Q 1 0 0 9 A
0000220 2 9 8 1 \n 2 0 1 0 / 0 8 / 1 9
0000240 0 3 : 5 3 \n P A B R 1 9 0 3 5
0000260 3 Z 2 2 0 1 0 K T 1 0 S M
0000300 O V C 0 1 1 0 7 / 0 6 A 2 9
0000320 7 8 R M K A O 2 S L P 0 8
0000340 4 T 0 0 7 2 0 0 6 1 \n 2 0 1 0
0000360 / 0 8 / 1 9 0 3 : 5 6 \n
0000375

Scrutinizer · July 6, 2012, 2:19pm

It does, but I see neither carriage returns nor empty lines, only line feeds. So it seems to deviate from your original input sample.

If your input is always like this then you would only need something like:

sed 'N;s/\n/ /' infile
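A small worked example of that pairing, assuming an LF-only file where every timestamp line is followed by exactly one data line. The sample contents are invented; `paste` is shown only as an alternative that does the same pairing on this kind of input.

```shell
# LF-only sample: timestamp line followed by a data line, no blank lines.
# The contents are invented for illustration.
input=$(printf '2010/08/19 03:53\nKOAK first report\n2010/08/19 01:50\nAZUH second report\n')

# N appends the next line to the pattern space; s/\n/ / joins the pair.
with_sed=$(printf '%s\n' "$input" | sed 'N;s/\n/ /')

# paste reading standard input twice ("- -") pairs the lines the same way.
with_paste=$(printf '%s\n' "$input" | paste -d ' ' - -)

printf '%s\n' "$with_sed"
```

Both variants rely on the lines strictly alternating; a stray extra line would shift every pair after it.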

tomr2012 · July 6, 2012, 2:48pm

Hi, that works - thanks! Weird though. When I look at my infile with Notepad++ I see cr's and lf's...

Thank you very much!

spynappels · July 6, 2012, 4:07pm

Do you transfer the file from a Windows box to a Unix box using WinSCP? It does some funky stuff changing DOS format to unix format. I noticed this when using pscp, and scripts were failing because pscp does not do this...
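One way to check which line endings actually arrived on the Linux side, besides od -c, is to count the carriage returns in the file. A minimal sketch, using two throwaway files created just for the comparison (the file contents and variable names are made up):

```shell
# Two throwaway files with the same text, one with DOS (CR LF) and one
# with Unix (LF) line endings.
dos_file=$(mktemp)
unix_file=$(mktemp)
printf 'line one\r\nline two\r\n' > "$dos_file"
printf 'line one\nline two\n' > "$unix_file"

# tr -cd '\r' deletes everything except carriage returns; wc -c counts
# what is left. The $(( )) wrapper only normalizes padding in wc's output.
dos_cr=$(( $(tr -cd '\r' < "$dos_file" | wc -c) ))
unix_cr=$(( $(tr -cd '\r' < "$unix_file" | wc -c) ))

echo "CRs in DOS file: $dos_cr, CRs in Unix file: $unix_cr"
rm -f "$dos_file" "$unix_file"
```

A count of zero on the real infile would match the od -c dump above, which shows only \n.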