remove line feeds followed by character

fluffdasheep · July 7, 2011, 9:00am

Hi everyone,

I'm very new to sed. I've run through some tutorials, but I've hit a problem that I'm unable to solve by myself.

I need to remove all linefeeds that are followed by a particular character (in this case a semicolon). So basically, all lines starting with a semicolon should be appended to the previous line.
Because of the per-line nature of sed, I haven't been successful... I know I can use N to somehow search through multiple lines but that hasn't gotten me anywhere. I've tried the following with no success.

sed 'N;s/\n;/;/' old > new

Thanks in advance.
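
A note on why the attempt above does not work: `N` appends the next line to the pattern space once per cycle, so sed processes the input in fixed pairs (lines 1-2, 3-4, and so on). The newline between lines 2 and 3 is never inside the pattern space, and without the `g` flag only the first match in each pair is replaced. One common workaround, sketched here under the assumption that the whole file fits in memory, is to loop until every line has been appended and then substitute once. The function name `join_semicolon_lines` is made up for illustration.

```shell
# Hypothetical helper: reads stdin, joins every line that starts with ';'
# onto the previous line, and writes the result to stdout.
join_semicolon_lines() {
    # :a     define label 'a'
    # $!{    on every line except the last...
    # N        append the next input line to the pattern space
    # ba       and branch back to 'a'
    # }
    # s///g  once the whole input is collected, drop each newline before ';'
    sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\n;/;/g'
}

printf '%s\n' Monday Tuesday ';Wednesday' Thursday | join_semicolon_lines
```

Each branch command sits in its own `-e` fragment, which acts like a newline after the label, so the loop should not depend on GNU extensions.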

getmmg · July 7, 2011, 9:36am

 
perl -0pe 's/\n;/;/g' input

Franklin52 · July 7, 2011, 10:07am

awk 'NR > 1 && !/^;/ { print "" } { printf "%s", $0 } END { if (NR) print "" }' file

fluffdasheep · July 7, 2011, 10:12am

Thanks a bundle, guys. getmmg's solution worked like a charm.

birei · July 7, 2011, 10:13am

Hi,

Using 'sed':

$ cat infile
Monday
Tuesday
;Wednesday
Thursday
;Friday
;Saturday
Sunday
$ cat script.sed
#!/bin/sed -nf

## First line: copy it to 'hold space' and read next one.
1 { h ; n }

## Lines beginning with ';': Append to 'hold space'. If last line, skip to
## label 'l' to print rest of content, else begin next cycle.
/^;/ { H ; $ bl ; b }

## A line without ';' at the beginning. Swap in the lines saved in 'hold space',
## remove the '\n' chars before each ';' and print the result.
:l
x ; s/\n;/;/g ; p

## Last line: If it begins with ';' it will have been printed before, else take
## it from 'hold space' and print it.
$ { x ; /^;/! p }
$ ./script.sed infile
Monday
Tuesday;Wednesday
Thursday;Friday;Saturday
Sunday

Regards,
Birei

alister · July 7, 2011, 11:06am

In case you are interested in writing the most portable sed possible: using a semicolon after a branch command is not portable. Traditional implementations read the label up to the end of the line, so the semicolon and everything after it may be taken as part of the label name, and the script will then fail. If portability is desired, it's best to use a newline after each branch command.
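
To illustrate, here is a sketch of the same logic with every command on its own line, so that each branch command and the label are terminated by a newline. It is wrapped in a shell function named `join_portable` (a name made up for this example) and keeps the original control flow unchanged, including its assumption that the input has at least two lines.

```shell
# Hypothetical wrapper: same logic as script.sed above, one command per line.
# Assumes the input has at least two lines, as the original does.
join_portable() {
    sed -n '
1{
    h
    n
}
/^;/{
    H
    $bl
    b
}
:l
x
s/\n;/;/g
p
${
    x
    /^;/!p
}
' "$@"
}

printf '%s\n' Monday Tuesday ';Wednesday' Thursday | join_portable
```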

For fun, here's a sed solution which does not use branching. It's a bit shorter but not as clear as your approach (and probably not as efficient, since it unconditionally executes the substitute command for each line read).

sed -n '/^;/H; x; s/\n;/;/; x; /^;/!{x; 1!p;}; ${x;p;}'

Regards,
Alister