SED or AWK "append line to the previous line"

Hi,

How can I remove the line break in the following case, where a line begins with the special character ";"?

TEXT

Text;text
;text
Text;text;text

I want to convert the text to:

Text;text;text
Text;text;text

I have already tried "sed", but the command only replaces the first character on the line!

cat text | sed -e :a -e '$!N;s/^;/ /;ta' -e 'P;D'

Text;text
text
Text;text;text

Any suggestion ??

Thanks in advance!

# awk 'ORS=/^;/?"\n":" "' file
Text;text ;text
Text;text;text 

Thanks ghostdog74 for your fast reply, but unfortunately, if the file includes more lines, for example:

Text;tex
;text;
Text;text;text;
010020;789
test;test;test

then the command deletes all the line breaks.

cat text | awk 'ORS=/^;/?"\n":" "'

Text;tex ;text;
Text;text;text; 010020;789 test;test;test

Output File should be:

Text;tex;text;
Text;text;text;
010020;789
test;test;test

Any suggestion in this case ??

What would be a 100% reliable indicator that a line is the beginning of a record? A line beginning with an upper-case T?

The 100% indicator in my project file is [[:digit:]]!
All record lines begin with a digit [0-9].

I use the following command now: awk '/^;/ {sub(/^;/,""); getline t; print $0 t; next}; 1'
but unfortunately the next line is appended to the pattern line, rather than the pattern line being appended to the previous line.

awk '/^;/ {sub(/^;/,""); getline t; print $0 t; next}; 1' text

Text;tex
text;Text;text;text;
010020;789
test;test;test

output file should be:

Text;tex;text;
Text;text;text;
010020;789
test;test;test

Is there a way I can append the line beginning with "^;" to the previous line?

Many thanks!

if you have Python, here's an alternative solution

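#!/usr/bin/env python3

# Read all lines; splitlines() avoids a trailing empty element.
with open("file") as f:
    data = f.read().splitlines()

merged = []
for line in data:
    # A line starting with ";" continues the previous line.
    if line.startswith(";") and merged:
        merged[-1] += line
    else:
        merged.append(line)
print("\n".join(merged))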

output:

# more file
Text;tex
;text;
Text;text;text;
010020;789
abcdefg; sdf;
;text;
Text;text;text;
010020;789
test;test;test

# ./test.py
Text;tex;text;
Text;text;text;
010020;789
abcdefg; sdf;;text;
Text;text;text;
010020;789
test;test;test

--- ghostdog74 ---

You are simply the best!

After many tries, I came up with the code below. I hope it fits your requirement.

 
 awk 'BEGIN {i=0;}  { if (substr($0,1,1)==";") {a[i]=a[i]""$0} else {a[++i]=$0} } END { for (i in a)  print a[i];}' IP_FILE

Or if you have perl, then:

perl -ne '{chomp; $x[$.-1]=$_}
END {for($i=0; $i<=$#x; $i++) {
       if ($x[$i] =~ /^;/) {
         $x[$i-1] = $x[$i-1].$x[$i]."\n";
         $x[$i]=""
       } elsif ($x[$i+1] !~ /^;/) {
         $x[$i] .= "\n"
       }}
  print @x}' input.txt

Test:

$
$ cat -n input.txt
     1  Text;tex
     2  ;text;
     3  Text;text;text;
     4  010020;789
     5  abcdefg; sdf;
     6  ;text;
     7  Text;text;text;
     8  010020;789
     9  test;test;test
$
$ cat test_perl.sh

perl -ne '{chomp; $x[$.-1]=$_}
END {for($i=0; $i<=$#x; $i++) {
       if ($x[$i] =~ /^;/) {
         $x[$i-1] = $x[$i-1].$x[$i]."\n";
         $x[$i]=""
       } elsif ($x[$i+1] !~ /^;/) {
         $x[$i] .= "\n"
       }}
  print @x}' input.txt

$
$ . test_perl.sh
Text;tex;text;
Text;text;text;
010020;789
abcdefg; sdf;;text;
Text;text;text;
010020;789
test;test;test
$
$

tyler_durden

If the goal is to concatenate only the very next line:

sed -n '/^;.*/!{$!{N;s/\(.*\)\n\n*\(;.*\)*/\1\2/g;P;D};p}' file

cheers,
Devaraj Takhellambam

Hi all,

First of all, I want to thank you for your help. I really appreciate it!!
I am surprised how many different ways there are to solve this issue.

It reminds me of an article from the penguin magazine in central Europe; I am going to write something about this later.

I've tested your solutions and found the following:

DEVTAKH -- your solution works fine for a small amount of data, but if the CSV file has more than 200 lines, a lot of lines are lost!
Maybe your pattern matches some of the characters in my CSV file.

$ cat test.csv | wc -l
552
$ cat test.csv | grep "^;" | wc -l
1
$ time sed -n '/^;.*/!{$!{N;s/\(.*\)\n\n*\(;.*\)*/\1\2/g;P;D};p}' test.csv | wc -l
276

real 0m11.108s
user 0m11.045s
sys 0m0.014s

DURDEN TYLER -- PANYAM --- GHOSTDOG74
your solutions work well too, thanks.

$ time awk 'BEGIN {i=0;} { if (substr($0,1,1)==";") {a[i]=a[i]""$0} else {a[++i]=$0} } END { for (i in a) print a[i];}' test.csv | wc -l
551

real 0m0.106s
user 0m0.091s
sys 0m0.015s

$ time ./test_perl.sh test.csv | wc -l
550

real 0m0.348s
user 0m0.335s
sys 0m0.012s

$ time ./test.py test.csv | wc -l
549

real 0m0.310s
user 0m0.038s
sys 0m0.023s
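
A quick sanity check on the counts above: joining each line that starts with ";" onto its predecessor removes exactly one line per such line. With 552 input lines and 1 continuation line, the expected result is 551 lines, which matches the 551 reported for the awk run. The sketch below only illustrates that arithmetic; the function name `expected_line_count` is made up for this example, and it assumes the first line of the file is never a continuation.

```python
from typing import Iterable


def expected_line_count(lines: Iterable[str], marker: str = ";") -> int:
    """Return how many lines remain after joining continuation lines.

    A continuation line is any line (except the first) that starts with
    `marker`; each one is absorbed into the line before it.
    """
    total = 0
    continuations = 0
    for index, line in enumerate(lines):
        total += 1
        # The first line has no predecessor, so it can never be joined.
        if index > 0 and line.startswith(marker):
            continuations += 1
    return total - continuations


if __name__ == "__main__":
    sample = ["Text;tex", ";text;", "Text;text;text;", "010020;789"]
    print(expected_line_count(sample))  # prints 3
```

Comparing this number with the `wc -l` of each tool's output is a cheap way to spot a solution that drops or fails to join lines.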

About the article, I just want to say this:

The task was to process a text file of about 1 GB.
The goal was to convert the text file in the shortest time.
All languages and scripts were allowed, from Python and awk to Java.
If you are interested in this I will write more about it.
Let me know.

Another one:

awk 'NR==1{s=$0;next}
/^;/{s=s$0;next}
{print s;s=$0}
END{if(s)print s}' file

Hi Franklin,

Is it possible to solve this with awk if I want to keep only the lines that start with a digit [0-9] as separate lines?
I'd like all other lines, which do not start with a digit, to be appended to the previous line.
My goal is to eliminate the stray line breaks in my CSV file.

Give an example of your input file and the desired output within code tags (select the code with the mouse and click on # above the edit box).

cat file.text

1234;test;test;test;
;test;test;test;test
beta;text
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;
test

should be

1234;test;test;test;;test;test;test;testbeta;text
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;test

Should be something like this:

awk 'NR==1{s=$0;next}
!/^[0-9]/{s=s$0;next}
{print s;s=$0}
END{if(s)print s}' file

The following will work provided that only the immediately following line is a continuation:

sed -n '/^;.*/!{$!{N;/.*\n;.*/{s/\(.*\)\n\(;.*\)*/\1\2/g;p;d};/.*\n;.*/!{P;D}};p}' file
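
For comparison, the "a record starts with a digit" rule from the awk version can also be sketched in Python. This is only an illustration under assumptions: the names `join_to_previous` and `starts_with_digit` are made up here, and the first line is always treated as the start of a record, like the `NR==1` rule in the awk script. It holds a single record in memory at a time, so it should also cope with any number of consecutive continuation lines.

```python
import re
from typing import Callable, Iterable, Iterator


def starts_with_digit(line: str) -> bool:
    """Return True if the line begins with a digit 0-9."""
    return re.match(r"[0-9]", line) is not None


def join_to_previous(
    lines: Iterable[str],
    is_record_start: Callable[[str], bool] = starts_with_digit,
) -> Iterator[str]:
    """Yield records, appending every non-start line to the record before it."""
    buffer = None
    for raw in lines:
        line = raw.rstrip("\n")
        if buffer is None:
            # First line: always begins the first record.
            buffer = line
        elif is_record_start(line):
            # A new record begins, so the buffered one is complete.
            yield buffer
            buffer = line
        else:
            # Continuation: glue it onto the buffered record.
            buffer += line
    if buffer is not None:
        yield buffer


if __name__ == "__main__":
    sample = [
        "1234;test;test;test;",
        ";test;test;test;test",
        "beta;text",
        "01234;test;alpha;beta",
    ]
    for record in join_to_previous(sample):
        print(record)
```

Passing an open file object as `lines` streams the file line by line, and a different rule (for example `lambda l: not l.startswith(";")`) can be supplied as `is_record_start`.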

-- Bingo --

The code:

awk 'NR==1{s=$0;next} /^[A-Z]|^;/{s=s$0;next} {print s;s=$0} END{if(s)print s}' file.text

1234;test;test;test;;test;test;test;testbeta;text
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;test

or

sed -n '/^[a-z]/!{$!{N;/.*\n;.*/{s/\(.*\)\n\(;.*\)*/\1\2/g;p;d};/.*\n;.*/!{P;D}};p}' file.text

1234;test;test;test;;test;test;test;test
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;