SED or AWK "append line to the previous line"

research3 · May 14, 2009, 4:12am

Hi,

How can I remove the line break when a line begins with the special character ";"?

TEXT

Text;text
;text
Text;text;text

I want to convert the text to:

Text;text;text
Text;text;text

I have already tried "sed", but the command only replaces the first character on the line!

cat text | sed -e :a -e '$!N;s/^;/ /;ta' -e 'P;D'

Text;text
text
Text;text;text

Any suggestions?

Thanks in advance!

ghostdog74 · May 14, 2009, 4:18am

# awk 'ORS=/^;/?"\n":" "' file
Text;text ;text
Text;text;text

research3 · May 14, 2009, 5:11am

Thanks, ghostdog74, for your fast reply, but unfortunately, if the file includes more lines,

e.g.

Text;tex
;text;
Text;text;text;
010020;789
test;test;test

then the command deletes all the line breaks.

cat text | awk 'ORS=/^;/?"\n":" "'

Text;tex ;text;
Text;text;text; 010020;789 test;test;test

Output File should be:

Text;tex;text;
Text;text;text;
010020;789
test;test;test

Any suggestions in this case?

zaxxon · May 14, 2009, 5:20am

What would be a 100% indicator that it is the beginning of a line? A line beginning with an upper-case T?

research3 · May 14, 2009, 5:33am

The 100% indicator in my project file is [[:digit:]]!
All lines begin with a digit [0-9].

research3 · May 14, 2009, 6:12am

I am using the following command now: awk '/^;/ {sub(/^;/,""); getline t; print $0 t; next}; 1'
but unfortunately the next line is appended to the pattern line, rather than the pattern line to the previous line.

awk '/^;/ {sub(/^;/,""); getline t; print $0 t; next}; 1' text

Text;tex
text;Text;text;text;
010020;789
test;test;test

output file should be:

Text;tex;text
Text;text;text;
010020;789
test;test;test

Is there a way I can append the line matching "^;" to the previous line?

Many thanks!

ghostdog74 · May 14, 2009, 6:21am

if you have Python, here's an alternative solution

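#!/usr/bin/env python

# Join every line that starts with ";" onto the line before it.
with open("file") as f:
    data = f.read().splitlines()

result = []
for line in data:
    # Build a new list instead of popping from the list being iterated,
    # so consecutive ";" lines are not skipped.
    if line.startswith(";") and result:
        result[-1] += line
    else:
        result.append(line)
print("\n".join(result))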

output:

# more file
Text;tex
;text;
Text;text;text;
010020;789
abcdefg; sdf;
;text;
Text;text;text;
010020;789
test;test;test

# ./test.py
Text;tex;text;
Text;text;text;
010020;789
abcdefg; sdf;;text;
Text;text;text;
010020;789
test;test;test

research3 · May 14, 2009, 6:32am

--- ghostdog74 ---

You are simply the best!

panyam · May 14, 2009, 8:02am

After quite a few tries, I came up with the code below. I hope it fits your requirement.

 awk 'BEGIN {i=0;}  { if (substr($0,1,1)==";") {a[i]=a[i]""$0} else {a[++i]=$0} } END { for (j = (0 in a) ? 0 : 1; j <= i; j++)  print a[j];}' IP_FILE

durden_tyler · May 14, 2009, 10:50pm

Or if you have perl, then:

perl -ne '{chomp; $x[$.-1]=$_}
END {for($i=0; $i<=$#x; $i++) {
       if ($x[$i] =~ /^;/) {
         $x[$i-1] = $x[$i-1].$x[$i]."\n";
         $x[$i]=""
       } elsif ($x[$i+1] !~ /^;/) {
         $x[$i] .= "\n"
       }}
  print @x}' input.txt

Test:

$
$ cat -n input.txt
     1  Text;tex
     2  ;text;
     3  Text;text;text;
     4  010020;789
     5  abcdefg; sdf;
     6  ;text;
     7  Text;text;text;
     8  010020;789
     9  test;test;test
$
$ cat test_perl.sh

perl -ne '{chomp; $x[$.-1]=$_}
END {for($i=0; $i<=$#x; $i++) {
       if ($x[$i] =~ /^;/) {
         $x[$i-1] = $x[$i-1].$x[$i]."\n";
         $x[$i]=""
       } elsif ($x[$i+1] !~ /^;/) {
         $x[$i] .= "\n"
       }}
  print @x}' input.txt

$
$ . test_perl.sh
Text;tex;text;
Text;text;text;
010020;789
abcdefg; sdf;;text;
Text;text;text;
010020;789
test;test;test
$
$

tyler_durden

devtakh · May 15, 2009, 1:16pm

If the goal is to concatenate only the very next line:

sed -n '/^;.*/!{$!{N;s/\(.*\)\n\n*\(;.*\)*/\1\2/g;P;D};p}' file

cheers,
Devaraj Takhellambam

research3 · May 16, 2009, 6:31pm

Hi all,

First of all, I want to thank you for your help. I really appreciate it!
I am surprised how many different ways there are to solve this issue.

It reminds me of an article from the pinguin magazin in central Europe; I am going to write something about this later.

I've tested your solutions and found the following:

DEVTAKH -- your solution works fine for a small amount of data, but if the csv file has more than 200 lines, a lot of lines are lost!
Maybe your pattern matches some of the characters in my csv file.

$ cat test.csv | wc -l
552
$ cat test.csv | grep "^;" | wc -l
1
$ time sed -n '/^;.*/!{$!{N;s/\(.*\)\n\n*\(;.*\)*/\1\2/g;P;D};p}' test.csv | wc -l
276

real 0m11.108s
user 0m11.045s
sys 0m0.014s

DURDEN TYLER -- PANYAM --- GHOSTDOG74
Your solutions are working well too, thanks.

$ time awk 'BEGIN {i=0;} { if (substr($0,1,1)==";") {a[i]=a[i]""$0} else {a[++i]=$0} } END { for (i in a) print a[i];}' test.csv | wc -l
551

real 0m0.106s
user 0m0.091s
sys 0m0.015s

$ time ./test_perl.sh test.csv | wc -l
550

real 0m0.348s
user 0m0.335s
sys 0m0.012s

$ time ./test.py test.csv | wc -l
549

real 0m0.310s
user 0m0.038s
sys 0m0.023s
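
One way to sanity-check these counts: joining a continuation line onto its predecessor removes exactly one line per continuation line, so 552 input lines with 1 line starting with ";" should leave 551. Below is a small Python sketch of that arithmetic. It assumes the first line is not itself a continuation line; the name `expected_line_count` and the synthetic data are invented for this illustration and are not part of the thread.

```python
def expected_line_count(lines, prefix=";"):
    """Return how many lines remain after joining continuation lines.

    Hypothetical helper. Assumes the first line does not start with
    `prefix`, so every continuation line has a predecessor to merge into.
    """
    continuation = sum(1 for line in lines if line.startswith(prefix))
    return len(lines) - continuation


if __name__ == "__main__":
    # Synthetic stand-in for test.csv: 552 lines, exactly one starting with ";".
    lines = ["1;a"] * 551
    lines.insert(1, ";b")
    print(expected_line_count(lines))
```

Under those assumptions only the 551-line result matches the expected count; the thread does not say why the other runs report 550 and 549.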

About the article, I just want to say this:

The task was to work on a text file of about 1GB.
The assignment was to convert the text file in the shortest time.
All languages and scripts were allowed, from Python and awk to Java.
If you are interested in this I will write more about it.
Let me know.

Franklin52 · May 17, 2009, 7:35am

Another one:

awk 'NR==1{s=$0;next}
/^;/{s=s$0;next}
{print s;s=$0}
END{if(s)print s}' file
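
For readers who prefer Python, the same hold-and-flush idea can be sketched as a generator. This is only an illustration, assuming a continuation line is any line that starts with ";". The function name `join_continuations` and its `prefix` parameter are made up for this example and do not come from the thread.

```python
def join_continuations(lines, prefix=";"):
    """Yield lines, gluing each line that starts with `prefix` onto the previous one.

    Hypothetical helper (not from the thread). It holds one pending line
    and emits it only when the next record starts.
    """
    pending = None  # the line being held back, like `s` in the awk script
    for raw in lines:
        line = raw.rstrip("\n")
        if pending is None:
            pending = line        # first line: just hold it
        elif line.startswith(prefix):
            pending += line       # continuation: append to the held line
        else:
            yield pending         # new record: flush the held line
            pending = line
    if pending is not None:
        yield pending             # flush the last held line


if __name__ == "__main__":
    sample = ["Text;tex", ";text;", "Text;text;text;", "010020;789", "test;test;test"]
    for out in join_continuations(sample):
        print(out)
```

Because it only ever holds one line, the generator can be fed an open file object directly, which may matter for large inputs.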

research3 · May 17, 2009, 8:27am

Hi Franklin,

Is it possible to solve this with "AWK" if I want to show all the lines that start with a digit [0-9]?
I'd like all other lines, which do not start with a digit, to be moved to the end of the previous line.
My goal is to eliminate the stray line breaks in my csv file.

Franklin52 · May 17, 2009, 9:57am

Give an example of your input file and the desired output within code tags (select the code with the mouse and click on # above the edit box).

research3 · May 17, 2009, 10:29am

cat file.text

1234;test;test;test;
;test;test;test;test
beta;text
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;
test

should be

1234;test;test;test;;test;test;test;testbeta;text
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;test

Franklin52 · May 17, 2009, 10:46am

Should be something like this:

awk 'NR==1{s=$0;next}
!/^[0-9]/{s=s$0;next}
{print s;s=$0}
END{if(s)print s}' file
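
The same rule can also be written as a single regular-expression substitution: delete every newline that is not followed by a digit. Here is a rough Python sketch, assuming the whole file fits in memory and that every real record starts with a digit, as stated earlier in the thread. The name `merge_non_digit_lines` is made up for this illustration.

```python
import re


def merge_non_digit_lines(text):
    """Remove each newline that is not followed by a digit (hypothetical helper)."""
    body = text.rstrip("\n")  # drop trailing newline(s) so they are not treated as joins
    # (?![0-9]) is a negative lookahead: the newline matches only when the
    # next character is not a digit, i.e. the next line is a continuation.
    return re.sub(r"\n(?![0-9])", "", body)


if __name__ == "__main__":
    sample = (
        "1234;test;test;test;\n"
        ";test;test;test;test\n"
        "beta;text\n"
        "01234;test;alpha;beta\n"
        "47888;test;test;test;test\n"
        "88899;test;test;test;\n"
        "test\n"
    )
    print(merge_non_digit_lines(sample))
```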

devtakh · May 17, 2009, 10:48am

This will work provided that only the single line immediately following a record can be a continuation line.

sed -n '/^;.*/!{$!{N;/.*\n;.*/{s/\(.*\)\n\(;.*\)*/\1\2/g;p;d};/.*\n;.*/!{P;D}};p}' file

research3 · May 17, 2009, 10:55am

-- Bingo --

The code:

awk 'NR==1{s=$0;next} /^[A-Z]|^;/{s=s$0;next} {print s;s=$0} END{if(s)print s}' file.text

1234;test;test;test;;test;test;test;testbeta;text
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;test

research3 · May 17, 2009, 10:59am

or

sed -n '/^[a-z]/!{$!{N;/.*\n;.*/{s/\(.*\)\n\(;.*\)*/\1\2/g;p;d};/.*\n;.*/!{P;D}};p}' file.text

1234;test;test;test;;test;test;test;test
01234;test;alpha;beta
47888;test;test;test;test
88899;test;test;test;