Help with AWK / SED

Hi guys,

I have a question regarding AWK or SED.

I have a data file which reads like this:

1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850
2,
3,333 41998 55300 20000 83915 85650 90710 91129=
.........

There are about 100 or so lines like this.

What I need to do is append the line with the = on the end to the line above, so the data file reads like this:

1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850 333 41998 55300 20000 83915 85650 90710 91129=
2,
3,
........

Any ideas guys?

My PC is about to get thrown out of the window... :mad:

Cheers,
Ian

Is this happening on every other line or is the pattern not regular?

---------- Post updated at 05:36 AM ---------- Previous update was at 05:06 AM ----------

Regardless, try this:

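awk '
# Remember the most recent line whose data ends in a digit plus a non-"=" character
$2~/[0-9][^=]$/ {nr=NR}
# A line ending in "=": move its data onto the remembered line, separated by a space
$2~/=$/ { store2=$2;  $2=""; b[a[nr]]=b[a[nr]] " " store2 }
{ a[NR]=$1; b[$1]=$2 }
END{
  for(i=1;i<=NR; i++)
     print a[i],b[a[i]]
}' FS="," OFS="," input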

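Another awk script:

awk -F, '
# Print the buffered group: the merged first line, then the emptied lines
function printLine() {
    for (i=1; i<=cnt; i++) print line[i];
    cnt = 0;
}
{
    cnt++
    if (cnt==1) {
        line[1] = $0;               # first line of a group is kept whole
    } else {
        line[1] = line[1] " " $2;   # append this data to the first line
        line[cnt] = $1 FS;          # keep only the line number and comma
    }
    if (/=$/) printLine();          # a trailing "=" closes the group
}
END { printLine() }
' inputfile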

Input file:

1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850
2,
3,333 41998 55300 20000 83915 85650 90710 91129=
4,123 456 789 012
5,abc def ghi jkl
7,AAA BBB CCC DDD EEE FFF
6,xyz=

Output:

1,03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850  333 41998 55300 20000 83915 85650 90710 91129=
2,
3,
4,123 456 789 012 abc def ghi jkl AAA BBB CCC DDD EEE FFF xyz=
5,
7,
6,

Jean-Pierre.

Hi Guys,
Thanks for replying.

First, it's not a regular pattern, as some of the lines go like this:

1 03005 41460 73215 10018 20000 39990 40092 51035 78782 879// 90850
2
3 333 41998 55300 20000 83915 85650 90710 91129=
4
5 03089 41460 73215 10018 20000 39990 40092 333 51035=

Generally, if the line starts with 333 it needs to be appended to the line above..
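
A minimal sketch of that rule, assuming the space-separated layout shown above: the first field is the line number, and any line whose data starts with 333 is moved onto the last line that had data. The function name append_333 and the file sample_333.txt are made up for illustration, and the sample rows are shortened versions of the ones quoted here.

```shell
# Hypothetical sample data, shortened from the lines quoted above
printf '%s\n' '1 03005 41460 90850' '2' '3 333 41998 91129=' \
              '4' '5 03089 41460 333 51035=' > sample_333.txt

# append_333 is an illustrative name, not part of any tool
append_333() {
    awk '
    {
        num[NR] = $1                          # leading line number
        rest = $0
        sub(/^[^ ]+ */, "", rest)             # data after the line number
        if (rest ~ /^333( |$)/ && last) {     # data starts with the 333 group
            data[last] = data[last] " " rest  # append to the last line that had data
            data[NR] = ""
        } else {
            data[NR] = rest
            if (rest != "") last = NR         # remember where data was last seen
        }
    }
    END {
        for (i = 1; i <= NR; i++)
            print (data[i] == "" ? num[i] : num[i] " " data[i])
    }' "$@"
}

append_333 sample_333.txt
```

Line 5 in the sample contains 333 in the middle but does not start with it, so it is left where it is. The whole file is held in memory until the END block, which should be fine for about 100 lines.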

BTW, as I'm using AWK to read and process these files, can I have it treat the = at the end (the /=$/ match) as the end of the record, so it reads through the whole entry until it gets to the = ?

If so, how is this done with awk? I have not been able to find any documentation on it.
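
On reading up to the = : awk splits its input on the record separator RS, which defaults to a newline, so setting RS to "=" makes each record run up to the next = sign. The sketch below only illustrates that mechanism; it folds the line numbers into the record as well, so it is not a finished answer to the append problem. The function name read_records and the file sample_rs.txt are hypothetical.

```shell
# Hypothetical sample data, shortened from the lines quoted above
printf '%s\n' '1 03005 41460 90850' '2' '3 333 41998 91129=' \
              '4' '5 03089 41460 333 51035=' > sample_rs.txt

# read_records is an illustrative name, not part of any tool
read_records() {
    awk 'BEGIN { RS = "=" }       # a record now ends at "=" rather than at a newline
    NF {                          # skip the empty record after the final "="
        gsub(/\n/, " ")           # fold the embedded newlines into spaces
        sub(/^ +/, "")            # drop the space left by the preceding newline
        print $0 "="              # RS is consumed on input, so put the "=" back
    }' "$@"
}

read_records sample_rs.txt
```

A single-character RS works in any POSIX awk. Multi-character or regex values of RS are a gawk extension, so they are best avoided if the script has to run elsewhere.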

Cheers,

---------- Post updated 01-13-12 at 03:27 AM ---------- Previous update was 01-12-12 at 02:41 PM ----------

Hi Guys,

I have tried both scripts, but I was unable to get either of them to work.

mirni,
When I tried yours, it just printed out the same as what I had.

And Jean-Pierre,

It produced syntax errors on the last printLine

END { printLine() }

I removed this and more errors appeared throughout the script :frowning:

Please show us your script code and the errors produced.

Try replacing awk with nawk or gawk (depending on your Unix).

Jean-Pierre.

Hi Jean-Pierre,

I don't have nawk installed, but I do have gawk. However, it came up with the same error as awk:

END printLine() }
             ^ syntax error

thanks
Ian

END { printLine() }

Jean-Pierre.

Yeah,

END { printLine() }

is in there, but it still produces the same error :frowning:

I am trying this from a slightly different approach

awk 'BEGIN {RS="  "} { for (i-1;i<NF;i++) printf $(i) " "} END { print ""}' infile

This kind of works, apart from removing the last set of numbers from the top line and adding a space on the following line.

infile

1 03005 12671 73505 10027 21024 30197 40301 52007 60001 81434 91150
2
3  333 41998 55307 20344 81632 86075=  
4

outfile

1 03005 12671 73505 10027 21024 30197 40301 52007 60001 81434 333 41998 55307 20344 81632 86075= 
2
3  03005 12671 73505 10027 21024 30197 40301 52007 60001 81434 333 41998 55307 20344 81632 86075=
4

As you can see, the outfile has the 91150 missing, and an extra space has appeared at the start of the lines after line one.

Any ideas on this one?

Cheers,
ian

Can you please post the output of

cat -A inputFile  | head

I suspect you have some spaces after the final '=', which is why the awk isn't matching the lines.
Is this

awk '/=$/' input

returning any lines?

You could try to change this

$2~/=$/

to

$2~/= *$/

in my code.
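
A small sketch of why that change matters, using a made-up one-line file (trailing.txt) with two spaces after the '='. The second pattern here is slightly broader than the one suggested above: it also tolerates tabs and a carriage return, on the assumption that the file might have come from a Windows system.

```shell
# Hypothetical test line with two trailing spaces after the "="
printf '3,333 41998 91129=  \n' > trailing.txt

# Prints nothing: the line ends in spaces, not in "="
awk '/=$/' trailing.txt

# Prints the line: optional spaces, tabs or a CR may follow the "="
awk '/=[ \t\r]*$/' trailing.txt
```

If the first command prints nothing on the real data but the second one prints the expected lines, trailing whitespace is the likely culprit.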