awk parsing problem

I need help with a problem that I have not been able to figure out.
I have a file that is about 650K lines. Records are separated by
blank lines, fields separated by newlines. I was trying to make
a report that would add up 2 fields and associate them with a CP.

example output would be something like this:

CP31
----
TCS 10 54087
TCS 342 35173

TOTAL 59260

CP33
----
TCS 8 48790
TCS 286 33614

TOTAL 82404

In a nutshell, I have to sum up the first 2 fields of the EDM blocks and WRC
blocks and then associate them with the TCS & CP they belong to. I would like
to use AWK, and was trying to use arrays, but with no luck. Maybe
multi-dimensional arrays would work, but I have not had any experience with
them. Any help would be very much appreciated.

Thanks in advance.

I attached a txt doc containing part of the file. I hope this makes sense.

Could you post sample input data?

Edit: Where is the attachment?

Sorry, I think I had an error when I first tried to attach the file. Thanks in advance.

awk '{printf "%s\n----\n%s %s %s\n%s %s %s\n\nTOTAL %d\n\n\n",
$1,$95,$96,$98+$99,$179,$180,$182+$183,$98+$99+$182+$183
}' RS= filename

I think you meant TOTAL 89260 and not TOTAL 59260 for
the first record.

Use nawk or /usr/xpg4/bin/awk on Solaris.
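
For readers unfamiliar with the trailing RS= in the command above: assigning the empty string to RS puts awk into paragraph mode, where each run of non-blank lines is one record and a newline always acts as a field separator in addition to FS. A minimal sketch with made-up data (the function name count_fields and the sample blocks are illustrative only, not taken from the attached file):

```shell
# Paragraph mode: RS="" makes each blank-line-separated block one record.
# count_fields is a hypothetical helper name used only for this demo.
count_fields() {
  awk 'BEGIN { RS = "" } { print "record " NR ": first field=" $1 ", NF=" NF }' "$@"
}

# Two made-up blocks separated by a blank line.
printf 'CP31\n10 20\n\nCP33\n5 6 7\n' | count_fields
```

This is why the one-liner addresses values as $95, $98 and so on: it counts whitespace-separated tokens across the whole block, which only works if every block has exactly the same layout.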

radoulov,
first of all, thanks a lot for looking at this for me. Unfortunately it is not working for me, and that is my fault: I don't believe I was clear enough. The file is big and I don't want ALL the fields summed up, just the CP fields and the associated TCS fields, with the first two fields summed up. I hope this is clear.
Please let me know, because I have not gotten anywhere with this. Thanks.
I can post more of the file if this will help.

awk '/^CP/{printf "%s\n----\n%s %s %s\n%s %s %s\n\nTOTAL %d\n\n\n",
$1,$95,$96,$98+$99,$179,$180,$182+$183,$98+$99+$182+$183
}' RS= filename

If the above code doesn't work, post a bigger sample from your datafile.

radoulov,
thanks again for looking at this. I attached a bigger example of the input file (Rev1); you might have to open it wider than 80 columns. At the bottom of the file I put the desired output. I think I have explained what I need more clearly. I have reached the limit of what I can do in awk with this one and would appreciate any help with it.

thanks in advance.

The following gives the output you want based on the sample file provided.

#!/usr/bin/awk -f

BEGIN {
    total = 0;
    cpfound = 0;
    edmfound = 0;
    wrcfound = 0;
}

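# Sum the first two fields of the 6 data lines that follow an EDM header.
# j and sum are listed as extra parameters so that they are local.
function parseEDM(    j, sum)
{
   j = 0;
   sum = 0;
   edmfound = 1;

   while (j < 6) {
      getline;
      sum = sum + $1 + $2;
      j++;
   }

   total = total + sum;
   return sum;
}

# Sum the first two fields of the single data line that follows a WRC header.
function parseWRC(    sum)
{
   sum = 0;
   wrcfound = 1;

   getline;
   sum = sum + $1 + $2;

   total = total + sum;
   return sum;
}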

NF==1 && substr($1,1,2)=="CP" {
    print "";
    print $1;
    print "----";

    total = 0;
    cpfound = 1;
    edmfound = 0;
    wrcfound = 0;
}

NF==2 && substr($1,1,3)=="TCS" && cpfound == 1 {
       field1 = $1;
       field2 = $2;
       getline;
       if (NF==1 && substr($1,1,3)=="EDM") {
          field3=parseEDM();
       }
       if (NF==1 && substr($1,1,3)=="WRC") {
          field3=parseWRC();
       }
       print field1, field2, field3;
       if (edmfound==1 && wrcfound==1) {
           print "";
           print "TOTAL", total;
           edmfound = 0;
           wrcfound = 0;
           total = 0;
           cpfound = 0;
       }
}

Obviously, error handling etc. needs to be added before this is used in a production environment.
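
One concrete piece of that error handling: a plain getline returns 1 on success, 0 at end of file and -1 on error, and the script above ignores that value. If the input ends early, $0 is left unchanged, so the loop in parseEDM would keep adding the same line's fields. A hedged sketch of a checked read (the function name sum_next_line, the HDR marker and the sample input are made up for illustration and are not part of the script above):

```shell
sum_next_line() {
  awk '
    # Add $1 + $2 of the line after each header, but only if getline succeeded.
    /^HDR/ {
        if ((getline) > 0)
            total += $1 + $2
        else
            print "warning: input ended after " $0
    }
    END { print "total", total + 0 }
  ' "$@"
}

# The second header has no data line after it.
printf 'HDR\n3 4\nHDR\n' | sum_next_line
```

The parentheses around getline keep the comparison unambiguous across awk implementations.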

$ ./testawk testfile

CP31
----
TCS 10 54087
TCS 342 35173

TOTAL 89260

CP33
----
TCS 8 48790
TCS 286 33614

TOTAL 82404
$

OK,
changed like this:

awk '
  /^CP/ {    
        print
        f++     
        }

  f && /TCS /   {   
        if (edm) {
                printf "----\n%-8s %-8s\n", tcs, edm
                edm = ""
                }
        tcs = (($1) FS ($2))
        }

  f && /EDM|WRC|wrc/    {  
        getline 
        edm += $1 + $2 
        edmt += $1 + $2 
        }

  f && ! NF     {  
        printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
        f = tcs = edm = edmt = ""
        }
' filename

This is the output I get:

$ cat timj.txt


CDN 07    4
 IMS-SCNT 00000 00000 00000 00000
 IMS-LCNT 000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 SCNT     00000 00025 00000 00000 00000 00000 00000 00000 00031
 LCNT     000000041 001007860 187905607 000891919 102177186 000000000 000000000
          000000000 000000000 000000000 000000000 000000023 000000000 000001679
          000000016 000000309 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
 RT USAGE 00093 00001
 CP       000082307 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 MCR USG  000541595


CDN 08    4
 IMS-SCNT 00000 00000 00000 00000
 IMS-LCNT 000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 SCNT     00000 00025 00000 00000 00000 00000 00000 00000 00031
 LCNT     000000023 001033219 190332475 000919047 104943932 000000000 000000000
          000000000 000000000 000000000 000000000 000000023 000000000 000001697
          000000017 000000306 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
 RT USAGE 00082 00001
 CP       000085873 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 MCR USG  000553997

CP31
 ECMR  05 00000
          000000038 000000000 000000000 000000000 000175275 000175275 000033886
          000033886 000000000 000000000 000000011 000000001 000000095 000431157
          000143147 000000000 000004124 000246868 000184289 000085517 000069108
          000004981 000015056 000000731 000000678 000000000 000000000
 OC    07 00000
          000000283 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000424 000000027 000000195 000000000 000000004 000000006
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000472 000000472 000000472
 CCM   03 00000
  TCSLINK
  TCS  10
  EDM1
          000027214 000026873 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  EDM2
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  TCS 342                                                                                      
   WRC1
          000014880 000020293
   WRC2
          000000000 000000000
   WRC3
          000000000 000000000
   WRC4
          000000000 000000000
   WRC5
          000000000 000000000
   WRC6
          000000000 000000000
   wrc7
          000000000 000000000
   wrc8
          000000000 000000000
   wrc9
          000000000 000000000
   wrc10
          000000000 000000000
   wrc11
          000000000 000000000
   wrc12
          000001345 000002365
   wrc13
          000000000 000000000
   wrc14
          000000000 000000000
   wrc15
          000000000 000000000
   wrc16
          000000000 000000000

CP33
 ECMR  05 00000
          000000042 000000000 000000000 000000000 000167297 000167297 000022079
          000022079 000000000 000000000 000000010 000000001 000000095 000298686
          000148264 000000000 000004122 000168466 000130220 000081446 000066818
          000003491 000014595 000000730 000000678 000000000 000000000
 OC    07 00000
          000000254 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000394 000000022 000000197 000000000 000000001 000000004
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
 CCM   03 00000
  TCSLINK
  TCS   8
  EDM1
          000025112 000023678 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  EDM2
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  TCS 286
   WRC1
          000014267 000019347
   WRC2
          000000000 000000000
   WRC3
          000000000 000000000
   WRC4
          000000000 000000000
   WRC5
          000000000 000000000
   WRC6
          000000000 000000000
   WRC7
          000000000 000000000
   WRC8
          000000000 000000000


$ nawk '
>   /^CP/ { 
>         print 
>         f++
>         }
>
>   f && /TCS /   {
>         if (edm) {
>                 printf "----\n%-8s %-8s\n", tcs, edm
>                 edm = ""
>                 }
>         tcs = (($1) FS ($2))
>         }
>
>   f && /EDM|WRC|wrc/    {
>         getline
>         edm += $1 + $2
>         edmt += $1 + $2
>         }
>
>   f && ! NF     {
>         printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
>         f = tcs = edm = edmt = ""
>         }
> ' timj.txt
CP31
----
TCS 10   54087
TCS 342  38883
Total:   92970

CP33
----
TCS 8    48790
TCS 286  33614
Total:   82404

radoulov & fpmurphy,
I can't thank you both enough for what you guys did. I had to tweak both scripts a little bit, but both are working. I realize I was going about this the wrong way. Both have helped me enormously.

radoulov,
I am having trouble following your logic. It looks like there are "shortcuts" in your script that I am having problems deciphering. Can you explain your script for me?

Thank you both again, so much.

Of course.
First, I would change the code to:

awk '
  /^CP/ { 
        print 
        f++     
        }

  f     { 
        if ($0 ~ /TCS /) { 
                if (edm) { 
                        printf "----\n%-8s %-8s\n", tcs, edm
                        edm = ""
                        }
                tcs = (($1) FS ($2))
                }
        if ($0 ~ /EDM|WRC|wrc/) { 
                getline 
                edm += $1 + $2 
                edmt += $1 + $2 
                }
        if (! NF) { 
                printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
                f = tcs = edm = edmt = ""
                }
        }
' input

The logic is quite simple, here we go:

  /^CP/ { 
        print 
        f++     
        }

For every record that matches the pattern ^CP: print it and increment the value of the variable f. We only need a flag, so you could write flag = "true" instead if you consider it more readable.
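
A bare variable used as a pattern is true when it holds a non-zero number or a non-empty string. A minimal sketch, with made-up START/STOP markers and a hypothetical function name print_between, showing a flag switching a rule on and off:

```shell
print_between() {
  awk '
    /^STOP/  { f = 0 }   # clear the flag: stop printing
    f        { print }   # bare variable as pattern: runs only while f is non-zero
    /^START/ { f = 1 }   # set the flag: later lines are printed
  ' "$@"
}

printf 'a\nSTART\nb\nc\nSTOP\nd\n' | print_between
```

Only the lines b and c are printed, because the rules run in order for every input line and f is set after the START line itself has been tested.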

  f     { 
        if ($0 ~ /TCS /) { 
                if (edm) { 
                        printf "----\n%-8s %-8s\n", tcs, edm
                        edm = ""
                        }
                tcs = (($1) FS ($2))
                }

For every record for which our flag is true, that is, f has a value other than zero or the empty string (the bare f is the whole pattern; at this point we are inside your logical record/block):
+ if the record matches the pattern TCS<space>:
++ if the variable edm is true (see below), print the values of the variables tcs and edm, then unset edm (set it to "", which is false)
++ set the variable tcs to the values of $1, FS and $2 joined together.
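
The assignment tcs = (($1) FS ($2)) is plain string concatenation: awk joins adjacent expressions, and FS defaults to a single space. The %-8s format then left-justifies each value in an 8-character column. A small sketch (the function name show_label, the brackets and the input line are illustrative only):

```shell
show_label() {
  awk '{
    label = (($1) FS ($2))                # "TCS" joined to " " joined to "10"
    printf "[%-8s] [%-8s]\n", label, $3   # left-justify each value in 8 columns
  }' "$@"
}

printf '  TCS  10 54087\n' | show_label
```

The brackets only make the padding visible. Note that the uneven spacing of the input line is normalized, since the label is rebuilt from the individual fields.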

        if ($0 ~ /EDM|WRC|wrc/) { 
                getline 
                edm += $1 + $2 
                edmt += $1 + $2 
                }

+ if the record matches the pattern EDM|WRC|wrc (EDM OR WRC OR wrc), read the next line (getline) and:
++ increment edm and edmt (total) by the sum of $1 and $2.
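
What makes this work is that a plain getline replaces $0 (and with it $1, $2 and NF) with the next input line, and the line it consumed is not tested against the patterns again. A minimal sketch on a few lines shaped like the sample (next_line_sum is a hypothetical name):

```shell
next_line_sum() {
  awk '
    /EDM|WRC|wrc/ {
        name = $1             # remember the header before it is overwritten
        getline               # $0 is now the line after the header
        print name, $1 + $2   # leading zeros are ignored in numeric context
    }
  ' "$@"
}

printf ' WRC1\n 000014880 000020293\n wrc12\n 000001345 000002365\n' | next_line_sum
```

In the sample file this rule also picks up wrc12 (1345 + 2365 = 3710), which is why TCS 342 shows 38883 here against 35173 from the earlier script, which reads only the line after WRC1.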

        if (! NF) { 
                printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
                f = tcs = edm = edmt = ""
                }

+ if the record has no fields (a blank line, the end of your logical record/block), print the values of
tcs, edm and edmt (total), then unset f, tcs, edm and edmt.
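
One caveat follows from this last step: the totals are printed only when a blank line is seen, so a final block that runs straight to end of file would be dropped. A hedged sketch of one way to cover that case with an END rule, on simplified made-up input (the names cp_totals and emit, and the flat layout of the data, are assumptions for the demo and not the real file format):

```shell
cp_totals() {
  awk '
    function emit() {              # print the running total, then reset
        if (f) print cp, total
        f = total = 0
    }
    /^CP/   { emit(); cp = $1; f = 1; next }
    f && NF { total += $1 + $2 }   # data line inside a CP block
    !NF     { emit() }             # blank line ends the block
    END     { emit() }             # also covers a block that ends at EOF
  ' "$@"
}

# The second block is not followed by a blank line.
printf 'CP31\n1 2\n3 4\n\nCP33\n5 6\n' | cp_totals
```

Putting the print-and-reset step in one function keeps the blank-line rule and the END rule from drifting apart.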

I think I got it now. I was getting confused by the first edm statement: I was wondering how it becomes true, until I realized the script keeps going until it finds a blank line, then prints out the variables and resets them at the end. I also did not know that a variable can be used in the pattern part of an awk script. Again, thanks; you have no idea how much I struggled with this.
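
On the arrays the original question asked about: awk has only associative arrays, and a subscript such as sum[cp, tcs] simulates two dimensions by joining the parts with SUBSEP. A minimal sketch, assuming a simplified one-line-per-value input of the form "CP TCS value" (that layout and the name sum_by_cp_tcs are made up for the demo; the real file would still need the parsing shown earlier in the thread):

```shell
sum_by_cp_tcs() {
  awk '
    {
        key = $1 SUBSEP $2                    # same string awk builds for [$1, $2]
        if (!(key in sum)) order[++n] = key   # remember first-seen order
        sum[$1, $2] += $3
    }
    END {
        for (i = 1; i <= n; i++) {
            split(order[i], part, SUBSEP)     # part[1] = CP, part[2] = TCS
            print part[1], "TCS", part[2], sum[order[i]]
        }
    }
  ' "$@"
}

printf 'CP31 10 54087\nCP31 342 35173\nCP31 342 3710\nCP33 8 48790\n' | sum_by_cp_tcs
```

The separate order array is there because the iteration order of for (k in sum) is unspecified, so it cannot be relied on for a report.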