need advice on AWK

Hallo Guys,

I have an sql script which gives me the output below:

LOGIN_ID ACTION_CODE DATE CASE_REFERENCE FRAUD_TYPE_ID ESTIMATE_LOSS
james_j Assign 17/03/2008 0833003470 High Frequency 77513.92
james_j Assign 17/03/2008 0838117486 High Aggregate Cost of Calls 149830.5
james_j Assign 17/03/2008 0838521885 High Aggregate Cost of Calls 112457.79
james_j Open 17/03/2008 0832880151 High Aggregate Cost of Calls 21445.46
james_j Open 17/03/2008 0832854857 High Aggregate Cost of Calls 39037.13
james_j Open 17/03/2008 0833041781 PD Summary alarm 15537.59
james_j Open 17/03/2008 0832122779 High Aggregate Cost of Calls 17705.54
james_j Open 17/03/2008 0832126015 High Value Calls 16792.49
james_j Open 17/03/2008 0833006332 High Value Calls 26393.33
james_j Open 17/03/2008 0832725833 High Aggregate Cost of Calls 16197.94
james_j Open 17/03/2008 0832127222 High Aggregate Cost of Calls 14136.68
james_j Close 01/04/2008 0832725833 High Aggregate Cost of Calls 16197.94
james_j Close 01/04/2008 0832854857 High Aggregate Cost of Calls 39037.13
james_j Open 01/04/2008 0833041781 PD Summary alarm 15537.59
james_j Open 01/04/2008 0832725833 High Aggregate Cost of Calls 16197.94
james_j Open 01/04/2008 0832854857 High Aggregate Cost of Calls 39037.13

  1. First what i wana do is to group all records according to their action code and then add the estimate loss for all records with action code close for example.

Can you please assist.

Regards,

And why don't you use sql for that?

Can someone please give me an answer an not ask me why am i not using SQL?

Something like this?

awk 'END { for (k in x)
printf "ACTION_CODE: %s\n\n%s\n\nTotal: %60.2f\n\n", 
k, x[k], sum[k] }
NR > 1 { s = $2; $2 = ""
x = x ? xRS$0 : $0; sum += $NF }' file

If not, post an example output.

Use nawk or /usr/xpg4/bin/awk on Solaris.

Hallo Radoulov,

Thank you for the responce. One more question though, If the sql statement outputs to a file called xxx_123.txt THEN how do I run your AWK script?

Regards,

awk 'END { for (k in x)
printf "ACTION_CODE: %s\n\n%s\n\nTotal: %60.2f\n\n", 
k, x[k], sum[k] }
NR > 1 { s = $2; $2 = ""
x = x ? xRS$0 : $0; sum += $NF }' xxx_123.txt

Hi,

Your awk script doesnt give me exactly what I want. Have a look below:

$ cat cm_activities_170408:0302.log
############################ ANALYST REPORT ####################################

LOGIN_ID ACTION_CODE DATE
---------------- ------------------------------ ---------
CASE_REFERENCE FRAUD_TYPE_ID ESTIMATE_LOSS
---------------------------------------------------------------- ------------------------------ -------------
james_j Assign 17-MAR-08
0838521885 High Aggregate Cost of Calls 112457.79

james_j Assign 17-MAR-08
0838117486 High Aggregate Cost of Calls 150801.75

james_j Assign 17-MAR-08
0833003470 High Frequency 77513.92

hyman_c Open 19-MAR-08
0838521885 High Aggregate Cost of Calls 112457.79

hyman_c Open 19-MAR-08
0838521885 High Aggregate Cost of Calls 112457.79

hyman_c Assign 19-MAR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 01-APR-08
0838521885 High Aggregate Cost of Calls 112457.79

hyman_c Open 07-APR-08
0838521885 High Aggregate Cost of Calls 112457.79

hyman_c Assign 07-APR-08
0734646922 Very Long Duration Call 4

hyman_c Assign 07-APR-08
0833063410 Very Long Duration Call 4261.97

hyman_c Assign 07-APR-08
0782078038 Very Long Duration Call .19

hyman_c Assign 07-APR-08
0730799413 Very Long Duration Call 50.13

hyman_c Open 07-APR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 07-APR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 08-APR-08
0734646922 Very Long Duration Call 4

hyman_c Open 08-APR-08
0782078038 Very Long Duration Call .19

hyman_c Open 08-APR-08
0734646922 Very Long Duration Call 4

hyman_c Open 08-APR-08
0782078038 Very Long Duration Call .19

hyman_c Open 08-APR-08
0838521885 High Aggregate Cost of Calls 112457.79

hyman_c Open 08-APR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 08-APR-08
0782078038 Very Long Duration Call .19

hyman_c Open 09-APR-08
0833063410 Very Long Duration Call 4261.97

hyman_c Open 09-APR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 09-APR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 09-APR-08
0734646922 Very Long Duration Call 4

hyman_c Open 09-APR-08
0730799413 Very Long Duration Call 50.13

hyman_c Open 09-APR-08
0838521885 High Aggregate Cost of Calls 112457.79

hyman_c Open 10-APR-08
0833063410 Very Long Duration Call 4261.97

hyman_c Open 10-APR-08
0782078038 Very Long Duration Call .19

hyman_c Open 10-APR-08
0782078038 Very Long Duration Call .19

hyman_c Open 10-APR-08
0734646922 Very Long Duration Call 4

hyman_c Open 10-APR-08
0734646922 Very Long Duration Call 4

hyman_c Open 10-APR-08
0833063410 Very Long Duration Call 4261.97

hyman_c Open 10-APR-08
0730799413 Very Long Duration Call 50.13

hyman_c Open 10-APR-08
0730799413 Very Long Duration Call 50.13

hyman_c Open 11-APR-08
0832090883 PD Summary alarm 29401.46

hyman_c Open 11-APR-08
0734646922 Very Long Duration Call 4

nhlaph_pr Assign 15-APR-08
0832124374 PD Summary alarm 72957.64

$
$ cat test_123.ksh
#!/bin/ksh
set -x
#########################################################################
#
awk 'END { for (k in x)
printf "ACTION_CODE: %s\n\n%s\n\nTotal: %60.2f\n\n",
k, x[k], sum[k] }
NR > 1 { s = $2; $2 = ""
x [s]= x [s]? x[s]RS$0 : $0; sum [s]+= $NF }' /minotaur/Log/stats/cm_activities_170408:0302.log > xxx.log

$ cat xxx.log
ACTION_CODE:

Total: 0.00

ACTION_CODE: ------------------------------

---------------- ---------
---------------------------------------------------------------- -------------

Total: 0.00

ACTION_CODE: Assign

james_j 17-MAR-08
james_j 17-MAR-08
james_j 17-MAR-08
hyman_c 19-MAR-08
hyman_c 07-APR-08
hyman_c 07-APR-08
hyman_c 07-APR-08
hyman_c 07-APR-08
nhlaph_pr 15-APR-08

Total: 113.00

ACTION_CODE: FRAUD_TYPE_ID

CASE_REFERENCE ESTIMATE_LOSS

Total: 0.00

ACTION_CODE: High

0838521885 Aggregate Cost of Calls 112457.79
0838117486 Aggregate Cost of Calls 150801.75
0833003470 Frequency 77513.92
0838521885 Aggregate Cost of Calls 112457.79
0838521885 Aggregate Cost of Calls 112457.79
0838521885 Aggregate Cost of Calls 112457.79
0838521885 Aggregate Cost of Calls 112457.79
0838521885 Aggregate Cost of Calls 112457.79
0838521885 Aggregate Cost of Calls 112457.79

Total: 1015520.20

ACTION_CODE: Open

hyman_c 19-MAR-08
hyman_c 19-MAR-08
hyman_c 01-APR-08
hyman_c 07-APR-08
hyman_c 07-APR-08
hyman_c 07-APR-08
hyman_c 08-APR-08
hyman_c 08-APR-08
hyman_c 08-APR-08
hyman_c 08-APR-08
hyman_c 08-APR-08
hyman_c 08-APR-08
hyman_c 08-APR-08
hyman_c 09-APR-08
hyman_c 09-APR-08
hyman_c 09-APR-08
hyman_c 09-APR-08
hyman_c 09-APR-08
hyman_c 09-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 10-APR-08
hyman_c 11-APR-08
hyman_c 11-APR-08

Total: 272.00

ACTION_CODE: ACTION_CODE

LOGIN_ID DATE

Total: 0.00

ACTION_CODE: Very

0734646922 Long Duration Call 4
0833063410 Long Duration Call 4261.97
0782078038 Long Duration Call .19
0730799413 Long Duration Call 50.13
0734646922 Long Duration Call 4
0782078038 Long Duration Call .19
0734646922 Long Duration Call 4
0782078038 Long Duration Call .19
0782078038 Long Duration Call .19
0833063410 Long Duration Call 4261.97
0734646922 Long Duration Call 4
0730799413 Long Duration Call 50.13
0833063410 Long Duration Call 4261.97
0782078038 Long Duration Call .19
0782078038 Long Duration Call .19
0734646922 Long Duration Call 4
0734646922 Long Duration Call 4
0833063410 Long Duration Call 4261.97
0730799413 Long Duration Call 50.13
0730799413 Long Duration Call 50.13
0734646922 Long Duration Call 4

Total: 17277.54

ACTION_CODE: PD

0832090883 Summary alarm 29401.46
0832090883 Summary alarm 29401.46
0832090883 Summary alarm 29401.46
0832090883 Summary alarm 29401.46
0832090883 Summary alarm 29401.46
0832090883 Summary alarm 29401.46
0832090883 Summary alarm 29401.46
0832124374 Summary alarm 72957.64

Total: 278767.86

$

OK,
could you give an example of the output that you want?

<b>I would like my output to be similar to the one below:</b>

LOGIN_ID ACTION_CODE DATE CASE_REFERENCE FRAUD_TYPE_ID ESTIMATE_LOSS
james_j Assign 19/03/2008 832090883 PD Summary alarm 29401.46
james_j Open 19/03/2008 838521885 High Aggregate Cost of Calls 112457.79
james_j Open 01/04/2008 838521885 High Aggregate Cost of Calls 112457.79
james_j Assign 07/04/2008 782078038 Very Long Duration Call 0.19
james_j Assign 07/04/2008 734646922 Very Long Duration Call 4
james_j Assign 07/04/2008 730799413 Very Long Duration Call 50.13
james_j Assign 07/04/2008 833063410 Very Long Duration Call 4261.97
james_j Open 07/04/2008 832090883 PD Summary alarm 29401.46
james_j Open 07/04/2008 838521885 High Aggregate Cost of Calls 112457.79
james_j Open 08/04/2008 782078038 Very Long Duration Call 0.19
james_j Open 08/04/2008 734646922 Very Long Duration Call 4
james_j Open 08/04/2008 832090883 PD Summary alarm 29401.46
james_j Open 08/04/2008 838521885 High Aggregate Cost of Calls 112457.79
james_j Open 09/04/2008 734646922 Very Long Duration Call 4
james_j Open 09/04/2008 730799413 Very Long Duration Call 50.13
james_j Open 09/04/2008 833063410 Very Long Duration Call 4261.97
james_j Open 09/04/2008 832090883 PD Summary alarm 29401.46
james_j Open 09/04/2008 838521885 High Aggregate Cost of Calls 112457.79
james_j Open 10/04/2008 782078038 Very Long Duration Call 0.19
james_j Open 10/04/2008 734646922 Very Long Duration Call 4
james_j Open 10/04/2008 730799413 Very Long Duration Call 50.13
james_j Open 10/04/2008 833063410 Very Long Duration Call 4261.97
james_j Open 11/04/2008 734646922 Very Long Duration Call 4
james_j Open 11/04/2008 832090883 PD Summary alarm 29401.46
james_j Open 17/04/2008 824115119 PD Summary alarm 118856.9
james_j Closed 17/04/2008 824115118 PD Summary alarm 118856.9

Total Loss for Open Tickets 807392.27
Total Loss for Closed Tickets 118856.9
Total Loss for Assigned Tickets 33717.75

OK,
reading your post it's not clear if one row = one line.
Try to attach a sample from your data (or use code tags).

If the above is not true, you should change the sql (if the client is sqlplus: adjust the colum format and the linesize).

Hi Radoulov,

I have decided to calculate the sum of the total_estimate_loss using sql.

Thnx for all your help.

Ta,:b:

This should give the expected output:

awk ' NR>1{arr[$2]+=$NF}
END{for(i in arr){printf("Total lost for %s Tickets %10.2f\n", i, arr)}}' file

cat awkadvice.d | awk 'BEGIN {
Assign=0;
Open=1;
Closed=2;
sum[Assign]=0;
sum[Open]=0;
sum[Closed]=0;
}

    \{
      print $0
      if \($2 == "Assign" \) sum[Assign]=sum[Assign]\+$NF;
      if \($2 == "Open" \) sum[Open]=sum[Open]\+$NF;
      if \($2 == "Closed" \) sum[Closed]=sum[Closed]\+$NF;
    \}

    END \{
      print
      print "Total estimated cost for Assign is " sum[Assign]
      print "Total estimated cost for Open is " sum[Open]
      print "Total estimated cost for Closed is " sum[Closed]

    \}'

aju_cup,

The OP decided to use sqlplus to do the job, but have you tried to get the desired output with your script?
I don't!?

Regards

awk '{
arr[$2]=sprintf("%s\n%s",arr[$2],$0)
sum[$2]=sum[$2]+$NF
}
END{
for(i in arr)
{
	print arr
	print "total -->" sum
}
}' filename