Awk: Summing values with group criteria

Hi Guys,

I have a text file with ";" like separator

F1;F2;F3;F4;F5
444;100041;IT;GLOB;1800000000
444;100041;TM;GLOB;1000000000
444;10300264;IT;GLOB;2000000000
444;10300264;IT;GLOB;2500000000

I have to sum the cullums F5 for same F2 and F3 collums
The result must be:

444;100041;IT;GLOB;1800000000
444;100041;TM;GLOB;1000000000
444;10300264;IT;GLOB;4500000000

Thanks for your support.
Regards
Gianluca

code:-

nawk -F";"  -v OFS=";" '
NF{a[$2";"$3]+=$5 ; b[$2";"$3]=$1";"$4}
END{ for (i in a) {print i,b,a } }
'  input_file | sort | nawk -F";" -v OFS=";" '{print $3,$1,$2,$4,$5}'

BR

---------- Post updated at 11:45 AM ---------- Previous update was at 11:33 AM ----------

or better one and more elegant:-

nawk  -F";"  -v OFS=";" '
NF{a[$2";"$3]+=$5 ; b[$2";"$3]=$1";"$2";"$3";"$4}
END{ for (i in a) {print b,a } }
'  input_file | sort > out_file

BR :D:D:D:D:D
;););):wink:

awk 'BEGIN{FS=OFS=";"}NR>1{a[$2FS$3]+=$NF;$NF="";b[$2";"$3]=$0}END{for(i in a)print ba}' file
awk -F';' '{A[$2$3]=$1FS$2FS$3FS$4;B[$2$3]+=$5} END{for (i in A) printf "%s%.0f\n",AFS,B}' infile |sort -nt';'
444;100041;IT;GLOB;1800000000
444;100041;TM;GLOB;1000000000
444;10300264;IT;GLOB;4500000000

-or-

awk -F';' '{B[$2$3]+=$5;$5="";A[$2$3]=$0} END{for (i in A) printf "%s%.0f\n",A,B}' infile |sort -n|tr ' ' ';'

Hi guys,

Many thanks for your answers & solution.
I have a little issue. If the source trace change as is

444;100041;IT;GLOB;1800000000;;;;;I
444;100041;TM;GLOB;1000000000;;;;;I
444;10300264;IT;GLOB;4500000000;;;;;I

the awk drop the chars after the number ";;;;;I"

What i have to change?
Thanks and regards
Gianluca

Try this:

awk 'BEGIN{OFS=FS=";"}{B[$2$3]+=$5;$5="@";A[$2$3]=$0} END{ for (i in A) {sub(/@/,sprintf("%.0f",B),A); print A}}' infile | sort -nt';'

-or-

awk -F';' '{B[$2$3]+=$5;$5="@";A[$2$3]=$0} END{ for (i in A) {sub(/@/,sprintf("%.0f",B),A); print A}}' infile |sort -n|tr ' ' ';'

Use gawk, nawk or /usr/xpg4/bin/awk on Solaris.

awk -F\; 'END { for (_ in _0) {
  l = split(_0[_], t); t[5] = _5[_]
  for (i=1; i<=l; i++) 
    printf "%s", t (i == l ? RS : FS)  
    }      
  }  
{ _5[$2,$3] += $5; _0[$2,$3] = $0 }   
' infile

---------- Post updated at 01:03 PM ---------- Previous update was at 12:54 PM ----------

If the file is already ordered:

awk -F\; 'END { print r }
!_[$2,$3]++ && NR > 1 { 
  print r; x = 0 
  }
{ $5 = x += $5; r = $0 }
' OFS=\; infile
nawk '
    (NR == 1){print;next}
    (NR == FNR){ln[$2,$3]+=$5;next}
    (FNR != 1){
        if(ln[$2,$3]){
            $5 = ln[$2,$3]
            ln[$2,$3] = ""
            print
        }
    }
' FS=";" OFS=";" file file