My File looks like:
"|" -> Field separator
A|B|C|100|1000
D|E|F|1|2
G|H|I|0|7
D|E|F|1|2
A|B|C|10|10000
G|H|I|0|7
A|B|C|1|100
D|E|F|1|2
I need to do a SUM on Col. 5 and Col.6 by grouping on Col 1,2 & 3
My expected output is:
A|B|C|111|11100
D|E|F|3|6
G|H|I|1|16
Can this be done using awk? I have done this kind of SUM for a single column but I'm stuck up when I'm doing SUM for two columns
awk -F '|' '{k = $1"|"$2"|"$3; x[k]+=$4; y[k]+=$5} END{for(k in x){print k"|"x[k]"|"y[k]}}' file
This is just to help you get started. For summing beyond 5 columns, use a loop appropriately.
1 Like
If your real file contains many columns then try
awk -F"|" '{
x=$1 FS $2 FS $3
for( i=column_start; i<=column_end; i++)
A[x FS i] = A[x FS i] ? A[x FS i] + $i : $i
}
END{
s = gsub( x, FS)
for( i in A ){
x = ""; split(i,S,FS)
for( j=1; j<=s; j++ )
x = x ? x FS S[j] : S[j]
if( !U[x]++ )SS[++k]=x
}
for(i=1;i<=k;i++){
printf SS OFS
for( j=column_start; j<=column_end; j++ )
printf j < column_end ? A[SS FS j] OFS : A[SS FS j] RS
}
}' column_start="4" column_end="5" OFS="|" file
A|B|C|111|11100
D|E|F|3|6
G|H|I|0|14
change column_start="x"
and column_end="x"
according to your need.