Add the values in second and third columns with group by on first column.

angshuman · February 14, 2013, 6:04am

Hi All,

I have a pipe seperated file. I need to add the values in second and third columns with group by on first column.

MYFILE_28012012_1115|47|173.90
MYFILE_28012012_1115|4|0.00
MYFILE_28012012_1115|6|22.20
MYFILE_28012012_1116|47|173.90
MYFILE_28012012_1116|4|0.00
MYFILE_28012012_1116|6|22.20
MYFILE_28012012_1117|47|173.90
MYFILE_28012012_1117|4|0.00
MYFILE_28012012_1117|6|22.20

I expect the result to be as below:

MYFILE_28012012_1115|57|196.11
MYFILE_28012012_1116|57|173.90
MYFILE_28012012_1117|57|173.90

Any help on this will be highle appreciated

Thanks
Angshuman

pamu · February 14, 2013, 6:09am

Try

awk -F "|" '{A[$1]+=$2;B[$1]+=$3;}END{for(i in A){print i,A,B}}' OFS="|" file

angshuman · February 14, 2013, 7:17am

Hi Pamu,

Thank you for your prompt help. I just need to make a small change to it so that the last filed in the out put always displays two decimal places.

May I request your help on this and can I request you to explain the code?

Thanks & Regards
Angshuman

pamu · February 14, 2013, 7:31am

try this..

awk -F "|" '{A[$1]+=$2;B[$1]+=$3;}END{for(i in A){printf "%s|%s|%0.02f\n", i,A,B}}' file

awk -F "|" '{    # Define | as a Field Separator
A[$1]+=$2;        # Add $2 to the array A having index of $1
B[$1]+=$3;}        # Add $3 to the array B having index of $1
END{for(i in A)    # Get all the indexes of array A in i

{printf "%s|%s|%0.02f\n", i,A,B}}' file        # Print i(means $1),A(means sum of all $2 for the particular $1),B(means sum of all $3 for the particular $1). Used 0.02f here for floating purpose.

Hope this helps you.

Regards,

pamu