Sum the values in the column using date column

I have a file whose values need to be summed up using the date column.

Input:

2017/01/01 a 10
2017/01/01 b 20
2017/01/01 c 40
2017/01/01 a 60
2017/01/01 b 50
2017/01/01 c 40
2017/01/01 a 20
2017/01/01 b 30
2017/01/01 c 40

2017/02/01 a 10
2017/02/01 b 20
2017/02/01 c 30
2017/02/01 a 10
2017/02/01 b 40
2017/02/01 c 60
2017/02/01 a 10
2017/02/01 b 5
2017/02/01 c 15

2017/03/01 a 5
2017/03/01 b 15
2017/03/01 c 20
2017/03/01 d 10
2017/03/01 a 20
2017/03/01 b 25
2017/03/01 c 30
2017/03/01 d 20
2017/03/01 a 5
2017/03/01 b 15
2017/03/01 c 20

Output:

2017/01/01 a 90
2017/01/01 b 100
2017/01/01 c 120
2017/02/01 a 30
2017/02/01 b 65
2017/02/01 c 105
2017/03/01 a 30
2017/03/01 b 55
2017/03/01 c 50
2017/03/01 d 30

Welcome to the forum.

Any attempts / ideas / thoughts from your side?

It also helps us if you tell us what operating system and shell you're using whenever you post questions in this forum.

Are you sure that the output line:

2017/03/01 c 50

has the value that you want in the 3rd field?

It worked for me.

awk '{a[$1" "$2]+=$3+$4}END{for (i in a){print i,a[i]}}' filename

So, apart from the strange reference to $4 (a field that does not exist in your input, though it does no harm), what don't you like about your own approach?


As RudiC said, it looks like:

awk '{a[$1" "$2]+=$3+$4}END{for (i in a){print i,a[i]}}' filename

should work, but since there are only 3 fields in any of your input lines, you should get the same results with:

awk '{a[$1" "$2]+=$3}END{for (i in a){print i,a[i]}}' filename
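One thing to keep in mind with the one-liner above: awk does not guarantee any particular order for "for (i in a)", so the result may not come out sorted the way the desired output in post #1 is. Also, if the blank lines between the date groups are really in the file (that is only an assumption on my part), each one adds to an entry with an empty key. Here is a rough sketch that skips short lines and sorts the result; the function name sum_by_key is made up for this example:

```shell
# sum_by_key: hypothetical helper, not part of any standard tool.
# Sums field 3 for every distinct (field 1, field 2) pair.
sum_by_key() {
    # NF >= 3 skips blank or short lines (an assumption about the input).
    # sort is needed because "for (k in sum)" visits keys in no fixed order.
    awk 'NF >= 3 { sum[$1 " " $2] += $3 }
         END { for (k in sum) print k, sum[k] }' "$@" | sort
}

# A few lines taken from the sample input, shuffled, with one blank line.
printf '%s\n' \
    '2017/02/01 b 20' \
    '2017/01/01 a 10' \
    '' \
    '2017/02/01 b 40' \
    '2017/01/01 a 60' \
    '2017/02/01 b 5' \
    '2017/01/01 a 20' | sum_by_key
```

For those lines it should print "2017/01/01 a 90" followed by "2017/02/01 b 65". With a real file you would call it as: sum_by_key filename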

Note, however, that with the sample input you provided in post #1 in this thread, the output from the input lines:

2017/03/01 c 20
2017/03/01 c 30
2017/03/01 c 20

seems to me like it should be:

2017/03/01 c 70

instead of:

2017/03/01 c 50
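
If you want to double-check a single total like this one, a quick way is to sum only the matching lines. This is just a sketch using the three lines quoted above; the variable name c_total is made up for the example:

```shell
# Sum field 3 for the 2017/03/01 "c" lines only.
c_total=$(printf '%s\n' \
    '2017/03/01 c 20' \
    '2017/03/01 c 30' \
    '2017/03/01 c 20' |
    awk '$1 == "2017/03/01" && $2 == "c" { total += $3 } END { print total }')

echo "$c_total"
```

That prints 70, since 20 + 30 + 20 = 70.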

Thanks, Don. It was a typo; I tried it without $4 and it worked as expected.