I need to sum the values in field 5 of a data file that contains a file listing. The 5th field is the size of each file; here are some sample values.
1,775,947,633
4,738
7,300
16,610
15,279
0
0
I tried the following code in a shell script.
awk '{sum+=$5} END{print sum}' mylogfile.log
But it does not add up the numbers correctly. Each number is only counted up to its first comma, so the result is the sum of those leading parts.
I also tried doing a substitution before summing (as below), but that did not give the expected result either.
awk '{sum+=gsub(",", "", $5)} END{print sum}' mylogfile.log
Any suggestions from someone who has run into this?
Hi, did I understand correctly?
awk -F, 'NR == 5 {for(i = 1; i <= NF; i++) sum+=$i; print sum}' file
294
--- Post updated at 10:30 ---
awk '{sum = 0; t = split($5, arr, ","); for(i = 1; i <= t; i++) sum+=arr[i]; print sum}' file
fixed
Not exactly. To explain further, the following is the input data.
1,775,947,633
4,738
7,300
16,610
15,279
0
0
Expected output is (sum of all the numbers):
1775991560
I am trying to achieve this using awk.
--- Post updated at 02:14 PM ---
Solved this by correcting my arguments to gsub, as follows.
awk '{gsub(/,/,"",$5);sum+=$5} END{print sum}' file
Result:
1775991560
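For anyone who wants to reproduce this, here is a minimal sketch. The filler in fields 1 to 4 (`f1` .. `f4`) and the names `sample` and `sum_sizes` are made up for illustration; only the fifth field carries the sizes quoted in this thread.

```shell
# Hypothetical input: fields 1-4 are filler, field 5 holds the sizes from this thread.
sample='f1 f2 f3 f4 1,775,947,633
f1 f2 f3 f4 4,738
f1 f2 f3 f4 7,300
f1 f2 f3 f4 16,610
f1 f2 f3 f4 15,279
f1 f2 f3 f4 0
f1 f2 f3 f4 0'

# Strip the thousands separators from field 5 in place, then add the cleaned value.
sum_sizes() {
  awk '{ gsub(/,/, "", $5); sum += $5 } END { print sum }'
}

printf '%s\n' "$sample" | sum_sizes
```

With this input the function prints 1775991560, the expected total given above.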
awk '{gsub(",", "", $5); sum+=$5} END {print sum}' file
Or maybe it is not the "$5" field:
awk '{gsub(",", ""); sum+=$0} END {print sum}' file
Assuming you showed only the $5 values, it is
awk '{gsub(",", "", $5); sum+=$5} END{print sum}' mylogfile.log
gsub() returns the number of substitutions performed, not the resulting string. The result is stored back in the target variable, here $5.
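A small sketch of that behaviour, assuming a made-up listing line in which only field 5 matters (`line` and `count_and_value` are hypothetical names):

```shell
# Hypothetical listing line; field 5 is the size with thousands separators.
line='-rw-r--r-- 1 user group 1,775,947,633 big.iso'

# n receives the number of commas removed; the cleaned text lives in $5.
count_and_value() {
  printf '%s\n' "$line" | awk '{ n = gsub(",", "", $5); print n, $5 }'
}

count_and_value
```

This prints `3 1775947633`: three substitutions, and the comma-free value in $5. It also shows why `sum+=gsub(",", "", $5)` goes wrong: it adds up the comma counts rather than the sizes.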
Because modification of an input field like $5 causes a reformatting of $0, it sometimes makes sense to have an extra variable.
A demonstration:
awk '{x=$5; gsub(",", "", x); sum+=x; print} END{print sum}'
Compare with
awk '{gsub(",", "", $5); sum+=$5; print} END{print sum}'
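To make the difference visible, here is a sketch on a single made-up line with several spaces between fields (the names `line`, `keep_record` and `rebuild_record` are just for illustration):

```shell
# Hypothetical line with three spaces between fields.
line='a   b   c   d   4,738   notes.txt'

# Work on a copy: $0 is left untouched, so print shows the original line.
keep_record() {
  printf '%s\n' "$line" | awk '{ x = $5; gsub(",", "", x); print }'
}

# Modify $5 directly: awk rebuilds $0, joining the fields with OFS (one space by default).
rebuild_record() {
  printf '%s\n' "$line" | awk '{ gsub(",", "", $5); print }'
}

keep_record
rebuild_record
```

The first call prints the line exactly as it came in; the second prints `a b c d 4738 notes.txt`, with the original spacing collapsed. If the printed record matters, the extra variable is the safer choice.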