There are two patterns here, but with different values. The two patterns are "session_opened" and "session_closed". I expect there will be many more patterns.
What I want to do is, whenever a pattern is duplicated, add up the numbers from all of its occurrences, so that the refined output looks like this:
session_opened=168
session_closed=175
I don't want to do a "uniq" here, in case there are duplicates with exactly the same number. I want to make sure I add up ALL the values of all the patterns.
I would like to do this in awk, but I can't seem to come up with any ideas to get this done.
This worked beautifully. Any way you can explain what each line is doing for me, please? I know the first line specifies the delimiter, but how are the numbers for each pattern being added? I do a lot of for loops like this in bash, and I'd love to be able to do them all in awk.
The following may help; please let me know if you have any questions about it.
awk -F'=' '      ##### Use = as the field separator.
{
  c[$1] += $2    ##### c is an array indexed by the first field ($1). += adds the second field ($2) to the total already stored under that index, so each distinct $1 ends up with the sum of all of its $2 values.
}
END {            ##### The END block runs once, after all input has been read.
  for (i in c)   ##### i is a variable that takes each index of array c in turn (each distinct first field).
    printf("%s%s%s\n", i, FS, c[i])   ##### Print the index, then FS (the field separator, =), then c[i], the total stored for that index.
}' file          ##### file is the input file.
You sum the $2 values per $1 value.
That means you need one variable per distinct $1; ideally this is a $1-addressed array, i.e. $1 is the array key.
The array stores the sum of the $2 values, i.e. each $2 value is added to the element for its key.
Because it is not known in advance how many values will be added, you need an END section to print the array keys and their values once all the input has been read.
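To make that concrete, here is a small sketch you can paste into a shell. The four input lines are made up for illustration (they are not the original poster's data, just values chosen so the totals match the desired output), and the names sum and k are arbitrary. The order in which for (k in sum) visits the keys is unspecified in awk, so the output is piped through sort to make it predictable.

```shell
# Hypothetical sample input: the values are invented so the totals match the desired output.
result=$(printf '%s\n' \
    'session_opened=100' \
    'session_closed=75' \
    'session_opened=68' \
    'session_closed=100' |
awk -F'=' '
    { sum[$1] += $2 }                          # add each value to the total kept under its key
    END { for (k in sum) print k FS sum[k] }   # after the last line, print key=total for every key
' | sort)

printf '%s\n' "$result"
```

This should print session_closed=175 followed by session_opened=168. A bash loop doing the same job would need an associative array or one variable per pattern; awk creates the array elements on first use, which is why the whole thing fits in two lines.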
Even with an original Bourne shell, back in the days before test was a shell built-in, the -ne comparison operator was for comparing numeric values, not strings. For strings, the not-equal comparison operator is and was != . And I assume the final_total above in red was intended to be grand_total .
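As a quick illustration of the difference, here is a minimal sketch. The variable names and the values 007 and 7 are made up for the example; they are not taken from the script being discussed.

```shell
x=007
y=7

# -ne compares the operands as integers: 007 and 7 are the same number.
if [ "$x" -ne "$y" ]; then numeric="different"; else numeric="equal"; fi

# != compares the operands as strings: "007" and "7" are different strings.
if [ "$x" != "$y" ]; then string="different"; else string="equal"; fi

printf 'numeric: %s\nstring: %s\n' "$numeric" "$string"
```

So the same pair of values is "equal" under -ne and "different" under != , which is why descriptions such as session_opened have to be compared with != .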
With POSIX conforming shells (and small input files), sort and shell might be faster than awk . You might want to try the following:
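#!/bin/ksh
sub_total=0
grand_total=0
prev_desc=
# Sorting groups identical descriptions together, so a change of description ends a group.
sort file | while IFS="=" read -r desc amount
do  if [ "$desc" != "$prev_desc" ]
    then # Print the finished group; before the first line there is nothing to print yet.
        if [ -n "$prev_desc" ]
        then printf '%s=%d\n' "$prev_desc" "$sub_total"
        fi
        grand_total=$((grand_total + sub_total))
        sub_total=0
        prev_desc=$desc
    fi
    sub_total=$((sub_total + amount))
done
# Print the last group, then the grand total.
printf '%s=%d\n' "$prev_desc" "$sub_total"
grand_total=$((grand_total + sub_total))
printf 'Grand Total=%d\n' "$grand_total"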
Although written and tested using a Korn shell, the above script should work with any POSIX-conforming shell. And, with a 1993 or later version of the Korn shell, if you change all three occurrences of %d in the above script to %.2f , this script can also handle amounts presented with or without a decimal point and up to 2 digits after the decimal point (instead of just processing whole numbers).
Don, I think that by piping into the while loop you force it into a subshell, so at the end the variables are not updated!?
Only ksh derivatives run the last part of a pipe (here the while loop) in the main shell.
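To see the effect for yourself, here is a small sketch; the three input lines and the variable names are made up. The first loop reads from a pipe, the second reads the same lines from a here-document, which involves no pipeline and therefore runs the loop in the current shell.

```shell
# Pipe version: in bash and dash the loop runs in a subshell, so the
# increments are lost when the loop ends; ksh keeps them.
count=0
printf 'a\nb\nc\n' | while read -r line
do
    count=$((count + 1))
done
piped_count=$count

# Here-document version: the loop runs in the current shell.
count=0
while read -r line
do
    count=$((count + 1))
done <<EOF
a
b
c
EOF
heredoc_count=$count

printf 'after pipe: %d\nafter here-document: %d\n' "$piped_count" "$heredoc_count"
```

The second number is 3 in every shell I would expect you to meet. The first is 3 in ksh but 0 in bash and dash, and that is exactly what would happen to sub_total and grand_total in the script above.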
Ouch. Yes, mostly. I sometimes forget why I like ksh so much. The standards don't force a subshell to be created in this case, but they do allow a subshell to be used in this case. So, the script I suggested will work with a Korn shell and some other shells, but it will not work with some other shells (including bash ). To make it portable to any POSIX-conforming shell, the easy way would be to go back to using a temp file: