I have these files:
cat file1
15
88
44
667
33
4
cat file2
445 66 77 3 56
awk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' file1 file2 file3
Is it possible to do it without awk? Just to add something to my code?
Not a fan of awk?
t=0
for f in file1 file2 file3; do
    while read -r line || [ -n "$line" ]; do
        for num in $line; do
            (( t += num ))
        done
    done < "$f"
done
echo "$t"
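To check that the pure-shell loop really works, here is a self-contained sketch using the file contents from the question (file3 isn't shown in the thread, so only file1 and file2 are used here):

```shell
# Recreate the two sample files from the question.
printf '15\n88\n44\n667\n33\n4\n' > file1
printf '445 66 77 3 56\n' > file2

# Sum every whitespace-separated number using only shell built-ins.
t=0
for f in file1 file2; do
    while read -r line || [ -n "$line" ]; do
        for num in $line; do
            (( t += num ))
        done
    done < "$f"
done
echo "$t"    # prints 1498 (851 from file1 + 647 from file2)
```

The `|| [ -n "$line" ]` clause makes sure a final line without a trailing newline is still counted.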
Lol, yes, not a fan. What about echo 5 6 | sh my_code?
script.ksh
for num in $(cat $*); do ((t+=num)); done; echo $t
... or ...
awk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' $*
echo 5 6 | sh script.ksh
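The trick is that with no file arguments, `cat $*` falls back to reading stdin, so the same script handles both files and pipes. A sketch of script.ksh, adjusted to POSIX `$((...))` arithmetic so it also runs under a plain sh (the original used ksh's `((...))`):

```shell
# Write out a POSIX-sh version of the thread's script.ksh.
cat > script.ksh <<'EOF'
t=0
# With no arguments, $* is empty and cat reads stdin instead.
for num in `cat $*`; do t=$((t + num)); done
echo $t
EOF

echo 5 6 | sh script.ksh       # prints 11
# sh script.ksh file1 file2    # would sum the files instead
```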
Thank you:)
I try to avoid loops in awk to speed things up. If you have GNU awk, you can do this:
awk '{a+=$1} END {print a}' RS=" |\n" file?
1779
If you like to store this into a variable do this:
var=$(awk '{a+=$1} END {print a}' RS=" |\n" file?)
PS, if you have awk on your system, why not use it?
I don't have awk, and I am not a fan of awk. Even now I don't understand that expression you wrote to me... I prefer loops and statements. More clear to me.
RS=" |\n"
makes the data in the file come out on separate lines, like
1 2 3
changes to
1
2
3
a+=$1
adds every line's value to the variable a
print a
prints the variable a
file?
matches any file named file followed by a single character, e.g. file1 through file9
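To see the record splitting in action, here is a small sketch (a regex record separator like this needs gawk or mawk; a strictly POSIX awk may treat RS as a single character):

```shell
# Each space- or newline-separated token becomes its own record,
# so $1 is each number in turn and they all sum into a in one pass.
printf '1 2 3\n' | awk '{a+=$1} END {print a}' RS=' |\n'    # prints 6
```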
What system are you on?
See my post in this forum, "Finding an average".
At the end it divides the sum to find the average.
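The linked post isn't quoted here, but the sum-then-divide idea can be sketched with plain awk (no regex RS needed); the input numbers below are the file1 values from this thread:

```shell
# Sum column 1 and count the lines; divide in END for the average.
# file1 holds 15 88 44 667 33 4 (sum 851 over 6 lines).
printf '15\n88\n44\n667\n33\n4\n' | awk '{s+=$1; n++} END {if (n) print s/n}'
# prints 141.833 (851 / 6, rounded by awk's default OFMT, %.6g)
```

The `if (n)` guard avoids a divide-by-zero on empty input.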
While there may be combinations of AWK implementation and operating system on which your suggestion is faster, I compared it against its predecessor on two combinations and yours was slower every time.
$ seq 1000000 | paste - - - - - - - - - - > data
$ wc data
100000 1000000 6888896 data
$ head -n5 data
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
$ tail -n5 data
999951 999952 999953 999954 999955 999956 999957 999958 999959 999960
999961 999962 999963 999964 999965 999966 999967 999968 999969 999970
999971 999972 999973 999974 999975 999976 999977 999978 999979 999980
999981 999982 999983 999984 999985 999986 999987 999988 999989 999990
999991 999992 999993 999994 999995 999996 999997 999998 999999 1000000
For each of the following results, the best of 5 runs was chosen.
Cygwin/GAWK 4.1.0:
$ time gawk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' data
500000500000
real 0m1.359s
user 0m1.327s
sys 0m0.015s
$ time gawk '{a+=$1} END {print a}' RS=' |\t|\n' data
500000500000
real 0m2.797s
user 0m2.796s
sys 0m0.030s
Linux/MAWK 1.3.3:
$ time mawk '{for(i=1;i<=NF;i++)t+=$i} END {print t}' data
5e+11
real 0m0.753s
user 0m0.640s
sys 0m0.032s
$ time mawk '{a+=$1} END {print a}' RS=' |\t|\n' data
5e+11
real 0m1.346s
user 0m1.268s
sys 0m0.012s
In my opinion, unless there is a confirmed performance issue and unless the AWK implementation is known, unqualified AWK optimization tips are usually a bad idea (doubly so when advising a novice who is more likely to blindly internalize the advice).
Different awk implementations, and even different versions of the same implementation, implement differing sets of optimization strategies. One example I ran into recently: gawk lazily recomputes $0. As you probably know, POSIX requires recomputing $0 whenever a field is modified. gawk will not perform that recomputation until $0 is referenced (if at all). That optimization in effect:
$ time gawk '{for (i=1;i<=NF;i++) $i=""}' data
real 0m0.594s
user 0m0.593s
sys 0m0.030s
$ time mawk '{for (i=1;i<=NF;i++) $i=""}' data
real 0m1.039s
user 0m0.900s
sys 0m0.060s
Even though it is MAWK that has the speedy reputation, this version of GAWK is much faster because it doesn't recompute $0 after each $i="" (since $0 is never referenced after a field modification, it is never recomputed).
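That required recomputation is easy to observe directly: any field assignment marks $0 stale, and the next reference to $0 rebuilds it from the fields joined by OFS. A minimal sketch:

```shell
# The input has double spaces; assigning $2 to itself (a no-op on the
# field's value) still forces awk to rebuild $0 with the default OFS,
# a single space, when $0 is printed.
echo 'a  b  c' | awk '{$2=$2; print $0}'    # prints "a b c"
```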
Regards,
Alister
This was very interesting, and an eye opener. I had never tested this; I just assumed it might be slower to run things in a loop. This proves I may be wrong.
Thanks for taking time to test.