Gents,
I wrote the code below to generate a report. The code works fine, but I believe it could be made shorter with a better method.
If you can help me produce the same output with improved code, that would be great.
Here is my code.
# get diff in time
awk '{$9=$8-prev8;prev8=$8;print $9/1000000}' tmp1 OFS="\t" |
awk 'NR>1{print}' | awk '{a=int($1); print a}' OFS="\t" > tmp2
# get distance
awk '{$9=$6-prev6;prev6=$6;print}' tmp1 OFS="\t" | awk '{$10=$7-prev7;prev7=$7;print}' OFS="\t" |
awk '{a=(sqrt(($9)^2+($10)^2)); print a/1000}' | awk 'NR>1{print}' | awk '{a=int($1); print a }' OFS="\t" > tmp3
paste tmp3 tmp2 > tmp4
# filter by conditions
vpff=`awk '$2>=18 { ++count } END{ print count +1 }' tmp4`
vpds4a=`awk '$1>=2 && $1<=5 && $2<18 { ++count } END{ print count }' tmp4`
vpds4b=`awk '$1>5 && $1<=11 && $2<18 { ++count } END{ print count }' tmp4`
vpds4=$(($vpds4a+vpds4b))
vpdsss=`awk '$1>11 && $2<18 { ++count } END{ print count }' tmp4`
printf " INFO1: $vpff \n" > tmp5
printf " INFO2: $vpds4\n" >> tmp5
printf " INFO3: $vpdsss \n" >> tmp5
# percentage of the total for each line
awk 'FNR==NR{sum+=$2;next}; {printf ("%s %4d %4.1f\n", $1,$2,($2/sum)*100)}' tmp5{,} > tmp6
# report
awk 'BEGIN{
printf ("\t-------------------------------------------\n")
print ("\tCode \t Total-VPs \t Total-PCT")
printf ("\t-------------------------------------------\n")
}
{
sum2 += $2;
printf ("\t%-15s\t%9d\t%8.1f\n",$1,$2,$3)
}
END {
printf ("\t-------------------------------------------\n")
printf ("\tTotal:\t%17d\n",sum2)
printf ("\t-------------------------------------------\n")
}' tmp6
The input file is attached.
Thanks and regards.
zaxxon
February 6, 2017, 8:19am
2
You already have 317 posts in this forum and I bet you got lots of hints to your questions in the past. What would be the most obvious "bad habit" in this script?
RudiC
February 6, 2017, 8:28am
3
I was about to ask the same as zaxxon did, as I'm surprised that none of the help given in the past seems to have fallen on fertile ground. Until a decent answer is posted, I'll withhold the "one single awk" proposal I came up with.
Gents,
Thanks for your comments. I have learned a lot from your advice. I posted my code to find more alternatives from the experts.
I appreciate your help.
zaxxon asked: What would be the most obvious "bad habit" in this script?
I believe that instead of creating many output files, this could be done with variables.
RudiC
February 6, 2017, 9:53am
5
Yes, true, partly, but: is that all? What did you learn from e.g.
?
I'll cut to the chase and be a bit more generous -- you are reprocessing your input files over, and over, and over, and over, and over, and over, and over, when you could have done so just once or twice.
Your code is too big a mess to replace wholesale, especially since we know nothing about your input, but 'filter by conditions' is particularly egregious. You are allowed to put more than one statement in an awk program! You could have done 5 times as much work at once. Here is pseudocode.
read vpff vpds4a vpds4b vpds4 vpdsss <<EOF
$( awk '
awk-range1 { count1++ }
awk-range2 { count2++ }
awk-range3 { count3++ }
awk-range4 { count4++ }
awk-range5 { count5++ }
END { print count1+0, count2+0, count3+0, count4+0, count5+0; }' tmp4 )
EOF
The idea is to have awk print a line like "5 7 3 9 12" which gets dumped into read and split among its variables.
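A runnable sketch of that idea, filling in the ranges from the original post. The sample data written to the temporary file is hypothetical and only stands in for tmp4 (column 1 is the distance, column 2 the time difference); the counter names c1..c4 are made up for illustration. Since vpds4 is only the sum of two counters, four values are read and the sum is done in the shell afterwards.

```shell
# Hypothetical stand-in for tmp4: distance in column 1, time difference in column 2.
tmp4=$(mktemp)
cat > "$tmp4" <<'DATA'
3 20
4 5
7 3
12 1
1 2
15 30
DATA

# One awk pass counts all ranges; "+ 0" prints 0 instead of an empty
# string when a counter was never incremented. "c1 + 1" mirrors the
# "count +1" of the original vpff line.
read -r vpff vpds4a vpds4b vpdsss <<EOF
$(awk '
$2 >= 18                        { c1++ }
$2 <  18 && $1 >= 2 && $1 <= 5  { c2++ }
$2 <  18 && $1 >  5 && $1 <= 11 { c3++ }
$2 <  18 && $1 >  11            { c4++ }
END { print c1 + 1, c2 + 0, c3 + 0, c4 + 0 }' "$tmp4")
EOF
vpds4=$((vpds4a + vpds4b))

printf 'INFO1: %s INFO2: %s INFO3: %s\n' "$vpff" "$vpds4" "$vpdsss"
rm -f "$tmp4"
```

With the sample rows above this prints "INFO1: 3 INFO2: 2 INFO3: 1". The file is read once instead of four times, and no counter comes back empty.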
jiam912
February 6, 2017, 11:23am
7
Thanks for the advice.
Yes, it is clear that my code is a big mess, but to be honest that is the best I can do for now. We learn daily, and I hope to improve more in the future.
Regards
RudiC
February 6, 2017, 12:15pm
8
Certainly, we're not here to put you in a "digital pillory". On the other hand, wouldn't it be nice and satisfying to see some learning curve?
However, see if this comes close to what you need. It saves 12 awk processes, 5 temp files, 18 file operations, and 7 readings of lengthy files:
awk '
# from the second record on: distance and time difference to the previous record
NR > 1   {DS = int( sqrt ( ($6 - PRV6) ^2 + ($7 - PRV7) ^2) / 1E3 )
          DT = int( ($8 - PRV8) / 1E6 )
         }
# remember the current values for the next record
         {PRV6 = $6
          PRV7 = $7
          PRV8 = $8
         }
# "next" skips the remaining rules, so the tests below only see DT < 18
DT >= 18 {CNTDT++
          next
         }
DS > 11  {CNTSS++
          next
         }
DS >= 2  {CNTDS++
         }
# ++CNTDT mirrors the "count +1" of the original script
END      {SUM = ++CNTDT + CNTSS + CNTDS
          LNS = "\t-------------------------------------------\n"
          FMT = "\t%-15s\t%9d\t%8.1f\n"
          printf LNS "\tCode \t Total-VPs \t Total-PCT\n" LNS
          printf (FMT, "INFO1:", CNTDT, CNTDT / SUM * 100)
          printf (FMT, "INFO2:", CNTDS, CNTDS / SUM * 100)
          printf (FMT, "INFO3:", CNTSS, CNTSS / SUM * 100)
          printf LNS "\tTotal:\t%17d\n" LNS , SUM
         }
' /tmp/tmp1
        -------------------------------------------
        Code     Total-VPs       Total-PCT
        -------------------------------------------
        INFO1:                 85            0.5
        INFO2:              11151           66.7
        INFO3:               5481           32.8
        -------------------------------------------
        Total:              16717
        -------------------------------------------
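To see the "remember the previous record" part of the script in isolation, here is a minimal sketch on made-up data. The four 8-column records are hypothetical (the real input file is not shown in this thread); only fields 6, 7 and 8 matter, and the lowercase variable names are chosen for illustration.

```shell
# Hypothetical records: fields 6 and 7 are coordinates, field 8 is a timestamp.
out=$(printf '%s\n' \
  'a b c d e 0 0 0' \
  'a b c d e 3000 4000 2000000' \
  'a b c d e 3000 16000 25000000' \
  'a b c d e 4000 16000 26000000' |
awk '
# from the second record on, compare with the values saved from the previous one
NR > 1 { ds = int(sqrt(($6 - p6)^2 + ($7 - p7)^2) / 1000)
         dt = int(($8 - p8) / 1000000)
         print ds, dt }
# save the current values for the next record
       { p6 = $6; p7 = $7; p8 = $8 }')

printf '%s\n' "$out"
```

This prints "5 2", "12 23" and "1 1", one pair per line: the same distance and time columns that the original script assembled in tmp4 with several awk calls and paste, here produced in a single pass.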
Dear RudiC,
As always, you are great. Thanks a lot.
I will do my best to improve.
Thanks again for your help.