Improving code

Gents,

I wrote the code below to generate a report. It works fine, but I believe it could be shortened with a better method.

If you can help me produce the same output with improved code, that would be great.

Here is my code.

# get diff in time
awk '{$9=$8-prev8;prev8=$8;print $9/1000000}' tmp1 OFS="\t" |
awk 'NR>1{print}' | awk '{a=int($1); print a}' OFS="\t"   > tmp2

# get distance 
awk '{$9=$6-prev6;prev6=$6;print}' tmp1 OFS="\t" | awk '{$10=$7-prev7;prev7=$7;print}' OFS="\t" | 
awk '{a=(sqrt(($9)^2+($10)^2)); print a/1000}' | awk 'NR>1{print}' | awk '{a=int($1); print a }' OFS="\t"  > tmp3

paste tmp3 tmp2 > tmp4

# filter by conditions
vpff=`awk  '$2>=18 { ++count } END{ print count +1 }' tmp4`
vpds4a=`awk  '$1>=2 && $1<=5 && $2<18 { ++count } END{ print count }' tmp4`
vpds4b=`awk  '$1>5 && $1<=11 && $2<18 { ++count } END{ print count }' tmp4`
vpds4=$(($vpds4a+vpds4b))
vpdsss=`awk  '$1>11 && $2<18 { ++count } END{ print count }' tmp4`

printf "      INFO1:     $vpff \n" > tmp5
printf "      INFO2:     $vpds4\n" >> tmp5
printf "      INFO3:     $vpdsss \n" >> tmp5

# percentage for each line
awk 'FNR==NR{sum+=$2;next}; {printf ("%s %4d %4.1f\n", $1,$2,($2/sum)*100)}' tmp5{,} > tmp6 

# report
awk 'BEGIN{
printf ("\t-------------------------------------------\n")
print ("\tCode \t           Total-VPs \t  Total-PCT")
printf ("\t-------------------------------------------\n")
}
{
sum2 += $2;
printf ("\t%-15s\t%9d\t%8.1f\n",$1,$2,$3)
}
END {
printf ("\t-------------------------------------------\n")
printf ("\tTotal:\t%17d\n",sum2)
printf ("\t-------------------------------------------\n")
}' tmp6

Input file attached.

Thanks and regards..

You already have 317 posts in this forum and I bet you got lots of hints to your questions in the past. What would be the most obvious "bad habit" in this script?


I was about to ask the same as zaxxon did, as I'm surprised that none of the help given in the past seems to have fallen on fertile ground. Until a decent answer is posted, I'll withhold the "one single awk" proposal I came up with.


Gents,

Thanks for your comments. I have learned a lot from your advice. I posted my code to find more alternatives from the experts.

Appreciate your help.


"What would be the most obvious "bad habit" in this script?"

I believe that instead of creating many output files, this can be done with variables.

Yes, true, partly, but: is that all? What did you learn from e.g.

?


I'll cut to the chase and be a bit more generous -- you are reprocessing your input files over, and over, and over, and over, and over, and over, and over, when you could have done so just once or twice.
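As a small illustration of that point (a sketch, not a replacement for the whole script), the three-stage "get diff in time" pipeline can be folded into one awk process. The sample rows below are made up, since the real input is not shown here; only field 8 matters, and it is assumed to be a timestamp in microseconds.

```shell
# Hypothetical sample input: eight fields, field 8 is the time value.
tmp=$(mktemp)
printf '%s\n' \
  'a b c d e 0 0 1000000' \
  'a b c d e 0 0 4500000' \
  'a b c d e 0 0 30000000' > "$tmp"

# Original three-stage pipeline from the first post (three awk processes).
old=$(awk '{$9=$8-prev8;prev8=$8;print $9/1000000}' "$tmp" OFS="\t" |
      awk 'NR>1{print}' | awk '{a=int($1); print a}')

# One awk process: skip the first record, print the truncated difference,
# then remember field 8 for the next record.
new=$(awk 'NR > 1 { print int(($8 - prev8) / 1000000) } { prev8 = $8 }' "$tmp")

rm -f "$tmp"
printf '%s\n' "$new"
```

On this sample both variants give the same two lines, so the extra processes add nothing but overhead.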

Your code is too big a mess to replace wholesale, especially since we know nothing about your input, but 'filter by conditions' is particularly egregious. You are allowed to put more than one statement in an awk program! You could have done 5 times as much work at once. Here is pseudocode.

read vpff vpds4a vpds4b vpds4 vpdsss <<EOF
$( awk '
        awk-range1 { count1++ }
        awk-range2 { count2++ }
        awk-range3 { count3++ }
        awk-range4 { count4++ }
        awk-range5 { count5++ }
        END { print count1+0, count2+0, count3+0, count4+0, count5+0; }' tmp4 )
EOF

The idea is to have awk print a line like "5 7 3 9 12" which gets dumped into read and split among its variables.
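Filling in the placeholders with the conditions from the original 'filter by conditions' block might look like the sketch below. The two-column data is invented to stand in for tmp4 (column 1 distance, column 2 time difference), the two middle ranges are merged because the script only ever uses their sum, and the "+ 1" on the first counter is carried over from the original script as-is.

```shell
# Hypothetical stand-in for tmp4; the values are made up for illustration.
tmp4=$(mktemp)
printf '%s\t%s\n' 3 20  4 5  7 10  12 3  1 2  15 17  20 30  5 1  2 17 > "$tmp4"

# One pass over the file; awk prints one line that read splits into variables.
read vpff vpds4 vpdsss <<EOF
$(awk '
    $2 >= 18                        { c1++ }
    $1 >= 2 && $1 <= 11 && $2 < 18  { c2++ }
    $1 > 11 && $2 < 18              { c3++ }
    END { print c1 + 1, c2 + 0, c3 + 0 }' "$tmp4")
EOF

rm -f "$tmp4"
echo "$vpff $vpds4 $vpdsss"
```

The "+ 0" in the END block makes an unset counter print as 0 instead of an empty string, so read always gets three fields.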


Thanks for the advice.

Yes, it is clear that my code is a big mess, but to be honest that is what I can do for now. We learn daily; I hope to improve more in the future.

Regards

Certainly, we're not here to put you in "digital pillory". On the other hand, wouldn't it be nice and satisfying to see some learning curve?
That said, see if this comes close to what you need. It saves 12 awk processes, 5 temp files, 18 file operations, and 7 readings of lengthy files:

awk '
NR > 1          {DS = int( sqrt ( ($6 - PRV6) ^2 + ($7 - PRV7) ^2) / 1E3 )
                 DT = int( ($8 - PRV8) / 1E6 )
                }

                {PRV6 = $6
                 PRV7 = $7
                 PRV8 = $8
                }

DT >= 18        {CNTDT++
                 next
                }

DS  > 11        {CNTSS++
                 next
                }

DS >=  2        {CNTDS++
                }

END             {SUM = ++CNTDT + CNTSS + CNTDS
                 LNS = "\t-------------------------------------------\n"
                 FMT = "\t%-15s\t%9d\t%8.1f\n"

                 printf LNS "\tCode \t           Total-VPs \t  Total-PCT\n" LNS

                 printf (FMT, "INFO1:", CNTDT, CNTDT / SUM * 100)
                 printf (FMT, "INFO2:", CNTDS, CNTDS / SUM * 100)
                 printf (FMT, "INFO3:", CNTSS, CNTSS / SUM * 100)

                 printf LNS "\tTotal:\t%17d\n" LNS , SUM


                }
' /tmp/tmp1
    -------------------------------------------
    Code                Total-VPs       Total-PCT
    -------------------------------------------
    INFO1:                    85         0.5
    INFO2:                 11151        66.7
    INFO3:                  5481        32.8
    -------------------------------------------
    Total:                16717
    -------------------------------------------
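For anyone who wants to try the counting logic without the attached input, here is a cut-down sketch on made-up records. Only the field positions (6 and 7 as coordinates, 8 as a time in microseconds) follow the thread; the values and the shortened variable names P6, P7, P8 are invented. It shows how each `next` lets the following pattern drop the conditions already ruled out.

```shell
# Invented records: fields 6 and 7 are coordinates, field 8 is the time.
counts=$(printf '%s\n' \
  'r1 - - - - 0 0 0' \
  'r2 - - - - 3000 4000 5000000' \
  'r3 - - - - 3000 4000 30000000' \
  'r4 - - - - 23000 4000 31000000' \
  'r5 - - - - 23500 4000 32000000' \
  'r6 - - - - 23500 10000 33000000' |
awk '
NR > 1   { DS = int(sqrt(($6 - P6)^2 + ($7 - P7)^2) / 1E3)
           DT = int(($8 - P8) / 1E6) }
         { P6 = $6; P7 = $7; P8 = $8 }
DT >= 18 { CNTDT++; next }      # long time gap, counted first
DS >  11 { CNTSS++; next }      # only reached when DT < 18
DS >=  2 { CNTDS++ }            # only reached when DT < 18 and DS <= 11
END      { print CNTDT + 1, CNTDS + 0, CNTSS + 0 }')

echo "$counts"
```

The three numbers are the INFO1, INFO2 and INFO3 counts in that order; record r5 moves less than 2 units in under 18 time units and so lands in none of them.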

Dear RudiC

As always, you are great. Thanks a lot.

I will do my best to improve.

Thanks again for your help