Optimize awk command

WARNING=${1}
CRITICAL=${2}

echo ${OUTPUT} | gawk -F'[=,]' ' {
        V[++n] = $2
        R[n]  = $0
} END {
        for ( i = 1; i <= n; i++) {
                if((V[i] > 0) && (V[i] < V[i+1]))
                        print R[i], ((V[i+1] - V[i]) / V[i]) * 100
                else if ((V[i] > V[i+1]) && (V[i+1] > 0))
                        print R[i], ((V[i+1] - V[i]) / V[i]) * 100
                else if ((V[i] == V[i+1]) && (V[i+1] > 0))
                        print R[i], (V[i+1] - V[i])
                else if ((V[i] == 0) && (V[i+1] > 0))
                        print R[i], (V[i+1] + 1 - V[i] + 1 / 1 * 100)
                else if ((V[i+1] == 0) && (V[i] > 0))
                        print R[i], ((V[i] - V[i+1]) / V[i]) * -100
                else if ((V[i] == 0) && (V[i+1] == 0))
                        print R[i], (V[i] + V[i+1])
                else
                        print R[i]
        }
} ' OFS=, | gawk -F'[=,]' '{if(($4>='"$WARNING"') && ($4<='"$CRITICAL"')) {print $0} }'

so i have the above command which works wonderfully. but i found myself in a situation i didn't expect.

first, i believe the second gawk at the end of the pipeline (the code i bolded above) can be incorporated into the preceding awk calculations. any ideas on how to do that?

second, i need to find a way to show output if and only if the number in field 4 ($4), after the calculations are done, is above two interesting numbers.

so the interesting numbers would be something like:

WARNING=-2
CRITICAL=2

so i want to alert if the number is either greater than or equal to -2 (negative 2) or greater than or equal to 2 (positive 2).

so in this context, for example, -3 should be considered greater than or equal to -2. and 3 of course should be considered greater than or equal to 2 as well.

any ideas on how to modify the above code to do that, efficiently?

Without OUTPUT string it's hard to test but this could replace your two awk scripts:

WARNING=${1}
CRITICAL=${2}

echo "${OUTPUT}" | gawk -F'[=,]' -vCRIT=${CRITICAL} -vWARN=${WARNING} ' {
        V[++n] = $2
        T[n] = $4
        R[n]  = $0
} END {
        for ( i = 1; i <= n; i++) {
            if(T[i] >= WARN && T[i] <= CRIT) {
                if((V[i] > 0) && (V[i] < V[i+1]))
                        print R[i], ((V[i+1] - V[i]) / V[i]) * 100
                else if ((V[i] > V[i+1]) && (V[i+1] > 0))
                        print R[i], ((V[i+1] - V[i]) / V[i]) * 100
                else if ((V[i] == V[i+1]) && (V[i+1] > 0))
                        print R[i], (V[i+1] - V[i])
                else if ((V[i] == 0) && (V[i+1] > 0))
                        print R[i], (V[i+1] + 1 - V[i] + 1 / 1 * 100)
                else if ((V[i+1] == 0) && (V[i] > 0))
                        print R[i], ((V[i] - V[i+1]) / V[i]) * -100
                else if ((V[i] == 0) && (V[i+1] == 0))
                        print R[i], (V[i] + V[i+1])
                else
                        print R[i]
            }
        }
} ' OFS=,

Note:

  • -3 is not greater than -2
  • It's a good idea to quote the ${OUTPUT} string to stop the shell from globbing and word-splitting it.
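
To illustrate the quoting point, here is a small sketch. The sample string and the variable names unquoted and quoted are made up for the demonstration, and it assumes the default IFS:

```shell
# Hypothetical sample with two consecutive spaces between the records.
OUTPUT='a=1,b=2  a=3,b=4'

# Unquoted: the shell splits the value on whitespace and echo rejoins
# the words with single spaces, so the double space is lost.
unquoted=$(echo ${OUTPUT})

# Quoted: the string reaches echo as one argument, unchanged.
quoted=$(echo "${OUTPUT}")

printf '[%s]\n[%s]\n' "$unquoted" "$quoted"
```

The first bracketed line comes out with a single space and the second keeps both, which is the only difference between the two calls.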

Chubler_XL, thank you so much!

the output looks like this:

survey=177,value=85.6701,time=[Sep-1-(14:21:02-1409606462-2014)],epoch=1409606462,avg=7.43836,range=120330--120507,-95.4802  survey=177,value=85.6701,time=[Sep-1-(14:21:02-1409606462-2014)],epoch=1409606462,avg=7.43836,range=120330--120507,-95.4802 survey=177,value=85.6701,time=[Sep-1-(14:21:02-1409606462-2014)],epoch=1409606462,avg=7.43836,range=120330--120507,-95.4802

notice the spaces.

and while i'm aware that -3 is not bigger than -2, what i meant to say was, i want to catch values that are less than or equal to -2. in which case, and in the context you are thinking, would mean numbers like -2, -3, -4, -5, -6, etc.

in addition to what Chubler_XL has suggested, you could change the prints in if else conditions as below

else if ((V[i] == V[i+1]) && (V[i+1] > 0))
        print R[i], (V[i+1] - V[i])
## You could change the print to
print R[i], 0

else if ((V[i] == 0) && (V[i+1] > 0))
        print R[i], (V[i+1] + 1 - V[i] + 1 / 1 * 100)
## here, I am not sure what you wanted to achieve, but this can be written
print R[i], (V[i+1] + 101)

else if ((V[i+1] == 0) && (V[i] > 0))
        print R[i], ((V[i] - V[i+1]) / V[i]) * -100
## here, again
print R[i], -100

else if ((V[i] == 0) && (V[i+1] == 0))
        print R[i], (V[i] + V[i+1])
## here,
print R[i], 0
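
If you want to convince yourself that these reductions are safe, a quick check along these lines should do. It is only a sketch: the value 5 and the names v, w, orig, simp and result are made up, with v standing for the current value and w for the next one.

```shell
result=$(awk 'BEGIN {
        # Case: current value is 0, next value is positive.
        # "/" and "*" bind tighter than "+" and "-", so 1 / 1 * 100 is 100.
        v = 0; w = 5
        orig = w + 1 - v + 1 / 1 * 100
        simp = w + 101
        print orig, simp

        # Case: next value is 0, current value is positive.
        v = 5; w = 0
        orig = ((v - w) / v) * -100
        print orig, -100
}')
printf '%s\n' "$result"
```

Each output line shows the original expression next to its reduced form, and the two columns agree.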

I'm very confused by this thread. Using the above code with OUTPUT set to:

survey=177,value=85.6701,time=[Sep-1-(14:21:02-1409606462-2014)],epoch=1409606462,avg=7.43836,range=120330--120507,-95.4802  survey=177,value=85.6701,time=[Sep-1-(14:21:02-1409606462-2014)],epoch=1409606462,avg=7.43836,range=120330--120507,-95.4802 survey=177,value=85.6701,time=[Sep-1-(14:21:02-1409606462-2014)],epoch=1409606462,avg=7.43836,range=120330--120507,-95.4802

(where the unquoted echo changes the two spaces before the second occurrence of survey= into a single space) feeds a single line into gawk. Even if $OUTPUT expanded to contain multiple lines, the unquoted expansion passed to echo would convert any adjacent sequence of one or more <space>, <tab>, and <newline> characters into single <space> characters in the data passed in to gawk. With only one input record, V[i+1] is never set, so all of the conditions marked in red above will ALWAYS evaluate to false.
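
A quick way to see the single-record effect is to count records both ways. This is just a sketch with a made-up three-line sample (the names one and three are mine), using plain awk and the default IFS:

```shell
# Made-up three-record sample, one record per line.
OUTPUT='survey=1,value=10
survey=2,value=20
survey=3,value=30'

# Unquoted echo: the newlines are lost to word splitting,
# so awk sees a single record.
one=$(echo ${OUTPUT} | awk 'END { print NR }')

# Quoted printf: the newlines survive, so awk sees three records.
three=$(printf '%s\n' "${OUTPUT}" | awk 'END { print NR }')

echo "unquoted: $one record(s), quoted: $three record(s)"
```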

And, you said the above script is working correctly, but you want it to run faster by combining the two invocations of gawk into a single invocation of gawk (or awk), but the description of what you want the code:

if(($4>='"$WARNING"') && ($4<='"$CRITICAL"')) {print $0}

to do doesn't even come close to what this code does. If WARNING is set to -2 and CRITICAL is set to 2, this code will print the single input line to your script if and only if $4 in your input (85.6701 in your sample input) is between -2 and 2. Since 85.6701 is not between -2 and 2, your script will produce no output for your sample input. Borrowing from Chubler_XL and taking SriniShoo's evaluation to the next step, your code could be optimized to something like:

WARNING=${1}
CRITICAL=${2}

echo ${OUTPUT} | gawk -F'[=,]' -v CRIT=${CRITICAL} -v WARN=${WARNING} '
$4 >= WARN && $4 <= CRIT {
        if ($2 > 0)
                print $0, -100
        else if ($2 == 0)
                print $0, 0
        else    print
} ' OFS=,

Or, to match your various English descriptions of how $WARNING and $CRITICAL are to be used, change:

$4 >= WARN && $4 <= CRIT {

to:

$4 <= WARN {

or, maybe:

$4 <= WARN || $4 >= CRIT {
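
As a rough illustration of that last pattern, here is a sketch run against made-up records (the survey numbers, values and the name alerts are invented for the demo; plain awk is used instead of gawk). With -F'[=,]' the fourth field is the number after value=:

```shell
WARN=-2
CRIT=2

alerts=$(printf '%s\n' \
        'survey=1,value=-3' \
        'survey=2,value=-2' \
        'survey=3,value=0' \
        'survey=4,value=2' \
        'survey=5,value=85.6701' |
    awk -F'[=,]' -v WARN="$WARN" -v CRIT="$CRIT" '
        # Alert when the value is at or beyond either threshold;
        # print only the survey number ($2) to keep the output short.
        $4 <= WARN || $4 >= CRIT { print $2 }
    ')
printf '%s\n' "$alerts"
```

Only survey 3, whose value sits strictly between the two thresholds, is left out.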

In addition to the observations made earlier, I think the script could be reduced to something like this, without arrays and enumeration in the END section. With i representing the previous record's $2 value, V(N), and j representing the current record's $2 value, V(N+1):

WARNING=${1}
CRITICAL=${2}

printf "%s\n" "${OUTPUT}" |
awk -F'[=,]' -v WARN=$WARNING -v CRIT=$CRITICAL '
  NR>1 && ( $4<=WARN || $4>=CRIT ) {
    j=$2
    if      (i>0  && i<=j)  print s, (j?1:-1) * 100 * (j - i) / i
    else if (j>0  && i>j)   print s, 100 * (j - i) / i
    else if (i>=0 && i==j)  print s, 0
    else if (i==0 && j>0)   print s, j - i + 101
    else print s
  }
  {
    i=$2
    s=$0
  }
' OFS=,

I think

$4*$4 > 4

would satisfy both ($4 < -2 or $4 > 2) conditions...
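
A small sketch of the squared test, with two caveats: it only works when the two thresholds are symmetric (-N and N), and it needs >= rather than > if values of exactly -2 and 2 should alert as well. The names LIMIT, L and squared are made up for the example:

```shell
LIMIT=2   # hypothetical symmetric threshold: alert at <= -2 or >= 2

squared=$(printf '%s\n' -3 -2 -1.5 0 1.5 2 3 |
    # Squaring removes the sign, so one comparison covers both sides.
    awk -v L="$LIMIT" '$1 * $1 >= L * L { print $1 }')
printf '%s\n' "$squared"
```

With the sample values above, the ones that pass are -3, -2, 2 and 3.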