Search for columns with numbers greater than 70

cokedude · July 31, 2019, 5:38pm

This helped get me started.

This is the command I am using. I am trying to find numbers greater than 70 in column 5. I do not know if it is getting confused because there is a % at the end or because some values in column 5 have only a single digit.

grep dev file | awk '$5 > 70'

This is my data that I get after I run the command. It is mostly correct except for the random 8% and 9%.

server1: /dev/hd4           1.00      0.29   72%    22385    24% /
server2: /dev/hd4           1.00      0.07   93%    25802    54% /
server3: /dev/hd4           2.00      0.55   73%    25416    16% /
server4: /dev/hd4           1.00      0.91    9%     6065     3% /
server5: /dev/hd4           2.00      0.51   75%    29052    20% /
server6: /dev/hd4           1.00      0.92    8%     5521     3% /
server7: /dev/hd4           4.00      3.64    9%     5314     1% /
server8: /dev/hd4           5.00      4.62    8%     5248     1% /

RavinderSingh13 · July 31, 2019, 10:37pm

Hello cokedude,

Could you please try the following (I haven't tested it, though).

awk '/dev/ && $5+0 > 70'  Input_file

Thanks,
R. Singh

MadeInGermany · August 1, 2019, 11:59am

The trailing % sign makes $5 a string, and the > compares strings.
The +0 ensures that the > compares numbers.
Most awk versions will then ignore the trailing % sign.
But a few awk versions convert such strings to zero; they need

awk 'sub(/%$/,"",$5) && $5 > 70'

The sub() should return 1 (true) on a successful substitution. So maybe you can omit a further test like $2~/dev/.
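To make the string-versus-number point concrete, here is a minimal sketch. The two sample rows and the helper names sample, by_string, and by_number are made up for illustration; they are not from the original posts. It copies the field into a variable before stripping the %, which is one possible variation on the commands above rather than the only way to do it.

```shell
# Hypothetical sample rows in the same layout as the thread's data.
sample() {
  printf '%s\n' \
    'server1: /dev/hd4   1.00   0.29   72%   22385   24% /' \
    'server4: /dev/hd4   1.00   0.91    9%    6065    3% /'
}

# String comparison: "9%" sorts after "70" because "9" > "7".
by_string() { awk '$5 > 70 { print $1, $5 }'; }

# Numeric comparison: copy the field, strip the %, then add 0.
by_number() { awk '{ v = $5; sub(/%$/, "", v) } v + 0 > 70 { print $1, $5 }'; }

sample | by_string   # lists both rows
sample | by_number   # lists only the 72% row
```

Working on a copy of $5 leaves the record itself untouched, so a plain print would still show the line with its original spacing.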

rdrtx1 · August 1, 2019, 5:24pm

awk '$5 > 70' FS="[ %] *" infile

Don_Cragun · August 1, 2019, 10:28pm

Hi rdrtx1,
Unfortunately, the above script won't match 80%, 90%, or 100%. One could use:

awk '/dev/ && $5 ~ /([7-9][1-9]|(8|9|10)0)%/' infile

but I find the suggestions provided by RavinderSingh13 and MadeInGermany easier to read.

I don't know of any version of awk written since 1980 that won't accept $5+0 to strip off the trailing percent sign and then perform a numeric comparison (as required by the standards), but if you run into an awk that doesn't handle that correctly, MadeInGermany's approach will handle those cases while Ravinder's approach won't. I don't remember for sure whether the original awk written in the 1970s by Aho, Weinberger, and Kernighan handled this case, but I thought it did.

cokedude · August 2, 2019, 6:02pm

All of the above methods work. Thank you. RavinderSingh13, what does the extra 0 do? rdrtx1 and Don Cragun, can you please explain your methods?

Don_Cragun · August 2, 2019, 10:36pm

There is no extra zero in the code RavinderSingh13 provided. The reason the +0 is present in his code has already been explained by MadeInGermany in post #3 in this thread.

I simply created an extended regular expression that matches the values you want (i.e., those with the values 71% through 100%) in field #5 of your input.

If you can't read an ERE and determine what it matches, I strongly suggest that you follow my advice in post #5 and use one of the suggestions posted by RavinderSingh13 and MadeInGermany.
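For anyone who would rather check the ERE from post #5 than read it, the sketch below compares it against a plain numeric test for every whole percentage from 0% to 100%. The function name ere_check is hypothetical, and the check assumes the field only ever holds whole numbers in that range; the pattern is not anchored, so a value such as 171% would also match.

```shell
# Compare the thread's ERE with a numeric test for 0% through 100%
# and print every value where the two disagree.
ere_check() {
  awk 'BEGIN {
    for (i = 0; i <= 100; i++) {
      s = i "%"
      by_ere = (s ~ /([7-9][1-9]|(8|9|10)0)%/)   # 1 if the ERE matches
      by_num = (i > 70)                          # 1 if the value exceeds 70
      if (by_ere != by_num) print s
    }
  }'
}

ere_check   # prints nothing when both tests agree on every value
```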