Calculate percentage difference between two columns

I have a input text file in this format:

ITEM1 10.9 20.1
ITEM2 11.6 12
ITEM3 14 15.7
ITEM5 20 50.6
ITEM6 25 23.6

I want to print those lines which have more than 5% difference [negative or positive] between second and third columns.

Hello ctrld,

Could you please try following and let me know if this helps you.

awk '{if((Q=100-(($2 * 100)/$3)) > 5 || Q <= -5){print Q, $0}}'  Input_file

Thanks,
R. Singh

1 Like

Appreciate your quick help. I got following output:

awk '{if((Q=100-(($2 * 100)/$3)) > 5 || Q >= -5){print Q, $0}}' input.txt
45.7711 ITEM1 10.9 20.1
3.33333 ITEM2 11.6 12
10.828 ITEM3 14 15.7
60.4743 ITEM5 20 50.6

Seems like a small tweak may be required to eliminate ITEM2 .

Hello ctrld,

As you had mentioned to take 5% in positve and negative both ways so I had kept it, could you please try following and let me know then if this helps you.

awk '{if((Q=100-(($2 * 100)/$3)) > 5){print Q, $0}}'  Input_file

Thanks,
R. Singh

1 Like

This one works perfect.

awk '{if((Q=100-(($2 * 100)/$3)) > 5){print Q, $0}}' input.txt
45.7711 ITEM1 10.9 20.1
10.828 ITEM3 14 15.7
60.4743 ITEM5 20 50.6

Seems like awk is not considering if its negative or positive difference.

This one works perfect.

awk '{if((Q=100-(($2 * 100)/$3)) > 5){print Q, $0}}' input.txt
45.7711 ITEM1 10.9 20.1
10.828 ITEM3 14 15.7
60.4743 ITEM5 20 50.6

Seems like awk is not considering if its negative or positive difference.
Thanks a lot for your quick assistance in this query.

Hello ctrld,

No, awk considers the difference between +5 and -5 , offcourse if you will not put and +ve or -ve sign with any digit as per basic mathematics it will considered as a positive number, you could put -5 there and could see it will calculate from there.

Thanks,
R. Singh

1 Like

Try also (untested)

awk '{sqrt($2*$2/$3*$3)>1.05}' file

Post#8 was - hmmm - UNTESTED; sorry for that and thanks to RavinderSingh13 for pointing that out!

The percentage is dependent on the reference column, i.e. 10 and 10.5 have a 5% diff if referenced to 10, but 4,76% if reffed to 10.5 . Using col 2 as the reference, try

awk 'sqrt((1-$3/$2)*(1-$3/$2)) > 0.05'  file