Subtract column 4 from column 3 & divide it

sam · October 25, 2015, 7:17pm

I have a file called testfile. As per my requirement, I want to subtract column 4 from column 3, divide the result by (1024*1024*1024), and finally print the line only if the value is greater than 1.5.
The formula looks like this: (($3-$4)/(1024*1024*1024))

more testfile
2015-01-19 00:12:32 4227465216 2598752128 333 349 23 60 56811040
2015-01-19 00:13:52 4236378112 2571921240 332 349 11 60 140968528
2015-01-19 00:14:25 4233428992 2454180800 332 349 11 60 3076
2015-01-19 00:15:25 4242997248 2594027248 332 349 14 60 1060
2015-01-19 00:16:25 4239589376 2572173856 332 349 29 60 0

final output looks like this:
2015-01-19 00:12:32 1.5 333 349 23 60 56811040
2015-01-19 00:13:52 1.5 332 349 11 60 140968528
2015-01-19 00:15:25 1.5 332 349 14 60 1060
2015-01-19 00:16:25 1.5 332 349 29 60 0

---------- Post updated at 08:17 AM ---------- Previous update was at 06:50 AM ----------

This is my new code, but I am not getting the expected results:

cat filename | awk '{val=($3-$4)/(1024*1024*1024) val> 1.4 {print $1,"\t" $2,"\t" $3,"\t" $4,"\t" val,"\t"$5,"\t""\t"$6,"\t""\t"$7,"\t""\t" "\t" $8,"\t" $9,"\t";}

Aia · October 25, 2015, 7:40pm

awk '
(($3+0)-($4+0))/(1024*1024*1024) > 1.5 {
    printf "%s %.1f %s\n", $1OFS$2,(($3+0)-($4+0))/(1024*1024*1024), $5OFS$6OFS$7OFS$8OFS$9
}' OFS='\t'  testfile

sam · October 25, 2015, 7:56pm

Thanks Aia, I got output from below code:

cat sam | awk '{val=($3-$4)/(1024*1024*1024)} val > 1.3 {print $1,$2,$3,$4,val,$5,$6,$7,$8,$9;} '

But in the output, column 5 should be printed as below. How can I achieve this?
1.5
1.5
1.6
1.5
1.5

output produced
2015-01-19 00:12:32 4227465216 2598752128 1.51686 333 349 23 60 56811040
2015-01-19 00:13:52 4236378112 2571921240 1.55015 332 349 11 60 140968528
2015-01-19 00:14:25 4233428992 2454180800 1.65705 332 349 11 60 3076
2015-01-19 00:15:25 4242997248 2594027248 1.53572 332 349 14 60 1060
2015-01-19 00:16:25 4239589376 2572173856 1.5529 332 349 29 60 0

Aia · October 25, 2015, 8:06pm

Replace ` print ' with ` printf "%s %s %s %s %.1f %s %s %s %s %s\n", '
You may be able to combine some of those `%s' into one, as I did in post #2.
By the way, cat sam | is not necessary, since awk can read the file sam on its own, saving some CPU cycles.

awk '{val=(($3+0)-($4+0))/(1024*1024*1024)} val > 1.3 {printf "%s %s %s %s %.1f %s %s %s %s %s\n", $1,$2,$3,$4,val,$5,$6,$7,$8,$9;}' sam

jgt · October 25, 2015, 9:15pm

 val> 1.4

You should change this to

val => 1.50

Nowhere in your division do you truncate or round the quotient to 1 decimal place.

sam · October 25, 2015, 10:21pm

Hello Aia, I have produced 2 outputs and compared column 5, and I see some differences.
In the 2nd line of OUTPUT2 it shows 1.6, but I think it should be 1.5, and similarly for the following lines. Can you please clarify?

OUTPUT1
awk '{val=($3-$4)/(1024*1024*1024)} val > 1.3 {print $1,$2,$3,$4,val,$5,$6,$7,$8,$9;} ' file       
2015-01-19 00:12:32 4227465216 2598752128 1.51686 1.51686 333 349 23 60
2015-01-19 00:13:52 4236378112 2571921240 1.55015 1.55015 332 349 11 60
2015-01-19 00:14:25 4233428992 2454180800 1.65705 1.65705 332 349 11 60
2015-01-19 00:15:25 4242997248 2594027248 1.53572 1.53572 332 349 14 60
2015-01-19 00:16:25 4239589376 2572173856 1.5529 1.5529 332 349 29 60

OUTPUT2
awk '{val=(($3+0)-($4+0))/(1024*1024*1024)} val > 1.3 {printf "%s %s %s %s %.1f %s %s %s %s %s\n", $1,$2,$3,$4,val,$5,$6,$7,$8,$9;}' file
2015-01-19 00:12:32 4227465216 2598752128 1.5 1.51686 333 349 23 60
2015-01-19 00:13:52 4236378112 2571921240 1.6 1.55015 332 349 11 60
2015-01-19 00:14:25 4233428992 2454180800 1.7 1.65705 332 349 11 60
2015-01-19 00:15:25 4242997248 2594027248 1.5 1.53572 332 349 14 60
2015-01-19 00:16:25 4239589376 2572173856 1.6 1.5529 332 349 29 60

Also, I want to print only one max value as output (I mean, print the line which has the highest value in column 5), not all the values greater than 1.3. It should check for values greater than 1.3 and, out of those, print the max one.

Don_Cragun · October 25, 2015, 11:39pm

You are correct in noting that using val > 1.4 as a synonym for val >= 1.5 is wrong. But, there are a few issues here.

The awk greater than or equal comparison operator is >= ; not => .
When printing numbers with print or printf with the format specifier %.1f , the printing function will round the value being printed up or down to the closest number representable in that format to the value being printed. (So, the values 1.47 and 1.5491 would print as 1.5 .)
The >= comparison operator compares values without regard to how that value might be rounded by various print format specifiers. (So, the comparison 1.47 >= 1.5 evaluates to false even though both values would print as 1.5 if printed using printf("%.1f\n", value) .)
The double precision floating point value 1.5 happens to have an exact representation as an IEEE floating point value. The decimal values 1.3, 1.6, and 1.7, however, are not exactly representable in that format.
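
To make the difference between comparing and printing concrete, here is a small sketch (the sample values are picked for illustration and are not taken from testfile). It feeds a few numbers to awk and shows, for each one, what %.1f would print and whether the unrounded value passes >= 1.5:

```shell
# Compare the raw value against 1.5, and show how %.1f would print it.
result=$(printf '%s\n' 1.47 1.5 1.5491 1.55015 |
  awk '{
      # $1+0 forces a numeric comparison on the unrounded value
      verdict = (($1 + 0 >= 1.5) ? "true" : "false")
      printf "%s prints as %.1f, >= 1.5 is %s\n", $1, $1, verdict
  }')
printf '%s\n' "$result"
```

The first line should report that 1.47 prints as 1.5 yet fails the comparison, while 1.55015 passes the comparison and prints as 1.6.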

Aia · October 26, 2015, 12:28am

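The output differs because printf with %.1f rounds the floating point value to the nearest tenth instead of truncating it, so 1.55015 is printed as 1.6. If that's not satisfactory, perhaps one of the following more involved snippets, which truncate to one decimal place and print only the line with the highest value, might do what you want.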

awk '
BEGIN{
    max = 0
    Gib = (1024*1024*1024)
}
# Truncate (not round) to one decimal place; returns a string such as "1.5"
function fformat(s){
    match(s, /^[0-9]+\.[0-9]/)
    return substr(s, RSTART, RLENGTH)
}
# val is a string, so val+0 forces a numeric (not string) comparison
(val=fformat((($3+0)-($4+0))/Gib)) && val+0 > 1.3{
    if(val+0 > max+0){
        display = sprintf("%s %s %s %s %s %s %s %s %s %s", $1,$2,$3,$4,val,$5,$6,$7,$8,$9)
        max = val
    }
}
END{
    print display
}' sam.file

or:

awk '
BEGIN{
    max = 0
    Gib = (1024*1024*1024)
}
# Truncate (not round) to one decimal place; returns a string such as "1.5"
function fformat(s){
    match(s, /^[0-9]+\.[0-9]/)
    return substr(s, RSTART, RLENGTH)
}
# No fixed limit here: keep the line with the highest truncated value
(val=fformat((($3+0)-($4+0))/Gib)) && val+0 > max+0{
        display = sprintf("%s %s %s %s %s %s %s %s %s %s", $1,$2,$3,$4,val,$5,$6,$7,$8,$9)
        max = val
}
END{
    print display
}' sam.file
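
As a variation on the snippets above, the truncation could also be done arithmetically with int() rather than with a regular expression. This is only a sketch: the variable names limit, line and best are made up for illustration, and the rows are copied from testfile in the first post. It picks the row with the highest unrounded value above the limit and prints that value truncated to one decimal place.

```shell
# Sample rows copied from testfile in the first post.
data='2015-01-19 00:12:32 4227465216 2598752128 333 349 23 60 56811040
2015-01-19 00:13:52 4236378112 2571921240 332 349 11 60 140968528
2015-01-19 00:14:25 4233428992 2454180800 332 349 11 60 3076
2015-01-19 00:15:25 4242997248 2594027248 332 349 14 60 1060
2015-01-19 00:16:25 4239589376 2572173856 332 349 29 60 0'

best=$(printf '%s\n' "$data" | awk -v limit=1.3 '
BEGIN { max = limit + 0 }            # only values above the limit can win
{ val = ($3 - $4) / (1024*1024*1024) }
val > max {
    max = val
    # int(val*10)/10 drops everything after the first decimal (truncation)
    line = sprintf("%s %s %s %s %.1f %s %s %s %s %s",
                   $1, $2, $3, $4, int(val*10)/10, $5, $6, $7, $8, $9)
}
END { if (line != "") print line }   # print nothing if no row qualified
')
printf '%s\n' "$best"
```

With the sample rows this selects the 00:14:25 line, whose value 1.65705 is shown as 1.6 (truncated) rather than 1.7 (rounded).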

sam · October 26, 2015, 7:40am

Hello Aia, as per my requirement I gave the input parameter (limit) as 1.3, and the output generated is shown below. From that output it should print only the one line that contains the max (i.e. highest) value in col5. To achieve this, it may be better to sort on that particular field (col5) in descending order and use head -n 1. I think this may work; what would you suggest?

OUTPUT2
awk '{val=(($3+0)-($4+0))/(1024*1024*1024)} val > 1.3 {printf "%s %s %s %s %.1f %s %s %s %s %s\n", $1,$2,$3,$4,val,$5,$6,$7,$8,$9;}' file
2015-01-19 00:12:32 4227465216 2598752128  1.51686 333 349 23 60
2015-01-19 00:13:52 4236378112 2571921240  1.55015 332 349 11 60
2015-01-19 00:14:25 4233428992 2454180800  1.65705 332 349 11 60
2015-01-19 00:15:25 4242997248 2594027248  1.53572 332 349 14 60
2015-01-19 00:16:25 4239589376 2572173856  1.5529 332 349 29 60

RudiC · October 26, 2015, 8:02am

Try

awk '{val=(($3+0)-($4+0))/(1024*1024*1024)} val > 1.3 {printf "%s %s %s %s %.1f %s %s %s %s %s\n", $1,$2,$3,$4,val,$5,$6,$7,$8,$9 | "sort -k5,5nr | head -1"}' file
2015-01-19 00:14:25 4233428992 2454180800 1.7 332 349 11 60 3076

(BTW, are you sure the output given in post#9 comes from the awk script in there?)

sam · October 26, 2015, 5:32pm

Hello Rudi, you are right and the code works fine...

awk '{val=(($3+0)-($4+0))/(1024*1024*1024)} val > 1.3 {printf "%s %s %s %s %.1f %s %s %s %s %s\n", $1,$2,$3,$4,val,$5,$6,$7,$8,$9 | "sort -k5,5nr | head -1"}' file
2015-01-19 00:14:25 4233428992 2454180800 1.7 332 349 11 60 3076

In the code I gave the limit for val as 1.3, so values greater than 1.3 will be printed. Now suppose all the values in the file are less than 1.3; then no output will be generated, but in that case I need output like the following to be printed:
ok ok ok ok ok ok ok ok ok ok

To do this, I should first store the output of awk in a variable, then
check whether the variable is empty or not.
If the variable is empty, it should print the output as
ok ok ok ok ok ok ok ok ok ok
else printf "%b\n" "${variable}"
and the output will be like
2015-01-19 00:14:25 4233428992 2454180800 1.7 332 349 11 60 3076

if [ -n "$var" ]; then echo "$var"; else echo "ok ok ok ok ok ok ok ok ok ok"; fi
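
Putting the pieces together, one possible shape for this is sketched below. The function name report, the temporary file, and the second argument for the limit are all hypothetical and not part of the thread; the awk program and the sort | head pipeline follow the ones shown in the earlier posts.

```shell
# report FILE [LIMIT]: print the line with the highest value above LIMIT,
# or a row of "ok" placeholders when no line exceeds it.
report() {
    var=$(awk -v limit="${2:-1.3}" '
        { val = ($3 - $4) / (1024*1024*1024) }
        val > limit {
            printf "%s %s %s %s %.1f %s %s %s %s %s\n",
                   $1, $2, $3, $4, val, $5, $6, $7, $8, $9
        }' "$1" | sort -k5,5nr | head -n 1)
    if [ -n "$var" ]; then
        printf '%s\n' "$var"
    else
        echo "ok ok ok ok ok ok ok ok ok ok"
    fi
}

# Two rows copied from testfile, written to a throwaway file for the demo.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
2015-01-19 00:13:52 4236378112 2571921240 332 349 11 60 140968528
2015-01-19 00:14:25 4233428992 2454180800 332 349 11 60 3076
EOF
high=$(report "$tmp" 1.3)   # both rows exceed 1.3; the larger one is kept
none=$(report "$tmp" 5)     # nothing exceeds 5, so the fallback is printed
rm -f "$tmp"
printf '%s\n%s\n' "$high" "$none"
```

Using printf '%s\n' rather than printf "%b\n" avoids interpreting backslash sequences that might appear in the data; either would work for the rows shown here.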