Comparing all lines in a column with another if a condition is met

amits22 · June 24, 2013, 2:21pm

Sorry for this noob question,
I have a file with 4 columns, like the one below, where columns 2 and 4 contain numbers:

a 55  k 3
b 59 l 3
c 79 m 277
d 255 n 277
e 257 o 267
f 267 p 287
g 290  q 287
h 290 r 287
i 310 s 900

Now I want to select only those rows where the value in column 4 is greater than a value in column 2 by an amount in the range 1 to 30.
So the output should look like:

m 277
n 277
o 267
p 287
q 287
r 287

and those which have a difference of exactly 10:

m 277
n 277
o 267

I will appreciate any help. I hope this can be done with awk.
Thanks

Don_Cragun · June 24, 2013, 3:54pm

amits22:

Sorry for this noob question,
I have file with 4 columns like where columns 2 and 4 have numbers
55  3
a 59 l 3
b 79 m 277
c 255 n 277
d 257 o 267
e 267 p 287
f 290  q 287
g 290 r 287
h 310 s 900
now i want to select only those rows, where values in column 4 are greater than those in column 2 in a range of 1 to 30.
so the out put should be like
m 277
n 277
o 267
p 287
q 287
r 287
and those which exactly have a difference of 10
m 277
n 277
o 267
I will appreciate any help. I hope this can be done with awk.
Thanks

I can't understand what you're trying to do. There are several problems here including:

You said your input has four columns, but the first line of your input only has two columns.
You said you "want to select only those rows, where values in column 4 are greater than those in column 2 in a range of 1 to 30", but in your 1st line showing what the output should be, 79 is not in the range 1 to 30; 277 is not in the range 1 to 30; and (277 - 79) is not in the range 1 to 30.
And, then for the second output file, you said you wanted to select lines from the first output file "which exactly have a difference of 10", but in the first line of output you show, 79 is not in the range 1 to 30; 277 is not in the range 1 to 30; (277 - 79) is not in the range 1 to 30; and (277 - 79) is certainly not 10.

Please give us a clearer statement of your requirements and give us examples that match your requirements!

amits22 · June 24, 2013, 8:05pm

Hi Don, sorry for not being clear.

Sorry, that was a sloppy way of showing sample data :o. It now has 4 columns.
Regarding your points 2 and 3: I need to compare every row of column 4 with every row of column 2; the comparison is not limited to the same row. That is why I printed columns 3 and 4 of the 3rd row: '277' in column 4 differs by an amount in the range 1-30 from the values in column 2 in rows 4, 5, etc.

Please let me know if the question still isn't clear.

Thank you.

Don_Cragun · June 24, 2013, 9:45pm

Here is a simple brute force awk script that I think does what you want:

awk '
FNR == NR {
        for(i = ($2 + 1); i <= ($2 + 30); i++) r1to30[i]
        r10[$2 + 10]
        next
}
$4 in r10 {
        print $3, $4 > "out10"
}
$4 in r1to30 {
        print $3, $4 > "out1-30"
}' file file

It produces two output files: out10 contains columns 3 and 4 of the input file lines where column 4 is 10 greater than some value in column 2 of the input file, and out1-30 contains columns 3 and 4 of the input file lines where column 4 - some value in column 2 is greater than or equal to 1 and less than or equal to 30.
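
For comparison, here is a minimal sketch of the same selection done by direct pairwise comparison instead of precomputed lookup tables. It is not the script above, just an alternative under a few assumptions: the file name sample.txt and the variable names col2, hit10 and hit30 are made up for illustration, and the sample data is the 9-line file from the first post. It remembers each distinct column 2 value on the first pass, then on the second pass checks column 4 of every line against every remembered value.

```shell
#!/bin/sh
# Sample data from the first post (sample.txt is a hypothetical file name).
cat > sample.txt <<'EOF'
a 55 k 3
b 59 l 3
c 79 m 277
d 255 n 277
e 257 o 267
f 267 p 287
g 290 q 287
h 290 r 287
i 310 s 900
EOF

awk '
# Pass 1: remember each distinct value found in column 2.
FNR == NR { col2[$2]; next }
# Pass 2: compare column 4 of this line against every remembered value.
{
        hit10 = hit30 = 0
        for (v in col2) {
                d = $4 - v
                if (d == 10) hit10 = 1
                if (d >= 1 && d <= 30) hit30 = 1
        }
        if (hit10) print $3, $4 > "out10"
        if (hit30) print $3, $4 > "out1-30"
}' sample.txt sample.txt

echo "difference of exactly 10:"
cat out10
echo "difference of 1 to 30:"
cat out1-30
```

With the sample data this should write the same two lists requested in the first post. The trade-off is that each line of the second pass loops over all distinct column 2 values, while the lookup-table version does a single array membership test per line but stores up to 30 entries per input line.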

amits22 · June 25, 2013, 4:08am

Don_Cragun:

Here is a simple brute force awk script that I think does what you want:
awk '
FNR == NR {
   for(i = ($2 + 1); i <= ($2 + 30); i++) r1to30[i]
   r10[$2 + 10]
   next
}
$4 in r10 {
   print $3, $4 > "out10"
}
$4 in r1to30 {
   print $3, $4 > "out1-30"
}' file file
It produces two output files: out10 contains columns 3 and 4 of the input file lines where column 4 is 10 greater than some value in column 2 of the input file, and out1-30 contains columns 3 and 4 of the input file lines where column 4 - some value in column 2 is greater than or equal to 1 and less than or equal to 30.