Hi, I'm trying to use awk arrays to compare values across two files based on multiple columns. I've attempted to load file 2 into an array and compare with values in file 1, but success has been absent. If anyone has any suggestions (and I'm not even sure if my script so far is on the right lines) it would be very much appreciated.
file1
A 1 10 20
B 1 35 50
C 2 40 50
D 2 65 100
E 3 10 30
F 3 20 40
G 4 25 50
H 4 45 70
file2
ID1 11 16 1
ID2 75 100 1
ID3 45 47 2
ID4 15 30 3
ID5 40 45 4
ID6 55 65 4
Essentially, if column 2 in file 1, and column 4 in file 2 match, continue.
Then, if column 2 in file 2 is >= column 3 in file 1
and column 3 in file 2 is <= column 4 in file 1
print out matching line of file 1, and column 1,2 and 3 from file2.
Desired output
A 1 10 20 ID1 11 16
C 2 40 50 ID3 45 47
E 3 10 30 ID4 15 30
G 4 25 50 ID5 40 45
H 4 45 70 ID6 55 65
Using a pseudo script based on similar problems online, I've got:
awk 'FNR == NR
{
f2[$0]++
next
}
{
for (i in f2)
{
split(i,f2_split)
if ((f2_split[4] == $2) && (f2_split[2] >= $3) && (f2_split[3] <= $4))
{print $0, f2_split[1],f2_split[2],f2_split[3]
}
}
}' file2 file1
This does a fantastic job of printing out the contents of file 2. Alas it's not what I was after. Any help would be much appreciated.