Interval merging

I do have a file (file1) with a specified interval of 500 counts

file1

0	500
500	1000
1000	1500
1500	2000
2000	2500
2500	3000
3000	3500
3500	4000
4000	4500
4500	5000
5000	5500
5500	6000
6000	6500
6500	7000
7000	7500
7500	8000
8000	8500
8500	9000
9000	9500
9500	10000
...
.....

I have another file (file2) with 3 columns, where the interval between columns 1 and 2 is 100

550 600 7
1100	1200	  8
2100 2200 5
3500 3600 8
9600 9700 7

I want to merge file1 and file2 , so that the resultant file (file3) will be merged within the interval of file 1.

File3

500	1000 550 600 7
1000	1500 1100	1200	  8
2000	2500 2100 2200 5
3500	4000 3500 3600 8
9000	9500 9600 9700 7

It would be great if I could get some help in this interval matching problem either using awk or sed. The real files have 1000's of lines.

awk 'NR == FNR {a[FNR] = $0; x=FNR; next}
  {b[FNR] = $0; y=FNR; next}
  END {m = (x > y) ? x : y;
    for(i = 1; i <= m; i++) print a, b}' file1 file2 > file3

Another awk approach:

awk '
        NR == FNR {
                A[++c] = $1 OFS $2 OFS $3
                next
        }
        {
                for ( i = 1; i <= c; i++ )
                {
                        n = split ( A, T )
                        if ( T[1] >= $1 && T[2] <= $2 )
                                print $0, A
                }
        }
' OFS='\t' file2 file1
1 Like

Does file1 have anything in it other than the ranges? (If not, it isn't needed for this project.)

Can there be more than 1 line in file2 that fits into the same 500 count interval? If there are two input lines in the same interval, are two output lines produced for that interval or are the two lines from file2 merged onto a single line in file3? (If they are merged, is there any limit to the number of lines that should be merged into a single output line?)

Is file2 always sorted by increasing numeric value in the 1st and 2nd fields (as in your example)?
--------------
Update...
Can we assume that the last line of your sample output should have been:

9500	10000 9600 9700 7

rather than:

9000	9500 9600 9700 7