Matching and extract data from a file

Gents,

Matching columns 1-19 in file1 and 20-38 in file 2, I would like to extract the data in the same order of file2.

file1

X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001 
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001 
X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001 
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001 

file2

44137.00  48629.001
44149.00  48617.002
44137.00  48425.001
44137.00  48641.001

output desired

X  7494     11511  44137.00  48629.001    1  4321  44148.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  433  8641  44160.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  865 12961  44172.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1297 17281  44184.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1729 21601  44196.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 2161 25921  44208.00  48198.00  49060.001
X  7494     11611  44149.00  48617.002    1  4321  44148.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002  433  8641  44160.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002  865 12961  44172.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002 1297 17281  44184.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002 1729 21601  44196.00  47994.00  48856.001
X  7494     11611  44149.00  48617.002 2161 25921  44208.00  47994.00  48856.001
X  7494     11711  44137.00  48425.001    1  4321  44148.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001  433  8641  44160.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001  865 12961  44172.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001 1297 17281  44184.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001 1729 21601  44196.00  48210.00  49072.001
X  7494     11711  44137.00  48425.001 2161 25921  44208.00  48210.00  49072.001
X  7494     11811  44137.00  48641.001    1  4321  44148.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001  433  8641  44160.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001  865 12961  44172.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001 1297 17281  44184.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001 1729 21601  44196.00  48006.00  48868.001
X  7494     11811  44137.00  48641.001 2161 25921  44208.00  48006.00  48868.001

Thanks for your help.

Assuming with "columns" you mean character positions, and that you swapped those in file 1 and 2, try

awk '
NR==FNR {IX=substr($0,20,19)
         T[IX] = T[IX] DL[IX] $0
         DL[IX] = "\n"
         next
        }
        {print T[$0]
        }
' file1 file2
1 Like

Dear RudiC
I dont get the disared
Should i need to change something in file2?

I'm not sure what "disared" means.

And the output you showed in post #1 in this thread does not match your requirements where you said "I would like to extract the data in the same order of file2."

Note that the 1st line in your sample file2 is:

44137.00  48629.001

and the lines in your sample file1 that contain those values in character positions 20 through 38 are:

X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001 

Which matches the 1st six lines of output produced RudiC's script (including copying the trailing space characters that are present in file1 :

X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001 
X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001 
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001 

and that the 1st six lines of output you said you desired:

X  7494     11511  44137.00  48629.001    1  4321  44148.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  433  8641  44160.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001  865 12961  44172.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1297 17281  44184.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 1729 21601  44196.00  48198.00  49060.001
X  7494     11511  44137.00  48629.001 2161 25921  44208.00  48198.00  49060.001

do not contain any trailing spaces, and, even if they did, these lines do not appear anywhere in file1 in your sample data.

RudiC's code followed your stated requirements but did not produce the output you said you wanted to produce. So, are your requirements wrong, or is your sample output wrong?

1 Like

Do you want to extract lines from file1 (as specified) OR do you want to modify them?

Dear RudiC and Don

I wan to extract the data from file1, sorted as reference file2 .. Please I need the output like this.

X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001
X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001

The previous output was wrong.

Sorry for the inconvenience and thanks for your help

Hi, try:

awk 'NR==FNR{A[$4,$5]=A[$4,$5] $0 ORS; next} {printf "%s",A[$1,$2]}' file1 file2
1 Like

So - is the problem solved, now?

Scrutinizer

Thanks for your code, I try it but i don't get any output :frowning:

Could you copy and paste the samples that you posted in this thread back into the two files and see if it works with those?

I get:

X  7494     11711  44137.00  48629.001    1  4321  44148.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  433  8641  44160.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001  865 12961  44172.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1297 17281  44184.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 1729 21601  44196.00  48210.00  49072.001 
X  7494     11711  44137.00  48629.001 2161 25921  44208.00  48210.00  49072.001 
X  7494     11511  44149.00  48617.002    1  4321  44148.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  433  8641  44160.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002  865 12961  44172.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1297 17281  44184.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 1729 21601  44196.00  48198.00  49060.001 
X  7494     11511  44149.00  48617.002 2161 25921  44208.00  48198.00  49060.001 
X  7494     11811  44137.00  48425.001    1  4321  44148.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  433  8641  44160.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001  865 12961  44172.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1297 17281  44184.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 1729 21601  44196.00  48006.00  48868.001 
X  7494     11811  44137.00  48425.001 2161 25921  44208.00  48006.00  48868.001
X  7494     11611  44137.00  48641.001    1  4321  44148.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  433  8641  44160.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001  865 12961  44172.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1297 17281  44184.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 1729 21601  44196.00  47994.00  48856.001 
X  7494     11611  44137.00  48641.001 2161 25921  44208.00  47994.00  48856.001 

And you?

Scrutinizer

It works fine :)...

Was a problem to convert the files to unix (dos2unix).

Thanks a lot for your support