I am looking to join two files where column 1 of file A matches with column 1 of file B and column 5 of files A matches with column 2 of file B. After joining the files based on above condition, out should contain entire line of file A and column 3, 4 and 5 of file B.
Here is sample data:
File A
7507345|NIP1|0.0.0.2|127369256|17561|Closed
3323077|NIP1|0.0.0.1|142830388|34727|Closed
3323077|NIP4|0.0.0.1|773455302|386196|Closed
8171620|NIP8|0.0.0.1|176123103157225888|Closed
File B
7507345|17561|ED|BILL|0
3323077|34727|EF|LAST_BILL_T|0
I have gone through the thread "Join two files combining multiple columns and produce mix and match output* but still not clear on this.
Did you apply that command totally unmodified? After specifying in post#1 that you want to match fields 1 and 5 of fileA with fields 1 and 2 of fileB, you are using fields 2 and 3 of fileA and fields 1 and 4 of fileB. Do you see a chance this should work? On top, the field separator of your files seems to be | , but the script uses , .
A few hints on that script: It first reads fileA and tries to assign a string composition of FILENAME and fields 1 and 3 to array A indexed by fields 2 and 3 (which fails as the fields are NOT separated due to wrong FS) and similar to array B. Then, reading fileB, it tries to match fields 1 and 4 with the indices of A (which would fail even if the FS were correct), and, on positive match, prints an even stranger composition of array elements, FILENAMEs, and fields. Like
I modified the script as per my requirement but it failed to give output. I did not understand the part "tries to assign a string composition of FILENAME and fields 1 and 3 to array A indexed by by fields 2 and 3". Here is my date and requirement:
File A
7507345|NIP1|0.0.0.2|127369256|17561|Closed
3323077|NIP1|0.0.0.1|142830388|34727|Closed
3323077|NIP4|0.0.0.1|773455302|386196|Closed
8171620|NIP8|0.0.0.1|176123103157225888|Closed
File B
7507345|17561|ED|BILL|0
3323077|34727|EF|LAST_BILL_T|0
Expected Output:
7507345|NIP1|0.0.0.2|127369256|17561|ED|BILL|0
3323077|NIP1|0.0.0.1|142830388|34727|EF|LAST_BILL_T|0
why not use the join command to join files? The problem is it only can join on 1 field in each file (default the first field in each file). But after the first join it gets much easier.
Hi angshuman,
Obviously what you want to do could be done with just awk instead of using both join and awk . But if you're unwilling to show us the modified awk code that you were using, we can't show you where you went wrong.
Please help us help you. Show us your code (in CODE tags).