Match and print based on columns

Hi,

I have 2 different questions in this thread.
Consider 2 files as input (the input files have different line counts).
File 1

1 1 625 56
1 12 657 34 
1 9 25 45 1
2 20 54 67
3 25 35 27
4 45 73 36
5 125 56 45

File 2

1 1 878 76
1 9 83 67
2 20 73 78
4 47 22 17
3 25 67 99

1st question: Lines should be paired when column 2 of file1 equals column 2 of file2, wherever the matching line sits in file2. Lines whose column 2 has no match anywhere in the other file come afterwards. The output should be:

1 1 625 56 1 1 878 76
1 9 25 45 1 9 83 67
2 20 54 67 2 20 73 78
3 25 35 27 3 25 67 99
1 12 657 34 4 47 22 17
4 45 73 36 - - - - 
5 125 56 45 - - - -

So the first 4 lines are the ones where column 2 matches, and the last 3 lines are the ones where it doesn't. (In general, I want all the matching lines printed first, followed by the non-matching ones.)

2nd question: When matching on column 2 of file1 and file2, the output should look like this, showing both the lines that match and the ones that don't:

1 1 625 56 1 1 878 76
1 12 657 34 - - - -
1 9 25 45 1 9 83 67
2 20 54 67 2 20 73 78
3 25 35 27 3 25 67 99
4 45 73 36 - - - -
- - - - 4 47 22 17
5 125 56 45 - - - -

Thanks in advance!!!

Please don't use ICODE tags, use CODE tags instead.

How do you match the records in the third-last line, i.e. 1 12 657 34 4 47 22 17? Why should we select that record from file2?

Hi RudiC,

I have edited the tags.

The 3rd last line is printed like that because by then nothing else matches between file 1 and file 2, so the unmatched lines are printed side by side.
If there were something else left in file 2 that didn't match (a hypothetical example), the 2nd last line would have been printed like this:

4 45 73 36 10 67 89 90

For your second problem, how about this:

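awk     'NR==FNR        {T[$2]=$0; next}                        # file2 is read first: index its lines by column 2
         $2 in T        {print $0, T[$2]; delete T[$2]; next}   # match: print both lines, drop the used entry
                        {print $0, "- - - -"}                   # file1 line without a partner
         END            {for (i in T) print "- - - -", T[i]}    # leftover file2 lines (order not guaranteed)
        ' file2 file1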
1 1 625 56 1 1 878 76
1 12 657 34 - - - -
1 9 25 45 1 1 9 83 67
2 20 54 67 2 20 73 78
3 25 35 27 3 25 67 99
4 45 73 36 - - - -
5 125 56 45 - - - -
- - - - 1 15 56 57
- - - - 4 47 22 17

You're missing the 1 15 56 57 line from file2 in your sample output. Or I'm missing your problem...


Sorry, I missed the 1 15 56 57 line, so I have now taken it out of the thread.

---------- Post updated at 12:56 PM ---------- Previous update was at 11:53 AM ----------

Thanks RudiC,

The code for the second problem works perfectly.
I will give the 1st problem another try.

Vishal
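
For the first problem, a minimal sketch along the same lines might look like the following. It is not from the thread: the function name pair_by_col2 is just an illustrative name, and it assumes that column 2 is unique within file2, that file2 is not empty, and that the "- - - -" placeholder (4 columns) is what you want for padding. Matching lines are printed right away in file1 order; the unmatched lines of both files are held back and paired side by side at the end.

```shell
# pair_by_col2 FILE1 FILE2   (hypothetical helper name, not from the thread)
pair_by_col2() {
    awk '
    NR == FNR {                          # first file read by awk: file2
        if (!($2 in T)) order[++n] = $2  # remember file2 keys in input order
        T[$2] = $0
        next
    }
    $2 in T { print $0, T[$2]; delete T[$2]; next }  # match: print right away
            { U[++m] = $0 }                          # unmatched file1 line: hold back
    END {
        j = 0
        for (k = 1; k <= n; k++)         # collect unmatched file2 lines, in order
            if (order[k] in T) V[++j] = T[order[k]]
        max = (m > j) ? m : j
        for (k = 1; k <= max; k++) {     # pair leftovers side by side, pad with dashes
            left  = (k <= m) ? U[k] : "- - - -"
            right = (k <= j) ? V[k] : "- - - -"
            print left, right
        }
    }' "$2" "$1"
}
```

Call it as pair_by_col2 file1 file2. With the sample files from the first post, the unmatched 1 12 657 34 line ends up next to 4 47 22 17, and the remaining file1 lines are padded with dashes.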