Matching tables

cowa · December 23, 2019, 9:31am

Hello I have two tables like this

Table1

Ab01 Ac01 Bd01 Xc01
Ab02 Ac02 Bd02 Xc02
Ab03 Ac03 Bd03 Xc03
Ab04 Ac04 Bd04 Xc04
Ab05 Ac05 Bd05 Xc05

and Table 2 like this

Group_01.1 Bd03
Group_01.1 Ac03
Group_01.1 Xc03
Group_01.1 Ab03
Group_02.1 Ac02
Group_02.1 Ab05
Group_02.1 Bd03
Group_02.1 Xc04
Group_03.2 Bd04
Group_03.2 Ab05
Group_03.2 Bd03
Group_03.2 Ac02

I needed to print only the group which has all elements in table1 rows like this

Group_01.1 Bd03
Group_01.1 Ac03
Group_01.1 Xc03
Group_01.1 Ab03

grep and sort wont work here

RavinderSingh13 · December 23, 2019, 9:53am

Hello cowa,

Welcome to forums, special thanks for using CODE TAGS for samples.

On UNIX.com we encourage users to show their efforts in their questions which they have put in order to solve their own problems, so kindly do add the same and let us know then.

Thanks,
R. Singh

cowa · December 24, 2019, 3:33am

I was using transposes the rows into columns, then data.frame creates a list of the columns to match the data match(data.frame(t(x)), data.frame(t(y)))

nezabudka · December 24, 2019, 4:07am

Hi,
if only one row has matches

awk '
NR == FNR               { if (!T[$2]) T[$2] = $1; next }
                        { lin = 0; for (i=1; i<=NF; i++)
                                if ($i in T) lin++
                        }
lin == NF               { for (i=1; i<=NF; i++) print T[$1] FS $i }
' Table2 Table1

otherwise

awk '
NR == FNR               { if (!T[$2]) T[$2] = $1; next }
                        { lin = 0; for (i=1; i<=NF; i++)
                                if ($i in T) lin++ 
                        }
lin == NF               { for (i=1; i<=NF; i++) D[$i] = T[$i] }
END                     { for (i in D) print D FS i }
' Table2 Table1

cowa · December 24, 2019, 4:42am

This one worked perfectly

nezabudka · December 24, 2019, 2:48pm

maybe this is very redundant for such task,
but maybe this will anticipate possible problems with more complex input
This instance with integrated version control requires gnu awk

awk '
NR == FNR       { T[$2] = $1 (T[$2]?FS T[$2]:""); next }
                { lin = 0; for (i=1; i<=NF; i++)
                        if ($i in T) lin++ 
                }
lin == NF       { for (i=1; i<=NF; i++) {
                        if (T[$i]) print $i FS gensub(/^.*\s/, "", 1, T[$i])
                        sub(/\s?[^ ]+$/, "", T[$i])
                        }
                }
' Table2 Table1

RudiC · December 27, 2019, 3:36pm

Try also

awk '
NR==FNR {T[$2] = $1
         next
        }
        {L   = 1
         OUT = ""
         for (i=1; i<=4; i++)   {L = L * ($i in T)
                                 OUT = OUT T[$i] OFS $i ORS
                                }
         if (L) print OUT
        }
' file2 file1

or

sort file2 |  awk '
NR == FNR       {T[$0]
                 next
                }
                {OUT = OUT NRS $0
                 IX  = IX NFS $2
                 NFS = OFS
                 NRS = ORS
                }
!(FNR%4)        {if (IX in T) print OUT
                 OUT = IX = NFS = ""
                }
' file1 -