Hi,
I have 25 groups and need to perform all possible pairwise comparisons between them. The number of comparisons is n(n-1)/2, so in my case 25(25-1)/2 = 300 comparisons.
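To make the count concrete, here is a minimal sketch in Python (not awk) that enumerates each unordered pair exactly once. The labels FG1 to FG25 are placeholders for the 25 groups; only the naming pattern comes from the data.

```python
from itertools import combinations

# Placeholder labels FG1..FG25 standing in for the 25 groups.
groups = [f"FG{i}" for i in range(1, 26)]

# combinations() yields each unordered pair once, so ("FG1", "FG2")
# is produced but ("FG2", "FG1") is not.
pairs = list(combinations(groups, 2))

n = len(groups)
print(len(pairs), n * (n - 1) // 2)  # both values are 300
```

Enumerating unordered pairs this way gives the n(n-1)/2 comparisons with no mirrored duplicates.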
My 25 groups are columns like these (the first five, FG1 to FG5, are shown):
FG1 FG2 FG3 FG4 FG5
NT5E CD44 CD44 CD44 AXL
ADAM19 CCDC80 L1CAM L1CAM CD44
AXL COL1A1 ADM RND3 FOSL1
CD44 COL3A1 COL1A1 BMP1 SP100
CD68 COL6A1 COL3A1 COL3A1 ACSL1
FXYD5 COL6A2 COL18A1 COL6A1 ADM
GLIPR1 COL6A3 CTGF COL6A2 A2M
GM2A COL18A1 EPAS1 COL6A3 COL1A1
L1CAM CTGF FOXC1 COL18A1 COL3A1
ADM FBN1 GAP43 CTGF COL6A2
A2M FOXC1 HMOX1 ITGA4 DUSP4
ALPK2 LOX HBEGF ICAM1 FHL2
ANGPTL2 MGP ITGA4 IL32 GAL
BMP1 MMP2 ICAM1 LOXL2 HMOX1
CALCRL POSTN IL1B MGP IL1B
CTSL1 PDGFRA IL6R POSTN IL6R
CXCL14 SPARC IL8 PCDH12 LOX
C18orf1 SPOCK1 LOX SORBS3 MGP
CCDC80 THSD4 MMP2 SPOCK1 PGF
COL1A1 TFPI2 PGF TGFB1I1 PDGFRA
COL3A1 TGFB2 PLAU TGFBI PTGS2
COL6A1 TGFBI PTGS2 TNFRSF12A RCAN1
COL6A2 VCAN SPOCK1 VCAN TGFB2
COL6A3 TGFB2
COL18A1 TNFRSF12A
CSF1 UNC5B
CTGF UNC5C
CYBRD1 ETS1
DKK1 VCAN
DKK3
EMP1
FBN1
For the first comparison, FG1 versus FG2 (each group is a column), the result should be 10/23. Here 10 is the number of genes common to the two groups, and 23 is the size of the smaller of the two lists: FG1 has 32 entries and FG2 has 23, so the minimum is 23. I need this ratio for all 300 comparisons. I tried doing this in R, but I get duplicate pairs (both FG1 vs FG2 and FG2 vs FG1). Is there a simple way to do this in awk?
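To pin down the expected result, here is a rough sketch of the calculation in Python rather than awk. It assumes each group has already been read into its own list of gene names and that no gene is repeated within a group. The function name overlap_score is made up for illustration, and only FG1 and FG2 from the table above are filled in.

```python
from itertools import combinations

def overlap_score(a, b):
    """Return (shared gene count, size of the smaller group)."""
    a, b = set(a), set(b)
    return len(a & b), min(len(a), len(b))

# FG1 and FG2 copied from the table above; the other groups would be
# added to the dictionary in the same way.
groups = {
    "FG1": """NT5E ADAM19 AXL CD44 CD68 FXYD5 GLIPR1 GM2A L1CAM ADM A2M
              ALPK2 ANGPTL2 BMP1 CALCRL CTSL1 CXCL14 C18orf1 CCDC80 COL1A1
              COL3A1 COL6A1 COL6A2 COL6A3 COL18A1 CSF1 CTGF CYBRD1 DKK1
              DKK3 EMP1 FBN1""".split(),
    "FG2": """CD44 CCDC80 COL1A1 COL3A1 COL6A1 COL6A2 COL6A3 COL18A1 CTGF
              FBN1 FOXC1 LOX MGP MMP2 POSTN PDGFRA SPARC SPOCK1 THSD4 TFPI2
              TGFB2 TGFBI VCAN""".split(),
}

# Each unordered pair is visited once, so there is no FG2 vs FG1 row.
for (name_a, genes_a), (name_b, genes_b) in combinations(groups.items(), 2):
    common, smaller = overlap_score(genes_a, genes_b)
    print(f"{name_a} vs {name_b}: {common}/{smaller} = {common / smaller:.3f}")
```

With only these two groups loaded, this prints a single line, FG1 vs FG2: 10/23 = 0.435, which matches the hand calculation. With all 25 groups in the dictionary the same loop would produce the 300 rows.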
Thanks,