In the awk below, the output is space-delimited, but it should be tab-delimited. Did I not set -F and OFS correctly? Thank you :).
The input files are rather large, so I did not include them, but they are tab-delimited as well.
awk
awk -F'\t' -v OFS='\t' 'FNR==1 { next }
FNR == NR { file1[$2,$4,$5] = $2 " " $4 " " $5 }
FNR != NR { file2[$2,$4,$5] = $2 " " $4 " " $5 }
END { print "Match:"; for (k in file1) if (k in file2) print file1[k] # Or file2[k]
print "Missing in Reference but found in IDP:"; for (k in file2) if (!(k in file1)) print file2[k]
print "Missing in IDP but found in Reference:"; for (k in file1) if (!(k in file2)) print file1[k]
}' file1 file2 > out
current output (all in one field $1)
Match:
68521889 C T
167099158 A G
Missing in Reference but found in IDP:
93521604 A G
166903445 T C
Missing in IDP but found in Reference:
166210776 C T
147183143 A G
desired output (in 3 fields $1 $2 $3)
$1 $2 $3
Match:
68521889 C T
167099158 A G
Missing in Reference but found in IDP:
93521604 A G
166903445 T C
Missing in IDP but found in Reference:
166210776 C T
147183143 A G
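
A possible explanation, offered as a sketch rather than a definitive answer: -F and OFS are set correctly, but awk only inserts OFS between comma-separated arguments to print (or when it rebuilds $0 after a field is assigned). The stored values in the script above are built by concatenating with a literal " ", so the spaces are hardcoded into each string and OFS never gets a chance to apply. Concatenating with OFS instead should give three tab-separated fields. The two sample files below are hypothetical stand-ins (made-up header names and rs IDs, with the key in columns 2, 4 and 5), since the real inputs were not posted.

```shell
# Hypothetical sample data: tab-delimited, one header line each,
# key fields in columns 2, 4 and 5 (column names and IDs are made up).
tmp=$(mktemp -d)
printf 'chr\tpos\tid\tref\talt\n1\t68521889\trs1\tC\tT\n1\t166210776\trs2\tC\tT\n' > "$tmp/file1"
printf 'chr\tpos\tid\tref\talt\n1\t68521889\trs1\tC\tT\n1\t93521604\trs3\tA\tG\n' > "$tmp/file2"

awk -F'\t' -v OFS='\t' '
FNR == 1  { next }                                      # skip the header of each file
FNR == NR { file1[$2,$4,$5] = $2 OFS $4 OFS $5; next }  # first file: join with OFS, not " "
          { file2[$2,$4,$5] = $2 OFS $4 OFS $5 }        # second file
END {
    print "Match:"
    for (k in file1) if (k in file2) print file1[k]
    print "Missing in Reference but found in IDP:"
    for (k in file2) if (!(k in file1)) print file2[k]
    print "Missing in IDP but found in Reference:"
    for (k in file1) if (!(k in file2)) print file1[k]
}' "$tmp/file1" "$tmp/file2" > "$tmp/out"

cat "$tmp/out"
```

The section titles still print as a single field, which matches the desired output; only the data lines become three tab-separated fields. Note that for (k in array) visits keys in an unspecified order, so the lines within each section may not follow the input order.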