I am trying to use awk to print lines that satisfy either of the two conditions below:
condition 1: $2 equals CNV and, after splitting $3, the value in red (the number after 5%:) is greater than or equal to 4. This is a[1], or so I think.
condition 2: $2 equals CNV and, after splitting $3, the value in red (a[1], or so I think) is less than or equal to 1.0, the value in green (the number after 95%:, which is a[3], or so I think) is less than or equal to 1.9, and $4 matches a line in list.
I have added comments to the code describing what I think is happening. The code executes, but currently all the CNV lines are printed. Thank you :).
file
chr1:11184539 CNV 5%:5.5,95%:2.68 Name
chr1:11184539 REF
chr1:11184539 SNV A
chr1:11184555 CNV 5%:0.9,95%:1.9 BRCA1
chr1:11184539 FUSION
chr1:11184539 INDEL G
chr1:11184555 CNV 5%:2.5,95%:2.68 Name2
chr1:11184555 CNV 5%:1.1,95%:1.8 BRCA2
list
BRCA1
BRCA2
awk
awk -F'\t' '{split($3,a,":,")} $2=="CNV" && a[1]>=4.0' file # capture condition 1: split $3 on : and , then check that $2 is CNV and a[1] >= 4.0
awk -F'\t' '{split($3,a,":,")} $2=="CNV" && a[1]<=1.0 && a[3]<=1.9 && NR==FNR{c[$1]++;next};c[$1] > 0' file list # capture condition 2: split $3 on : and , then check that $2 is CNV, a[1] <= 1.0, a[3] <= 1.9, and $4 matches $1 in list
desired output
chr1:11184539 CNV 5%:5.5,95%:2.68 Name
chr1:11184555 CNV 5%:0.9,95%:1.9 BRCA1
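One likely reason every CNV line is printed: the separator ":," in split() is treated as a regular expression matching a colon immediately followed by a comma, which never occurs in $3. So a[1] holds the whole field (for example 5%:5.5,95%:2.68), and a[1]>=4.0 falls back to a string comparison that is true for anything starting with 5. Splitting on the character class [:,] instead gives a[1]="5%", a[2]=red value, a[3]="95%", a[4]=green value. Below is a minimal sketch of a single command that covers both conditions. It assumes the real file is tab-separated (as the -F'\t' suggests) and that list is non-empty; the scratch directory and the names genes, lo and hi are only for illustration.

```shell
# Recreate the sample inputs in a scratch directory (assumption: the real file is tab-separated).
dir=$(mktemp -d)
tr ' ' '\t' > "$dir/file" <<'EOF'
chr1:11184539 CNV 5%:5.5,95%:2.68 Name
chr1:11184539 REF
chr1:11184539 SNV A
chr1:11184555 CNV 5%:0.9,95%:1.9 BRCA1
chr1:11184539 FUSION
chr1:11184539 INDEL G
chr1:11184555 CNV 5%:2.5,95%:2.68 Name2
chr1:11184555 CNV 5%:1.1,95%:1.8 BRCA2
EOF
printf 'BRCA1\nBRCA2\n' > "$dir/list"

out=$(awk -F'\t' '
    NR == FNR { genes[$1]; next }        # first input (list): remember each gene name
    $2 == "CNV" {
        split($3, a, /[:,]/)             # a[1]="5%", a[2]=red value, a[3]="95%", a[4]=green value
        lo = a[2] + 0; hi = a[4] + 0     # adding 0 forces a numeric comparison
        # condition 1, or condition 2 with $4 present in list
        if (lo >= 4 || (lo <= 1.0 && hi <= 1.9 && ($4 in genes))) print
    }
' "$dir/list" "$dir/file")

printf '%s\n' "$out"
rm -rf "$dir"
```

Note that list is given first, so the NR==FNR block loads the gene names before any line of file is tested. On the sample data this prints the Name line (5.5 >= 4) and the BRCA1 line (0.9 <= 1.0, 1.9 <= 1.9, gene in list); the BRCA2 line is skipped because 1.1 is greater than 1.0.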