I am trying to use awk to open an input file and check column 2 (field $2). If there is a warning in that field, the warning is displayed, for example: (variantchecker): G not found at position 459, found A instead.
The attached Sample1.txt is such a file. If that column/field is blank, then the text after the colon in $1 is displayed instead. Sample2.txt is such a file.
The code below is close, but I cannot seem to get it right. Thank you :).
awk 'NR>1 {$1=""; e=$2; print "Found error: ", $0} END{if (!e) print "No error";}' C:/Users/cmccabe/Desktop/annovar/${id}_name.txt
It is not clear to me how to distinguish the error case from the case with no errors. In Sample1 the second column contains (variantchecker): while in Sample2 it is NM_004004.5 GJB2_v001.
Can we assume for example that parentheses at the start of the second column indicate an error?
Yes, parentheses at the start of the second column do indicate an error. Thank you :).
Something like the following should work then:
awk 'NR>1 { if ($2 ~ /^\(/ ) {$1=""; print "Found error: ", $0} else { sub(/.*:/, "", $1); print "No error: " $1 }}' C:/Users/cmccabe/Desktop/annovar/${id}_name.txt
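For readability, here is the same logic spread over several lines, as a minimal sketch rather than a drop-in replacement. The function name check_variants and the two sample rows are hypothetical stand-ins, since the real Sample1.txt and Sample2.txt are not shown in the thread. It also adds one small clean-up that the one-liner does not have: after $1 is blanked, the leftover separator at the start of the line is removed so the message has no extra spaces.

```shell
# Readable form of the one-liner above; reads the files named in "$@" or stdin.
check_variants() {
  awk 'NR > 1 {                 # skip the header line
    if ($2 ~ /^\(/) {           # field 2 starts with "(": a warning is present
      $1 = ""                   # blank the first field; awk rebuilds $0
      sub(/^ /, "")             # drop the separator left in front of field 2
      print "Found error: " $0
    } else {
      sub(/.*:/, "", $1)        # keep only the text after the last colon
      print "No error: " $1
    }
  }' "$@"
}

# Hypothetical inputs: a header line plus one data row each.
with_warning=$(printf '%s\n' 'Input Errors' \
  'NM_004004.5:c.459G>A (variantchecker): G not found at position 459, found A instead.' |
  check_variants)
without_warning=$(printf '%s\n' 'Input Errors' \
  'NM_004004.5:c.79G>A col2' |
  check_variants)

echo "$with_warning"
echo "$without_warning"
```

With a real file the call would be check_variants followed by the file path. The test on field 2 relies on the default whitespace splitting, so a blank error column simply disappears and field 2 becomes whatever comes next on the line.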
Works great. If the text p.(Val27Ile) from the no-warning case (Sample2.txt) were needed as well, would this work:
awk 'NR>1 { if ($2 ~ /^\(/ ) {$1=""; print "Found error: ", $0} else { sub(/.*:/, "", $1 $8); print "No error: " $1 "," $8}}' C:/Users/cmccabe/Desktop/annovar/${id}_name.txt
Desired output displayed:
c.79G>A,p.(Val27Ile)
Separating the substitutions works:
awk 'NR>1 { if ($2 ~ /^\(/ ) {$1=""; print "Found error: ", $0} else { sub(/.*:/, "", $1); sub(/.*:/, "", $7); print "No error: " $1 "," $7}}' C:/Users/cmccabe/Desktop/annovar/${id}_name.txt
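The reason the combined version does not work is that sub() changes its third argument in place, so that argument has to be something assignable, such as a single field. `$1 $8` is a concatenation, a temporary string that awk cannot write back to. A minimal sketch of the no-warning branch on its own follows. The function name extract_names is hypothetical, and so is the sample row: col2 to col6 are placeholders for fields the thread does not show, and the protein description is assumed to sit in field 7 as in the command above.

```shell
# Strip everything up to the last colon from two fields, one sub() call per field.
extract_names() {
  awk 'NR > 1 && $2 !~ /^\(/ {  # data rows without a warning in field 2
    sub(/.*:/, "", $1)          # e.g. NM_004004.5:c.79G>A -> c.79G>A
    sub(/.*:/, "", $7)          # same clean-up for the protein field
    print $1 "," $7
  }' "$@"
}

# Hypothetical row: col2..col6 are placeholders, not real file content.
names=$(printf '%s\n' 'header' \
  'NM_004004.5:c.79G>A col2 col3 col4 col5 col6 NM_004004.5(GJB2_i001):p.(Val27Ile)' |
  extract_names)

echo "$names"
```

For this row the sketch prints c.79G>A,p.(Val27Ile), which matches the desired output. If the field number differs in the real file, only the $7 references need to change.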
Thank you, works great :).