Hi there,
I am trying to do a find and replace with a wildcard, turning this
data
chr1 69511 69511 A G 1/1:0,34:791,78,0:78:34 0/1:55,60:1130,0,1513:99:116 1/1:0,28:630,63,0:63:28 0/1:0,34:626,57,0:57:34
into this
chr1 69511 69511 A G homo hetero homo hetero
That is, every field starting with 0/1 (followed by anything, hence the wildcard) should become hetero,
and every field starting with 1/1 should become homo.
I have been experimenting with
sed 's/0\/1.*/hetero/g' file
but did not achieve the desired result.
sed does not understand fields/columns without a lot of effort; awk can loop through them.
The loop starts at field 6 for efficiency. If 0/1 or 1/1 can appear in any field, change the starting value to 1.
$ awk -F"\t" -v OFS="\t" '{ for(N=6; N<=NF; N++) { if($N ~ /^1\/1:/) $N="homo" ; if($N ~ /^0\/1:/) $N="hetero" } } 1' het.txt
chr1 69511 69511 A G homo hetero homo hetero
$
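For anyone who wants to reproduce this without a het.txt at hand, here is a minimal sketch. It assumes the real file is tab-separated and rebuilds the sample line from the question with printf; the variable names line, greedy and fixed are just for illustration. It also shows why the original sed attempt falls short: .* is greedy and runs to the end of the line, so everything from the first 0/1 onward collapses into a single hetero.

```shell
# Sample line from the question; tab-separated is an assumption about the real file.
line=$(printf 'chr1\t69511\t69511\tA\tG\t1/1:0,34:791,78,0:78:34\t0/1:55,60:1130,0,1513:99:116\t1/1:0,28:630,63,0:63:28\t0/1:0,34:626,57,0:57:34')

# The original attempt: .* matches through the end of the line,
# so the first 0/1 and everything after it is replaced at once.
greedy=$(printf '%s\n' "$line" | sed 's/0\/1.*/hetero/g')

# Field-wise replacement, same logic as the awk command above.
fixed=$(printf '%s\n' "$line" | awk -F'\t' -v OFS='\t' '
{
    for (N = 6; N <= NF; N++) {
        if ($N ~ /^1\/1:/) $N = "homo"
        else if ($N ~ /^0\/1:/) $N = "hetero"
    }
}
1')

printf '%s\n' "$greedy" "$fixed"
```

The first printed line still has the untouched 1/1 field followed by one hetero; the second is the desired result.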
One could also try:
sed 's,0/1[^[:space:]]*,hetero,g
s,1/1[^[:space:]]*,homo,g' file
If all of your fields are <tab> separated and some fields might contain <space>s, replace each occurrence of the string [:space:] in the above with a literal <tab> character.
If you are using a Solaris/SunOS system and the above command doesn't work, change sed to /usr/xpg4/bin/sed.
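A minimal sketch of the literal <tab> variant, assuming tab-separated input and mapping 0/1 to hetero and 1/1 to homo as the question asks. The variable names line, tab and result are only for illustration. Holding the tab in a shell variable avoids having to type one inside the sed script.

```shell
# Sample line from the question; tab-separated is an assumption about the real file.
line=$(printf 'chr1\t69511\t69511\tA\tG\t1/1:0,34:791,78,0:78:34\t0/1:55,60:1130,0,1513:99:116\t1/1:0,28:630,63,0:63:28\t0/1:0,34:626,57,0:57:34')

# A literal tab character, to be dropped into the bracket expression.
tab=$(printf '\t')

# [^<tab>]* stops at the next tab, so each match stays inside one field.
result=$(printf '%s\n' "$line" | sed -e "s,0/1[^$tab]*,hetero,g" -e "s,1/1[^$tab]*,homo,g")

printf '%s\n' "$result"
```

Note that these patterns are not anchored to the start of a field, so this relies on 0/1 and 1/1 not appearing in the middle of any other column.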