I am trying to use awk to create (in this example) 3 separate text files from the unique id in $1 of file, if it starts with the pattern aa. The contents of each row are used to populate each text file, except for $1, which is not needed. Rows with no id in $1 go into the file of the nearest id above them. It seems I am close but not quite there. Thank you :).
file (tab-delimited)
aa1110-0 12 47259533 47259533 G A Comment:heterozygous_snv
aa1110-1 11 23892795 23892799 G C Comment:heterozygous_snv
2 7581601 7581601 T A Comment:heterozygous_snv
aa1110-2 1 237837422 237837422 C TTC Comment:substitution
3 7583892 7583892 G A Comment:heterozygous_snv
19 23892788 23892799 G - Comment:deletion
awk
awk -F'\t' '/^aa/{                  # if the line starts with aa
    if(!w)                          # if w is unset or 0 (first aa line of a run)
        f=sprintf($1"%d.txt",++n);  # pre-increment n and build the file name f
    w=1;                            # mark that a file is in use
    print >f;                       # write the whole record to the file
    next                            # go to the next line
}
{                                   # for lines that do not start with aa
    close(f);                       # close the file
    w=0                             # reset w so the next aa line uses a new file
}
' file
The current output is two files with the rows in them, but $1 is included as well.
Here is one:
aa1110-0 12 47259533 47259533 G A Comment:heterozygous_snv
aa1110-1 11 23892795 23892799 G C Comment:heterozygous_snv
awk
awk '{for(i=2;i<=NF;i++){printf "%s ", $i >> $1".txt"};printf "\n" >> $1".txt"; close($1".txt")}' file
The current output is three files with no $1 in them, but each has only one line.
Here is the same file as above:
12 47259533 47259533 G A Comment:heterozygous_snv
desired output (tab-delimited)
aa1110-0.txt
12 47259533 47259533 G A Comment:heterozygous_snv
aa1110-1.txt
11 23892795 23892799 G C Comment:heterozygous_snv
2 7581601 7581601 T A Comment:heterozygous_snv
aa1110-2.txt
1 237837422 237837422 C TTC Comment:substitution
3 7583892 7583892 G A Comment:heterozygous_snv
19 23892788 23892799 G - Comment:deletion
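
One way to get the desired output, as a rough sketch rather than a definitive answer. It assumes that rows without an id start with an empty first field (a leading tab), that each id appears only once, and that id-less rows belong to the nearest id above them. The variable name out and the scratch directory are only for illustration, and row 5 of the recreated sample uses the spelling shown in the desired output.

```shell
# Work in a scratch directory (illustrative only) so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the sample input; rows without an id are ASSUMED to start with a tab.
{
  printf 'aa1110-0\t12\t47259533\t47259533\tG\tA\tComment:heterozygous_snv\n'
  printf 'aa1110-1\t11\t23892795\t23892799\tG\tC\tComment:heterozygous_snv\n'
  printf '\t2\t7581601\t7581601\tT\tA\tComment:heterozygous_snv\n'
  printf 'aa1110-2\t1\t237837422\t237837422\tC\tTTC\tComment:substitution\n'
  printf '\t3\t7583892\t7583892\tG\tA\tComment:heterozygous_snv\n'
  printf '\t19\t23892788\t23892799\tG\t-\tComment:deletion\n'
} > file

awk -F'\t' '
/^aa/ {                          # row carries an id: switch to a new output file
    if (out != "") close(out)    # release the previous file first
    out = $1 ".txt"              # file name is the id itself, e.g. aa1110-0.txt
}
out != "" {                      # every row seen after the first id
    # write everything after the first tab, i.e. the row without $1
    print substr($0, index($0, "\t") + 1) > out
}
' file
```

The id row and the id-less rows go through the same print rule, so the only state that needs to be carried from line to line is the current file name. Because print > out truncates a file the first time it is opened, an id that reappeared later in the input would overwrite its earlier file; >> would be needed in that case.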