Help with perl script

I have a little problem with the following perl script that is supposed to remove gaps (dashes) from a sequence alignment:

perl -nla -F"" -e 'if (!/^>/){$n++;for ($i=0;$i<=$#F;$i++){$a{$i}{$F[$i]}++}}END{for ($i=0;$i<=$#F;$i++){if ($a{$i}{"-"}/$n>0.5){print $i}}print "-1"}' infile | awk -vFS="" -vOFS="" 'NR==FNR{a[$0+1]++}{for (i=1;i<=NF;i++) if (i in a) $i=""}FNR!=NR' - infile > outfile

The thing is that when the gap is at the beginning of the sequence, the script removes it along with the character at the same position in the sequence ID line, example:

Unfortunately, that's enough to mess up the FASTA format, which completely stops me from doing any further analysis.
Any help will be greatly appreciated!

Can you also post the link to the old thread where I provided this code? I hate reverse-engineering code (even my own :)).

There you go!
http://www.unix.com/shell-programming-scripting/139784-removing-columns-dashes.html
Thanks!

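Try this. The only change is the !/^>/ guard in front of the awk loop, so columns are no longer deleted from the header lines: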

perl -nla -F"" -e 'if (!/^>/){$n++;for ($i=0;$i<=$#F;$i++){$a{$i}{$F[$i]}++}}END{for ($i=0;$i<=$#F;$i++){if ($a{$i}{"-"}/$n>0.5){print $i}}print "-1"}' infile | awk -vFS="" -vOFS="" 'NR==FNR{a[$0+1]++}!/^>/{for (i=1;i<=NF;i++) if (i in a) $i=""}FNR!=NR' - infile

Thank you very, very much!