Command-line Perl for parsing a FASTA file

I would like to take a FASTA file formatted like

>0001
agttcgaggtcagaatt
>0002
agttcgag
>0003
ggtaacctga

and use command-line Perl to write every sample whose sequence is longer than 8 bases to a new file. The result would be

>0001
agttcgaggtcagaatt
>0003
ggtaacctga

This is as far as I got:

cat ${sample}.fasta | perl -lane 'while(<>){if /^>/}'

How can I achieve this?

Does it have to be perl?

awk 'length($2)>8{print RS $0}' RS=\> ORS= "${sample}.fasta"
perl -ne 'length > 9 && $last && print "$last$_"; $last = $_'  ${sample}.fasta > result.fasta

Explanation:

length > 9 : true only if the line is longer than 9 characters, i.e. more than 8 bases plus the trailing newline.
&& $last && print "$last$_"; : ...and if a previous line has been seen, print that previous line (the header, e.g. >0001\n) followed by the current line.
$last = $_ : remember the line just read for the next iteration.
