Help with concatenating multiple lines into one line

Hi,

Does anybody have experience concatenating multiple lines into one line using an awk or perl command?

Input file:

>set1
QAWEQRQ@EWQEASED
ASDAEQW
QAWEQRQTQ
ASRFQWRGWQ

The input file above has 5 lines.

Desired output file:

>set1
QAWEQRQ@EWQEASEDASDAEQWQAWEQRQTQASRFQWRGWQ

I want to concatenate all the lines except the one starting with ">" into a single line.
That is, the desired output file should contain only 2 lines: the first is the ">" line, and the second is all the other lines concatenated into one long line.

Thanks for any advice.

An awk:

awk '{ORS=/^>/?"\n":"";print}' infile

If you have a file with multiple sets (and adding the final newline, as @RudiC proposes):

awk '{ORS=sub(/^>/,"\n>")?"\n":"";print}END{print "\n"}' infile

To make that a correct *nix text file by adding a newline character at the end, try:

awk '{ORS=/^>/?"\n":"";print} END{printf "\n"}' file
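For reference, here is a minimal sketch of the same idea spelled out with printf instead of switching ORS, assuming a multi-set file in which every header line starts with ">". The file name sample.fa and the second set are made up for illustration. Unlike the sub() variant, it should not emit a blank line before the first header.

```shell
# Hypothetical two-set sample file (not from the original post)
printf '%s\n' '>set1' 'QAWEQRQ@EWQEASED' 'ASDAEQW' \
              '>set2' 'QAWEQRQTQ' 'ASRFQWRGWQ' > sample.fa

out=$(awk '
    /^>/ { if (NR > 1) printf "\n"; print; next }  # header: end previous sequence, print header
         { printf "%s", $0 }                        # sequence line: append without a newline
    END  { if (NR > 0) printf "\n" }                # terminate the last sequence line
' sample.fa)

printf '%s\n' "$out"
```

With the sample above this should print each header followed by one joined sequence line per set.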

Try deleting the newlines with tr:

tr -d "\n" < infile > outfile

It works for me, but of course there is no newline at the end, so:

RBATTE1> cat outfile
EWQEASEDASDAEQWQAWEQRQTQASRFQWRGWQRBATTE1>

Robin
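Note that tr -d "\n" also removes the newline after the ">" header, so the header ends up on the same line as the data. A rough sketch of one way around that, assuming a single-set file whose first line is the header (infile and outfile are placeholder names):

```shell
# Sample input from the question
printf '%s\n' '>set1' 'QAWEQRQ@EWQEASED' 'ASDAEQW' 'QAWEQRQTQ' 'ASRFQWRGWQ' > infile

{
    head -n 1 infile                 # keep the ">" header on its own line
    tail -n +2 infile | tr -d '\n'   # join the remaining lines
    echo                             # add the missing final newline
} > outfile

cat outfile
```

This only handles one set per file; for several sets the awk solutions are a better fit.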

Another approach:

awk '
        />/ {
                $0 = ( NR == 1 ? $0 : RS $0 RS )
                print
        }
        !/>/ {
                ORS = ""
                print
        }
        END {
                print "\n"
        }
' file

Nice and clean, @Yoda. Just one point: I would use next to avoid the second pattern match, like this:

awk '
        />/ {
                $0 = ( NR == 1 ? $0 : RS $0 RS )
                print
                next
        }
        {
                ORS = ""
                print
        }
        END {
                print "\n"
        }
' infile

Another one:

awk '{$1=RS $1 ORS}NR>1' OFS= RS=\> file

(As long as there are no extra ">" characters in the ">" headers.)

--
otherwise

awk '/^>/{if(NR>1)print RS; $1=$1 RS}1; END{print RS}' ORS= file