My input file is tab-delimited, and every record is broken across two lines by an unexpected newline. I'd like to remove this \n and put the information back on the same line.
I have a similar question.
My input file looks like this:
>some lines of text
actgtg
aaactgtg
acgtcg
>some lines of text
acgtgc
agtcgt
ttgcgt
etc..etc
I want the output to be:
>some lines of text
actgtgaaactgtgacgtcg
>some lines of text
acgtgcagtcgtttgcgt
Basically, I want to remove the newline characters between consecutive lines that do not start with '>'. I tried sed '!/>/s/\n//', but to no avail. Any help would be highly appreciated!
Regarding post #3, what follows assumes that there are no blank lines in the original data.
A different tack, which uses AWK to massage the format so that a second AWK can leverage its multiline record handling capability (which simplifies the logic):
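A minimal sketch of that two-pass idea, not necessarily the exact code that was posted. It assumes the data has no blank lines and that every header starts with ">" in the first column. The function name join_records_two_pass is made up for illustration.

```shell
# Hypothetical helper: join the sequence lines under each ">" header.
# Assumes the input file contains no blank lines.
join_records_two_pass() {
    # Pass 1: put a blank line before every header except the first,
    # so each header plus its sequence lines becomes one paragraph.
    awk 'NR > 1 && /^>/ { print "" } { print }' "$1" |
    # Pass 2: RS = "" turns on paragraph mode (blank-line-separated records);
    # FS = "\n" makes every line of the paragraph one field.
    awk 'BEGIN { RS = ""; FS = "\n" }
    {
        print $1                      # the header line, unchanged
        seq = ""
        for (i = 2; i <= NF; i++)     # concatenate the remaining lines
            seq = seq $i
        if (NF > 1)
            print seq
    }'
}
```

Call it as join_records_two_pass file. The second awk never has to remember state between lines, because each header and its sequence arrive together as a single record.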
Thanks, guys, for the code, and yes, for correcting my interpretation of sed! The third answer seems to work for my requirements. I haven't really used awk in my work before, so explanations of the code above would really help me learn something!
awk '
/^>/ {                          # If the current record starts with ">" ( /^>/ )
    $0 = (NR > 1 ? RS $0 : $0)  # After the first record (NR > 1), prepend a newline (RS $0) to end the joined sequence line
    ORS = RS                    # Print a newline after the header (ORS = RS; RS is newline by default)
}
!/^>/ {                         # If the current record does not start with ">" ( !/^>/ )
    ORS = ""                    # Print nothing after the record (ORS = ""), so sequence lines are joined
}
END {                           # END block: runs after the last record
    printf "\n"                 # Terminate the final joined line with a newline
}
1                               # 1 is always true, so the default action prints the current record
' file
Note: ORS, RS, and NR are built-in variables in awk. Please check the awk manual page for further reference. I hope this helps.
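For comparison, a rough single-pass sketch that leaves ORS alone and uses printf to decide when a newline is written. It assumes header lines start with ">" in the first column. The function name join_seq_lines and the pending flag are made up for illustration.

```shell
# Hypothetical helper: same result as the ORS version, using printf instead.
join_seq_lines() {
    awk '
    /^>/ {
        if (pending) printf "\n"   # close the sequence line collected so far
        print                      # header goes out on its own line
        pending = 0
        next
    }
    {
        printf "%s", $0            # sequence text, no trailing newline yet
        pending = 1
    }
    END {
        if (pending) printf "\n"   # finish the last joined line
    }' "$1"
}
```

Call it as join_seq_lines file. Because the newline is only written when sequence text is pending, two headers in a row do not produce an empty line between them.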