Hi,
I am having trouble modifying the header lines in this file, because there is a space that I don't know how to deal with.
The file looks like this:
>YM4911-Contig4 [name=joe]
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>YM4915-Contig5 [name=bob]
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
There are two spaces between the contig number and [name=x] in some files and a single space in other files. How would I write a command that deals with both?
I want to modify the file so it looks like this.
>Contig4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>Contig5
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
This is the format that I am trying to get.
Thanks
This will do, I guess:
sed 's/\(.*\)\(Contig[0-9]\)\(.*\)/\2/g' filename
The output is like this:
Contig4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Contig5
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
How do I maintain the > so it will look like this:
>Contig4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>Contig5
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
Make a small change:
sed 's/\(>\)\(.*\)\(Contig[0-9]\)\(.*\)/\1\3/g' filename
I have another question.
The contig numbers go past 9; they go up to something like 5000.
How does the sed line deal with that?
Thanks
hmmmm..
sed 's/\(>\)\(.*\)\(Contig[0-9]*[0-9]\)\(.*\)/\1\3/g' filename
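A slightly shorter variant of the same idea, as a rough sketch: anchor the pattern to lines that start with > and use [0-9][0-9]* for "one or more digits". The sample file and its contents below are made up just to try it on both spacings, and it assumes "Contig" appears only once per header line.

```shell
# Build a throwaway sample file (hypothetical data) with one-space and two-space headers.
tmpfile=$(mktemp)
printf '%s\n' \
  '>YM4911-Contig4 [name=joe]' \
  'XXXXXXXX' \
  '>YM4915-Contig5000  [name=bob]' \
  'MMMMMMMM' > "$tmpfile"

# On header lines only: keep ">" plus "Contig" followed by one or more digits,
# and drop everything before and after it (so the number of spaces does not matter).
result=$(sed 's/^>.*\(Contig[0-9][0-9]*\).*/>\1/' "$tmpfile")
printf '%s\n' "$result"

rm -f "$tmpfile"
```

Since everything after the digits is thrown away by the trailing .*, the single versus double space never has to be matched explicitly.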
With awk:
awk -F" |-" '/Contig/{print ">" $2;next}1' file
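The awk line copes with both spacings because the field separator is "a space or a hyphen", so the contig name is always field 2 no matter how many spaces follow it. Here is a small sketch to check that, using a made-up sample file; it matches on /^>/ instead of /Contig/ (my change, not the original poster's) and assumes there is exactly one hyphen before "Contig".

```shell
# Throwaway sample file (hypothetical data) with both spacing variants.
sample=$(mktemp)
printf '%s\n' \
  '>YM4911-Contig4 [name=joe]' \
  'XXXX' \
  '>YM4915-Contig5000  [name=bob]' \
  'MMMM' > "$sample"

# Split fields on a space or a hyphen; on header lines print ">" plus field 2,
# and let the trailing "1" print every other line unchanged.
awk_result=$(awk -F" |-" '/^>/ { print ">" $2; next } 1' "$sample")
printf '%s\n' "$awk_result"

rm -f "$sample"
```

If the prefix before "Contig" can itself contain hyphens or spaces, field 2 would no longer be the contig name, and the sed approach is the safer choice.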