Parsing a file

Hi, I am having trouble parsing a file. Perhaps it is the spaces in between. I cannot get it done with awk.

Here is what the file looks like:

>Corey barg LTG 1 (jorge k)
MMMMNNNNNNJJJJJHHHHHHHHHHHHKKKKIIIIIKKJJJJJJJJJJJJJ
>Corey barg LTG 2 (88)
MNNNNNNNIIIIIHHKKKKKHKJKJJJJJJJKKKKKKKKKKKKKKKKKK

Basically I want the final output to look like this:

>LTG1
MMMMNNNNNNJJJJJHHHHHHHHHHHHKKKKIIIIIKKJJJJJJJJJJJJJ
>LTG2
MNNNNNNNIIIIIHHKKKKKHKJKJJJJJJJKKKKKKKKKKKKKKKKKK

Note that the header line is space-separated, and I want to lose the space between LTG and the number. That's what I'm having trouble with.

thanks

Try sed. Capture only the digits and write LTG back in the replacement, so the space is dropped:

sed 's/^>.*LTG \([0-9][0-9]*\).*/>LTG\1/' filename
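
A sed expression for this can be checked against the sample data without creating a file. The following is only a sketch: the function name shorten_headers is made up for illustration, and it assumes every header line contains "LTG " followed by one or more digits.

```shell
# shorten_headers is a hypothetical helper name, not something from the thread.
# It rewrites header lines such as ">Corey barg LTG 1 (jorge k)" to ">LTG1"
# and passes every other line through unchanged.
shorten_headers() {
    sed 's/^>.*LTG \([0-9][0-9]*\).*/>LTG\1/'
}

# Sample data copied from the question.
sample='>Corey barg LTG 1 (jorge k)
MMMMNNNNNNJJJJJHHHHHHHHHHHHKKKKIIIIIKKJJJJJJJJJJJJJ
>Corey barg LTG 2 (88)
MNNNNNNNIIIIIHHKKKKKHKJKJJJJJJJKKKKKKKKKKKKKKKKKK'

result=$(printf '%s\n' "$sample" | shorten_headers)
printf '%s\n' "$result"
```

Because the digits are matched with [0-9][0-9]* rather than a single character, a header such as "LTG 12" should also come out as ">LTG12".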

With awk: in a print statement, a comma between arguments inserts the output field separator (OFS, a space by default). Arguments written without a comma are concatenated with no delimiter between them.

awk ' /^>Corey/ { print ">" $3 $4 ; next }
               { print $0 }
    '  inputfile
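
The difference between print with and without a comma can be seen on a single header line. This is a small sketch; the variable names are only for illustration, and the input line is taken from the sample in the question.

```shell
# One header line from the question, fields separated by single spaces.
line='>Corey barg LTG 1 (jorge k)'

# With a comma, print puts OFS (a space by default) between the arguments.
with_comma=$(printf '%s\n' "$line" | awk '{ print $3, $4 }')

# Without a comma, the two fields are concatenated directly.
without_comma=$(printf '%s\n' "$line" | awk '{ print $3 $4 }')

printf '%s\n' "$with_comma" "$without_comma"
```

The first line printed should keep the space (LTG 1) and the second should not (LTG1), which is why the awk answer uses the form without a comma.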

Or one way using Perl:

$ 
$ cat data.txt
>Corey barg LTG 1 (jorge k)
MMMMNNNNNNJJJJJHHHHHHHHHHHHKKKKIIIIIKKJJJJJJJJJJJJJ
>Corey barg LTG 2 (88)
MNNNNNNNIIIIIHHKKKKKHKJKJJJJJJJKKKKKKKKKKKKKKKKKK
$ 
$ perl -lne '(/>.*LTG (.).*/ && print ">LTG$1") || print' data.txt
>LTG1
MMMMNNNNNNJJJJJHHHHHHHHHHHHKKKKIIIIIKKJJJJJJJJJJJJJ
>LTG2
MNNNNNNNIIIIIHHKKKKKHKJKJJJJJJJKKKKKKKKKKKKKKKKKK
$ 

tyler_durden