sed situation

Hi,

I'm looking for someone who can think in sed. Basically, I need the trailing characters on every line in a file to be deleted. These characters are all capitals and always follow a number, but how many of them there are varies from line to line.

For instance, on the line:

2006_10_9_p20_TALK

I'd want to remove the _TALK. And on the line

2006_10_9_p119_CRITICS

I'd want to remove the _CRITICS. I'm reasonably good with sed basics, but no guru. Any suggestions?

Best wishes
Laurel Maury

Not knowing the exact format of your file, this might work, or at least point you in the right direction:

echo "2006_10_9_p119_CRITICS" | sed 's/_[A-Z][A-Z]*$//g'

Result: 2006_10_9_p119

In this case the g flag is not required, since a pattern anchored with $ can match at most once per line:

echo "2006_10_9_p119_CRITICS" | sed 's/_[A-Z][A-Z]*$//'

would be sufficient. This should also work, though it removes any trailing run of non-digit characters, not just capitals:

echo "2006_10_9_p119_CRITICS" | sed 's/[^0-9]*$//'
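To run the stricter pattern over a whole file instead of a single echoed line, something like the sketch below might do. The function name strip_caps_suffix is made up for illustration, and setting LC_ALL=C is an assumption on my part so that [A-Z] means ASCII capitals only, whatever the locale.

```shell
# Hypothetical helper: strip a trailing _CAPITALS suffix from each line on stdin.
strip_caps_suffix() {
    # LC_ALL=C (an assumption) keeps [A-Z] limited to ASCII capitals.
    LC_ALL=C sed 's/_[A-Z][A-Z]*$//'
}

# Feed it the two sample lines from the question.
printf '%s\n' 2006_10_9_p20_TALK 2006_10_9_p119_CRITICS | strip_caps_suffix
```

For a real file you would redirect instead, e.g. strip_caps_suffix < pages.txt > pages.clean.txt, where both file names are placeholders.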

This might also help you. It removes everything from the last underscore onward:

$ echo "2006_10_9_p119_CRITICS" | sed 's/^\(.*\)_.*$/\1/'
2006_10_9_p119
$ echo "2006_10_9_p20_TALK" | sed 's/^\(.*\)_.*$/\1/'
2006_10_9_p20

Contents of the file txt:
2006007 20001 AR 2502_TXT
2006007 20001 AU 2502_TXT
2006007 20001 CL 2502_TXT
2006007 20001 CO 2502_TXT1
2006007 20001 ES 2502_TXT

sed 's/^\(.*\)_\([A-Z]*\)$/\1/' txt
Output:
2006007 20001 AR 2502
2006007 20001 AU 2502
2006007 20001 CL 2502
2006007 20001 CO 2502_TXT1
2006007 20001 ES 2502

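The command above strips the suffix only from lines that end in an underscore followed by capital letters. A line that ends in _TXT1, like the CO line, does not match and is left unchanged.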
--Manish
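The suggestions in this thread agree on the two sample lines from the question but not on every input. A rough way to see the difference is to run each expression on the same line; compare_patterns and its labels are names invented for this sketch, and the test line is the CO line from the txt file above.

```shell
# Hypothetical helper: run each suggested expression on one input line.
compare_patterns() {
    line=$1
    # Underscore followed by one or more capitals at end of line.
    printf 'caps only : %s\n' "$(printf '%s\n' "$line" | sed 's/_[A-Z][A-Z]*$//')"
    # Any trailing run of non-digit characters.
    printf 'non-digits: %s\n' "$(printf '%s\n' "$line" | sed 's/[^0-9]*$//')"
    # Everything from the last underscore onward.
    printf 'last _    : %s\n' "$(printf '%s\n' "$line" | sed 's/^\(.*\)_.*$/\1/')"
}

compare_patterns '2006007 20001 CO 2502_TXT1'
```

On that line only the last expression removes anything, because the suffix ends in a digit. Which behaviour is right depends on whether lines like that are expected in the real file.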