String manipulation

I have a string from which I want to extract all characters following 3 consecutive digits.
For example, if my string is J1705PEAN038TDMN, I need to get TDMN.

My string can contain multiple runs of 3 consecutive digits; I need what follows the last occurrence.

In what language? Applescript? Shell? awk? perl? Other?

Shell, please.

$ echo "J1705PEAN038TDMN" | grep -o "[0-9][0-9][0-9][^0-9]*$"

038TDMN

$ VAL=$(echo "J1705PEAN038TDMN" | grep -o "[0-9][0-9][0-9][^0-9]*$")
$ VAL=${VAL:3}
$ echo "$VAL"

TDMN

$

There are probably more elegant ways, depending on where your string actually comes from. You could probably extract it (or several) from a file and whittle it down to TDMN, etc. in one step.
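As a rough sketch of the "one step" idea, sed can match the last run of three digits and print only what follows it. The function name extract_suffix is made up for illustration, and the sketch assumes the wanted tail contains no digits:

```shell
# Print the non-digit tail that follows the last run of three digits.
# -n together with the trailing p prints only when the pattern matched,
# so strings without three consecutive digits produce no output.
# extract_suffix is a hypothetical helper name.
extract_suffix() {
    printf '%s\n' "$1" | sed -n 's/.*[0-9][0-9][0-9]\([^0-9]*\)$/\1/p'
}

extract_suffix "J1705PEAN038TDMN"
```

This still starts one external process per string, so it is a tidier pipeline rather than a faster one.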


I am parsing a tab-delimited text file, one line at a time. Your solution works fine, thank you for your help!

Doing it this way will be very inefficient. Care to show your code so I can build it in rather than calling a pointless external program? It's quite possible the entire file can be handled in one awk call, instead of having to call grep over and over.
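Without the actual script, here is only a guess at what a single awk call might look like. The sample data, the function name extract_field_suffixes, and the choice of the 3rd field are all assumptions for illustration; the real file layout was never posted:

```shell
# Hypothetical sample: tab-delimited lines with the code in field 3.
sample=$(mktemp)
printf 'run\t1\tJ1705PEAN038TDMN\ncopy\t2\tA123B456CD\n' > "$sample"

# $1 = file, $2 = field number (both assumptions, adjust to the real data).
# match() finds three digits followed only by non-digits up to the end of
# the field; RSTART + 3 skips over the three digits themselves.
extract_field_suffixes() {
    awk -F'\t' -v col="$2" '
        match($col, /[0-9][0-9][0-9][^0-9]*$/) {
            print substr($col, RSTART + 3)
        }' "$1"
}

extract_field_suffixes "$sample" 3
rm -f "$sample"
```

The whole file is read by one awk process, so no grep is started per line. Lines whose field has no such match are silently skipped.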

Is it always the LAST non-numeric sequence of a string?

Each line contains several tab-separated fields that determine what action the script will perform. The provided string is just one of these.
I am not overly concerned about performance, as the whole processing is not very heavy.

Thanks again!


Yes, it's always the last non-numeric sequence (following exactly 3 digits).

echo 'J1705PEAN038TDMN' | sed 's/.*[0-9]//'


Having a sample file would help. Identifying the field to be processed would help as well...


Try:

$ s=J1705PEAN038TDMN
$ echo "${s##*[!0-9][0-9][0-9][0-9]}"
TDMN

--
If it does not need to be exactly 3 digits, but 3 or more digits, then you could try:

echo "${s##*[0-9][0-9][0-9]}"
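Since the strings come from a tab-delimited file read one line at a time, the same expansion can sit inside the read loop with no external commands. This is only a sketch: the field names action, count and code, the function name print_suffixes, and the three-field layout are assumptions, as the real file was not shown.

```shell
# A literal tab for IFS, obtained portably.
tab=$(printf '\t')

# Read tab-separated lines from standard input and print the first field
# plus whatever follows the last run of three digits in the third field.
# If the third field has no three consecutive digits, the expansion leaves
# it unchanged, so it is printed whole.
print_suffixes() {
    while IFS=$tab read -r action count code; do
        printf '%s\t%s\n' "$action" "${code##*[0-9][0-9][0-9]}"
    done
}

printf 'run\t1\tJ1705PEAN038TDMN\ncopy\t2\tA123B456CD\n' | print_suffixes
```

Note that a tab counts as IFS whitespace, so read treats consecutive tabs as one separator; a file with empty fields would need different handling.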