Regex: Extract substring between 2 separators

Hi

Input:

aa-bb-cc-dd.ee.ff.gg

Output:

dd

I want to get the word after the last '-', up to the first dot.

I have tried with regex lookbehind and lookahead like this:

(?<=-).*(?=\.)

but this returns too much:

bb-cc-dd.ee.ff

What tool are you using?
sh/awk/sed/perl/...?

I am looking for a general solution if possible. But grep/java will be fine.
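
If grep is an option, here is a minimal sketch that stays close to the lookaround attempt. The original pattern returns too much because `.*` is greedy: it starts right after the first '-' and runs to the last dot. Replacing it with a class that excludes both '-' and '.' keeps the match inside one field. This assumes GNU grep built with PCRE support (`-P`); the function name `dash_to_dot` is made up for illustration. The same pattern should also work with java.util.regex, which supports lookbehind and lookahead.

```shell
# Hypothetical helper; assumes GNU grep with PCRE support (-P).
dash_to_dot() {
    # (?<=-)   the match must be preceded by a '-'
    # [^-.]+   one or more characters that are neither '-' nor '.'
    # (?=\.)   the match must be followed by a '.'
    printf '%s\n' "$1" | grep -oP '(?<=-)[^-.]+(?=\.)'
}

dash_to_dot 'aa-bb-cc-dd.ee.ff.gg'   # prints: dd
```

Note that `-o` prints every match on its own line, so an input with more than one "dash, word, dot" run prints more than one line.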

echo 'bb-cc-dd.ee.ff' | sed 's/.*-\([^.][^.]*\).*/\1/'
print 'bb-cc-dd.ee.ff' | sed -n 's/.*-\([^.]*\)\..*/\1/p'

This matches all characters up to the last '-', then captures one character that is not a period followed by zero or more characters
that are not a period, then matches whatever follows, and finally prints only the captured part.
Bottom line: it prints all characters between the last '-' and the following '.'.

So the steps are:

  1. Match all characters up to the last -
.*-
  2. Capture one or more non-dot characters in a group (the backreference)
\([^.][^.]*\)
  3. Match from the first dot after the group to the end of the line
.*
  4. Replace everything matched in steps 1-3 with the backreference value from step 2
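
The steps above can be sketched as a small shell function, run against the input from the original question. The function name `between_last_dash_and_dot` is hypothetical; the sed expression is the one already posted.

```shell
# Hypothetical helper: prints the text between the last '-' and the next '.'
between_last_dash_and_dot() {
    # step 1: .*-             greedy, consumes everything up to the last '-'
    # step 2: \([^.][^.]*\)   captures one or more non-dot characters
    # step 3: .*              consumes the rest of the line
    # step 4: \1              replaces the whole line with the captured group
    printf '%s\n' "$1" | sed 's/.*-\([^.][^.]*\).*/\1/'
}

between_last_dash_and_dot 'aa-bb-cc-dd.ee.ff.gg'   # prints: dd
```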

Pretty much...
Depending on what your input looks like, this may or may not be what you want. For example, xx-zz-ww.bb-cc-dd.ee.ff yields dd rather than ww, so you might want to 'tighten up' the greediness of the regex.
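
One way to tighten it, as a sketch: if the goal is the last '-' that comes before the first dot, anchor the pattern at the start of the line and stop the leading part from crossing a dot. With the earlier sed commands, xx-zz-ww.bb-cc-dd.ee.ff gives dd; the variant below gives ww. The function name `before_first_dot` is hypothetical, and whether ww or dd is the right answer depends on your data.

```shell
# Hypothetical helper: text between the last '-' before the first '.' and that '.'
before_first_dot() {
    # ^[^.]*-     anchored at line start; cannot cross a dot, so it ends at
    #             the last '-' that precedes the first '.'
    # \([^.]*\)   captured run of non-dot characters
    # \..*        the first dot and everything after it
    # -n ... p    print only lines where the substitution succeeded
    printf '%s\n' "$1" | sed -n 's/^[^.]*-\([^.]*\)\..*/\1/p'
}

before_first_dot 'xx-zz-ww.bb-cc-dd.ee.ff'   # prints: ww
before_first_dot 'aa-bb-cc-dd.ee.ff.gg'      # prints: dd
```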


For shell:

word=aa-bb-cc-dd.ee.ff.gg
word=${word%%.*}   # strip everything from the first dot onward
word=${word##*-}   # strip everything up to the last remaining '-'
echo "$word"
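
The same two expansions can be wrapped in a function, as a rough sketch. The names `field_before_first_dot` and `s` are made up for illustration. Because the dot is trimmed first, this picks the last '-' before the first dot, so it returns ww for xx-zz-ww.bb-cc-dd.ee.ff, unlike the greedy sed version, which returns dd. It needs only POSIX parameter expansion and no external process.

```shell
# Hypothetical helper using only POSIX parameter expansion.
field_before_first_dot() {
    s=${1%%.*}    # remove the longest suffix matching '.*': cut at the first dot
    s=${s##*-}    # remove the longest prefix matching '*-': cut after the last dash
    printf '%s\n' "$s"
}

field_before_first_dot 'aa-bb-cc-dd.ee.ff.gg'      # prints: dd
field_before_first_dot 'xx-zz-ww.bb-cc-dd.ee.ff'   # prints: ww
```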