Help with Regexp replace in vim/sed

Hi!

I have a file with multiple lines following this format:

<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>

The goal is to replace the title (without modifying the href) so that each new line looks like this:

<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>

The number of underscores in the "aaa_bbb_ccc" part is unknown, and that's where I'm stuck.
So far I've managed to get this result:

<a href="xxx.aaa_bbb_ccc.yyy">Aaa_bbb_ccc</a>

using this regexp:

s/>[^.]*\.\([^.]*\)[^<]*/>\u\1/

Though one-liners are always welcome :), the solution may contain multiple commands. But no matter how I try, I can't wrap my head around how to replace the underscores with spaces without also replacing the underscores in the href part of the link. Is it even possible (with regexp)?

Can sed/vim execute a substitution in only part of the line (i.e. between '>' and '<'), or can I somehow run a 's/_/ /g' on the backreference '\1' before it is substituted?

There don't seem to be many places where you can find information about advanced regexp substitutions (most places focus on the matching, not the replacing).
So if anyone could shed some light on this, it would be greatly appreciated.

Thanks in advance,
Eric

If Perl is an option, then:

$ cat f2
<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>
<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>
<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>
<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>
<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>
$
$
$ perl -lne '($x,$y,$z)=/^(.*">)xxx\.(.*)\.yyy(.*)$/; $y=~s/_/ /g; $y=ucfirst $y; print "$x$y$z"' f2
<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>
<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>
<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>
<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>
<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>

tyler_durden


Thanks for the answer!

I know I might end up using Perl or some other scripting language; but for now, I'm determined to find a solution using regexp (without scripting).
Surely it has to be possible?

(I know, I'm stubborn, but I want to learn what is possible, and not always stay with what I know works.)

/Eric
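
For what it's worth, here is a minimal sketch of a sed-only approach, assuming GNU sed (the `\u` in the replacement is a GNU extension, the same one the attempt in the question relies on). The first command is the substitution from the question, unchanged. The loop after it replaces one underscore at a time, and only underscores that come after a '>' with no '<' in between, so the href is never touched. The function name `retitle` is a hypothetical wrapper used only for illustration.

```shell
# Hypothetical helper: reads link lines on stdin, writes retitled lines to stdout.
# Assumes GNU sed, because of \u in the first replacement.
retitle() {
  # 1) Cut the title down to the middle part and capitalize its first letter.
  # 2) :a marks the start of a loop.
  # 3) Replace one underscore that follows a '>' with no '<' in between.
  # 4) ta jumps back to :a while a substitution succeeded.
  sed -e 's/>[^.]*\.\([^.]*\)[^<]*/>\u\1/' \
      -e ':a' \
      -e 's/\(>[^<]*\)_/\1 /' \
      -e 'ta'
}

printf '%s\n' '<a href="xxx.aaa_bbb_ccc.yyy">xxx.aaa_bbb_ccc.yyy</a>' | retitle
```

On the sample line this should print `<a href="xxx.aaa_bbb_ccc.yyy">Aaa bbb ccc</a>`. The loop is what handles the unknown number of underscores: each pass turns one underscore into a space, and `t` stops branching once none are left in the title.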