Shell/Perl script to convert to Capitalize case

I need a shell script that converts the string inside a <title> tag to capitalized case, i.e. with the first letter of every word in upper case.

E.g. "<title>hi man: check this out</title>"

to "<title>Hi Man: Check This Out</title>"

You can do this in Bash, although I wouldn't :)

$ shopt -s compat31
$ string="<title>This is a test</title>"
$ [[ "$string" =~ "<title>(.*)</title>" ]] && echo ${BASH_REMATCH[1]} | tr "[:lower:]" "[:upper:]"
THIS IS A TEST

Ack! You want the first letter of every word capitalized? Okay, a bit trickier than I first thought. Here's the basic case:

perl -p -e 's/(<title[^>]*>)([^<]*)(<\/title>)/$1 . ucfirst($2) . $3/se'

Here's one way so every word is capitalized. There are always Other Ways too:

perl -MEnglish -p -e 'if(/(<title[^>]*>)([^<]*)(<\/title>)/) { ' \
          -e '  $lhs = $PREMATCH . $1; $rhs = $3 . $POSTMATCH; '\
          -e '  $title = join(" ",map(ucfirst($_),split(" ",$2))); '\
          -e '  $_ = $lhs . $title . $rhs;'\
          -e '}'

The complex line (1) splits the title string into words (delimited by spaces), (2) capitalizes every word (map applies ucfirst to each one), and (3) recombines the words with join. Then the entire line is reconstituted from the text before the title, the new title, and the text after it.
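
For comparison, the same three steps (split, capitalize, join) can be sketched in plain POSIX shell. This is only an illustration: cap_words is a made-up helper name, and it assumes the title text has already been extracted from the tags.

```shell
# cap_words: hypothetical helper mirroring the split / ucfirst / join steps.
# The body runs in a subshell, so "set -f" and the variables stay local.
cap_words() (
  set -f                      # keep characters like * from being glob-expanded
  result=""
  # Step 1: the unquoted $1 is split into words on whitespace.
  for word in $1; do
    # Step 2: upper-case the first character, keep the rest unchanged.
    first=$(printf '%s' "$word" | cut -c1 | tr '[:lower:]' '[:upper:]')
    # Step 3: rejoin the words with single spaces.
    result="${result:+$result }${first}${word#?}"
  done
  printf '%s\n' "$result"
)

cap_words "hi man: check this out"
```

The last line should print "Hi Man: Check This Out". Like the Perl version, this collapses runs of spaces into one.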

This won't work if <title> is spread across multiple lines. For that, you should use one of the HTML parsers.

Hi Otheus!

Thanks for your response! But neither of the above commands is working... Please help...

Um, you have to provide a filename or redirect the file into it. Also, I had forgotten an apostrophe. I corrected the post. Try it now.

Or:

perl -pe's|
    (?<=<title>)
    (.*?)
    (?=</title>)
    |($x=$1)=~s/\b(\w)/\u$1/g;$x
    |gxe'  infile  

You can nest a regular expression within another?? Scarrrry! But cool.

There's got to be a better approach ...
I'm still looking for it :)

With two shell functions...

_capword()
{
  case $1 in
    a*) _CAPWORD=A${1#?};; b*) _CAPWORD=B${1#?};;
    c*) _CAPWORD=C${1#?};; d*) _CAPWORD=D${1#?};;
    e*) _CAPWORD=E${1#?};; f*) _CAPWORD=F${1#?};;
    g*) _CAPWORD=G${1#?};; h*) _CAPWORD=H${1#?};;
    i*) _CAPWORD=I${1#?};; j*) _CAPWORD=J${1#?};;
    k*) _CAPWORD=K${1#?};; l*) _CAPWORD=L${1#?};;
    m*) _CAPWORD=M${1#?};; n*) _CAPWORD=N${1#?};;
    o*) _CAPWORD=O${1#?};; p*) _CAPWORD=P${1#?};;
    q*) _CAPWORD=Q${1#?};; r*) _CAPWORD=R${1#?};;
    s*) _CAPWORD=S${1#?};; t*) _CAPWORD=T${1#?};;
    u*) _CAPWORD=U${1#?};; v*) _CAPWORD=V${1#?};;
    w*) _CAPWORD=W${1#?};; x*) _CAPWORD=X${1#?};;
    y*) _CAPWORD=Y${1#?};; z*) _CAPWORD=Z${1#?};;
    *)  _CAPWORD=$1 ;;
  esac
}

_cap_phrase()
{
   _PHRASE="$* "
   _CAP_PHRASE=
   while [ -n "$_PHRASE" ]
   do
     _TEMP=${_PHRASE#* }
     _capword "${_PHRASE%"$_TEMP"}"
     _CAP_PHRASE="$_CAP_PHRASE$_CAPWORD"
     _PHRASE=$_TEMP
   done
   _CAP_PHRASE=${_CAP_PHRASE% }
}

...the script is simple:

title="<title>hi man: check this out</title>"

## Strip HTML tags
title=${title#*>}
title="${title%<*}"

_cap_phrase "$title"
captitle="<title>${_CAP_PHRASE% }</title>"

printf "%s\n" "$captitle"

Maybe something like this, but I don't have a shell to try it, so untested...

perl -pi -e 's/([^<\/])\b(\w)/$1\U$2/g' filename

The problem is that the above command will capitalize the words between all types of tags, not only those inside <title></title>.
Consider the following:

% print '<head><script>this not</script><title>this yes</title></head>'|
perl -pe's/([^<\/])\b(\w)/$1\U$2/g' 
<head><script>This Not</script><title>This Yes</title></head>

% print '<head><script>this not</script><title>this yes</title></head>'|
perl -pe's|
pipe quote>     (?<=<title>)
pipe quote>     (.*?)
pipe quote>     (?=</title>)
pipe quote>     |($x=$1)=~s/\b(\w)/\u$1/g;$x
pipe quote>     |gxe'
<head><script>this not</script><title>This Yes</title></head>
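
The same restriction to <title> can be sketched without Perl, using shell parameter expansion to cut the line apart and awk to capitalize the words. Treat it as a rough sketch: cap_title_line is a made-up name, it only handles the first <title>...</title> pair on a line (a plain <title> with no attributes), and its variables are global because POSIX sh has no local.

```shell
# cap_title_line: hypothetical helper; capitalizes words only inside the
# first <title>...</title> pair of the line passed as $1.
cap_title_line() {
  line=$1
  case $line in
    *"<title>"*"</title>"*)
      before=${line%%"<title>"*}      # text before the opening tag
      rest=${line#*"<title>"}         # everything after the opening tag
      title=${rest%%"</title>"*}      # the title text itself
      after=${rest#*"</title>"}       # text after the closing tag
      # Assigning to a field makes awk rebuild the record with single spaces.
      capped=$(printf '%s\n' "$title" |
        awk '{ for (i = 1; i <= NF; i++) $i = toupper(substr($i, 1, 1)) substr($i, 2); print }')
      printf '%s<title>%s</title>%s\n' "$before" "$capped" "$after"
      ;;
    *)
      printf '%s\n' "$line"           # no title on this line: pass it through
      ;;
  esac
}

cap_title_line '<head><script>this not</script><title>this yes</title></head>'
```

To process a file, call it once per line, e.g. while IFS= read -r l; do cap_title_line "$l"; done < infile.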
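
#! /usr/bin/perl
# Capitalize every word in Leo.pm, then put the tag names back in lower case.
open(FH, "<", "Leo.pm") or die "Cannot open Leo.pm: $!";
while(<FH>){
	s/(\w+)/ucfirst($1)/eg;                  # capitalize every word, tag names included
	s/<(.*?)>/'<'.lcfirst($1).'>'/e;         # lower-case the first (opening) tag again
	s/(\/)(.*)>$/$1.lcfirst($2).'>'/e;       # same for the closing tag, keeping the slash
	print;
}
close(FH);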