Removing all but a couple of HTML tags from an HTML file

I tried to find an elegant (or at least simple) way to remove all but a couple of HTML tags from an HTML file, but all the examples I found dealt with removing every tag.

The logic of the script would be:

  • if there is an <li> or <ul> on the line, do nothing (= write the same line to the output)
  • if there is:
    <font class="titleA">
    substitute it with:
    <h2>
  • otherwise, if there is any other HTML tag, remove it (= write the line to the output without the tags, just the content)
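
To make the rules concrete, here is a rough sketch of them in Python (the regexes should carry over to Perl more or less unchanged). It is only an illustration under several assumptions that are not part of the rules above: the names filter_line, KEEP, TITLE and TAG are made up, closing </li> and </ul> tags count as "on the line" too, every tag fits on a single line, and on a titleA line all other tags (including the closing </font>) are stripped. A regex like this will also trip over comments or a ">" inside an attribute value, so a real HTML parser is likely the more robust choice.

```python
import re

# Assumption: closing </li> and </ul> also mark a line to keep unchanged.
KEEP = re.compile(r"</?(?:li|ul)\b", re.IGNORECASE)
# Assumption: the tag to replace is written exactly as <font class="titleA">.
TITLE = re.compile(r'<font\s+class="titleA"\s*>', re.IGNORECASE)
# Naive tag matcher: anything between "<" and the next ">" on the same line.
TAG = re.compile(r"<[^>]+>")


def filter_line(line):
    """Apply the three rules to a single line of HTML."""
    # Rule 1: lines containing <li> or <ul> are passed through untouched.
    if KEEP.search(line):
        return line
    # Rule 2: replace the titleA font tag with <h2>.
    # Assumption: every other tag on that line is stripped.
    if TITLE.search(line):
        pieces = TITLE.split(line)
        return "<h2>".join(TAG.sub("", piece) for piece in pieces)
    # Rule 3: strip all tags and keep only the content.
    return TAG.sub("", line)


def filter_lines(lines):
    """Lazily filter an iterable of lines (for example an open file)."""
    return (filter_line(line) for line in lines)
```

It could be used as a filter with something like sys.stdout.writelines(filter_lines(sys.stdin)). Note that the sketch does not emit a closing </h2>, since the rules above do not say what should happen to the matching </font>.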

Could someone please tell me how to approach this problem? I know some Perl, but my skills are rusty (it has been years since I last used it).