Replace HTML tags using sed regex

I need a newline inserted after every closing </font> tag, while the tag itself is retained. Please help me with this.

Input:

<font>abc</font>def<font>ghi</font>

Output:

<font>abc</font>
def
<font>ghi</font>

Hello Badhrish,

The following may help, provided all of your input data has the same form as the sample you showed.

awk -F"</font>" '{for(i=1;i<=NF;i++){if($i ~ /^</ && $i !~ /^$/){print $i FS} else if($i !~  /^</ && $i !~ /^$/){gsub(/</,"\n&",$i);print $i FS}}}' Input_file

The output will be as follows.

<font>abc</font>
def
<font>ghi</font>

EDIT: Added a multi-line form of the same solution.

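awk -F"</font>" '{
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^</ && $i !~ /^$/) {
            # Field starts with a tag: print it and put back the closing tag.
            print $i FS
        }
        else if ($i !~ /^</ && $i !~ /^$/) {
            # Field starts with plain text: move each following tag to a new line.
            gsub(/</, "\n&", $i)
            print $i FS
        }
    }
}' Input_file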
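
If the input can also contain plain text after the last </font>, splitting on the closing tag would append a </font> to that trailing text. A minimal sketch of a more general variant follows. It is not part of the original one-liner, and the function name split_font_tags is only illustrative. It adds a newline before each <font> and after each </font>, then collapses the surplus newlines:

```shell
# Hypothetical helper: put each <font>...</font> element on its own line.
split_font_tags() {
    awk '{
        gsub(/<font>/, "\n&")     # newline before every opening tag
        gsub(/<\/font>/, "&\n")   # newline after every closing tag
        gsub(/\n+/, "\n")         # collapse runs of newlines into one
        sub(/^\n/, "")            # drop a leading newline, if any
        sub(/\n$/, "")            # drop a trailing newline, if any
        print
    }' "$@"
}
```

Usage: split_font_tags Input_file (or pipe the data into split_font_tags).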

Thanks,
R. Singh

Works like a charm. Thank you, Ravinder.

You can also try the following sed command, shown with its output:

sed -r 's/([^^>])(<font>)/\1\n\2/g;s/(<\/font>)([^$>])/\1\n\2/g' file
<font>abc</font>
def
<font>ghi</font>
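
One detail worth noting: inside a bracket expression, a ^ that is not in first position and a $ are literal characters, so [^^>] and [^$>] also refuse to split when a literal ^ or $ sits next to a tag. A hedged sketch of a variant that excludes only > is shown below. The function name split_font_tags_sed is illustrative, and the -E option and \n in the replacement assume GNU sed:

```shell
# Hypothetical helper; assumes GNU sed (-E, and \n as newline in the replacement).
split_font_tags_sed() {
    # 1st expression: break the line before <font> unless it follows another tag.
    # 2nd expression: break the line after </font> unless the next character is ">".
    sed -E 's/([^>])(<font>)/\1\n\2/g; s/(<\/font>)([^>])/\1\n\2/g' "$@"
}
```

Usage: split_font_tags_sed file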