Replace multiple lines between tags using sed

I have a file example.txt whose content looks like this:

<TAG>
1
2
3
</TAG>

I use the sed command below to replace everything between <TAG> and </TAG>:

sed -e 's/\(<TAG>\)[^<]*\(<.*\)/something/g' example.txt > example.txt.new

Unfortunately, the command does not replace the content as I want. It only works if the content between the tags is not broken across multiple lines. Could someone please explain how to solve this case?

Thanks a lot.

Try this:

awk '/<TAG>/{p=1;print}/<\/TAG>/{p=0}!p' file

Many thanks, Franklin!

Your awk command doesn't replace the content between the tags; it deletes it. Now I can use a sed command to add the new content as expected. Thanks again for your help.
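
For anyone reading along, here is a minimal sketch of how that one-liner works, spelled out with comments. The sample input and the temporary file are assumptions for illustration, not part of the original post; the awk rules themselves are the same as in the command above.

```shell
# Sample input, recreated from the question (written to a temporary file).
tmp=$(mktemp)
printf '%s\n' 'before' '<TAG>' '1' '2' '3' '</TAG>' 'after' > "$tmp"

# p is a flag: 1 while inside the tagged block, 0 (or unset) otherwise.
result=$(awk '
    /<TAG>/   { p = 1; print }   # opening tag: print it, then start skipping
    /<\/TAG>/ { p = 0 }          # closing tag: stop skipping
    !p                           # print every line seen while p is 0
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

The tags and the surrounding lines are kept; only the lines between the tags are dropped.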

Try:

sed -n '/<TAG>/,/<\/TAG>/p' < file | sed  '/TAG/d'

To replace the text with something you can try this:

awk '/<TAG>/{p=1;print;print "something"}/<\/TAG>/{p=0}!p' file 
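
Since the original question asked about sed: sed reads its input one line at a time, so the [^<]* in a single s command never sees the lines that follow <TAG>. A rough sed equivalent, assuming each tag sits on a line of its own as in the example, uses an address range instead. The sample file below is made up for illustration.

```shell
# Sample input, recreated from the question (written to a temporary file).
tmp=$(mktemp)
printf '%s\n' '<TAG>' '1' '2' '3' '</TAG>' > "$tmp"

# Inside the range <TAG> .. </TAG>:
#   on the opening tag, queue "something" to be appended and skip the rest;
#   on every other line except the closing tag, delete the line.
result=$(sed '/<TAG>/,/<\/TAG>/{
/<TAG>/{
a\
something
b
}
/<\/TAG>/!d
}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

This prints the opening tag, the replacement text, and the closing tag, like the awk version.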

I really appreciate your help, dennis. But Franklin's command works very well in my situation. I may need your help with future problems, but I owe you one this time :)

Sorry to bring up this old thread but I couldn't find a better and more relevant place.

I have a similar situation, where I wish to remove the code between two tags in many thousands of files.

Here is the code snippet:

<AFFILIATECODEBEGIN>

<p align="center">
<script type="text/javascript"><!--
auctionads_ad_client = "editedforprivacy";
auctionads_ad_campaign = "42efbc14b4c2adfae40ff87882f07569";
auctionads_ad_width = "120";
auctionads_ad_height = "240";
auctionads_ad_kw =  "japan";
auctionads_color_border =  "CC0000";
auctionads_color_bg =  "FFFFFF";
auctionads_color_heading =  "000000";
auctionads_color_text =  "000000";
auctionads_color_link =  "FFFFFF";
--></script>

<script type="text/javascript" src="http://ads.auctionads.com/pagead/show_ads.js">
</script>

</p>

</AFFILIATECODEBEGIN>

I have tried running

awk '/<AFFILIATECODEBEGIN>/{p=1;print}/<\/AFFILIATECODEBEGIN>/{p=0}!p' festivals.html

I also tried using * instead of naming a specific file. The contents of the file are output, but nothing is removed or changed.

Thanks in advance for any suggestions.

Try this:

awk '/<AFFILIATECODEBEGIN>/{p=1}/<\/AFFILIATECODEBEGIN>/{p=0;next}!p' festivals.html

Try this:

 
TESTBOX>awk '/<AFFILIATECODEBEGIN>/ { print ; print "something "; next }
> /<\/AFFILIATECODEBEGIN>/ { print ;} ' html.txt
 
Output:

<AFFILIATECODEBEGIN>
something
</AFFILIATECODEBEGIN>

Thanks to you both. Franklin, your command output the contents of the file but made no changes, and the output of panyam's command was:

awk: cmd. line:1: /<AFFILIATECODEBEGIN>/ { print ; print "something "; next } > /<\/AFFILIATECODEBEGIN>/ { print ;} 
awk: cmd. line:1:                                                             ^ syntax error

The syntax error being the ">" between "next }" and "/<\/"

This is what I get:

$ cat file
abc
abc
abc
<AFFILIATECODEBEGIN>

<p align="center">
<script type="text/javascript"><!--
auctionads_ad_client = "editedforprivacy";
auctionads_ad_campaign = "42efbc14b4c2adfae40ff87882f07569";
auctionads_ad_width = "120";
auctionads_ad_height = "240";
auctionads_ad_kw =  "japan";
auctionads_color_border =  "CC0000";
auctionads_color_bg =  "FFFFFF";
auctionads_color_heading =  "000000";
auctionads_color_text =  "000000";
auctionads_color_link =  "FFFFFF";
--></script>

<script type="text/javascript" src="http://ads.auctionads.com/pagead/show_ads.js">
</script>

</p>

</AFFILIATECODEBEGIN>
xyz
xyz
xyz
$
$
$ awk '/<AFFILIATECODEBEGIN>/{p=1}/<\/AFFILIATECODEBEGIN>/{p=0;next}!p' file
abc
abc
abc
xyz
xyz
xyz
$

Hi jdv,

FYI :

avalon:/disk1/jvsh/TEST>awk '/<AFFILIATECODEBEGIN>/ { print ; print "something "; next }
> /<\/AFFILIATECODEBEGIN>/ { print ;} ' html.txt

The > is the character you will see on screen when you press ENTER. It means the command continues on the next line; it is not part of the command.

If you put it on a single line:

awk '/<AFFILIATECODEBEGIN>/ { print ; print "something "; next } /<\/AFFILIATECODEBEGIN>/ { print ;} ' input_file.txt

]# awk '/<AFFILIATECODEBEGIN>/ { print ; print "something "; next }
> /<\/AFFILIATECODEBEGIN>/ { print ;} ' festivals.html
<AFFILIATECODEBEGIN>
something 
</AFFILIATECODEBEGIN>

Then nano festivals.html shows that nothing has changed in the actual file.

:confused:

Thanks

Of course nothing will change in the actual file.

awk only writes its result to standard output. You need to redirect the output of the command to another file to store the data.

Sorry to be a pain, but how does one do that? Will it still retain all the rest of the code in the files?

awk '/<AFFILIATECODEBEGIN>/ { print ; print "something "; next } /<\/AFFILIATECODEBEGIN>/ { print ;} ' input_file.txt > output_file.txt

The content of input_file.txt will remain the same.

The content of output_file.txt will be:

<AFFILIATECODEBEGIN>
something
</AFFILIATECODEBEGIN>
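
Note that the command above keeps only the tag lines and the replacement text. If the goal is to keep the rest of the file and drop only the tagged block, a sketch using Franklin's command with the same kind of redirection could look like this. The temporary directory and the sample content are hypothetical.

```shell
# Hypothetical working directory and sample file, for illustration only.
dir=$(mktemp -d)
printf '%s\n' 'abc' '<AFFILIATECODEBEGIN>' 'ad code' '</AFFILIATECODEBEGIN>' 'xyz' \
    > "$dir/input_file.txt"

# Drop the tagged block (tags included), keep every other line,
# and write the result to a separate output file.
awk '/<AFFILIATECODEBEGIN>/{p=1}/<\/AFFILIATECODEBEGIN>/{p=0;next}!p' \
    "$dir/input_file.txt" > "$dir/output_file.txt"

output=$(cat "$dir/output_file.txt")
input_lines=$(wc -l < "$dir/input_file.txt")
printf '%s\n' "$output"
rm -rf "$dir"
```

The input file is left untouched. Once the output looks right, moving output_file.txt over input_file.txt with mv puts it in place of the original.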

Thanks. But, as per my first question, how can I do this for thousands of files?

 
# Let the shell expand * directly (no need for ls) and quote the names so spaces are safe.
for file in *
do
awk '/<AFFILIATECODEBEGIN>/ { print ; print "something "; next } /<\/AFFILIATECODEBEGIN>/ { print ;} ' "$file" > "${file}_changed"
done

Thanks. But I cannot change the filename, and I need to do it recursively. I have .html and .htm files in many subdirs, all of which need to have this code removed, but the filenames cannot be changed or otherwise modified. Thank you for your kind help, and apologies for not being more knowledgeable on this.

Javed,

There is a flaw in my script: it also removes all the text outside the tags, which it should not. Better to use the solution suggested by Franklin. The code below should do the job for you. Test it thoroughly before using it in production.

 
 
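# Match both .html and .htm; read names line by line so spaces in paths are safe.
find . -type f \( -name "*.html" -o -name "*.htm" \) | while IFS= read -r i
do
# Write to a temporary file, and replace the original only if awk succeeded.
awk '/<AFFILIATECODEBEGIN>/{p=1;print;print"something in b/w tags";next}/<\/AFFILIATECODEBEGIN>/{p=0;print;next}!p' "$i" > "${i}_Chng" && mv "${i}_Chng" "$i"
done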
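
To remove the block rather than replace it, across .html and .htm files in all subdirectories, something along these lines may work. It is only a sketch: the sample tree under a temporary directory is made up for illustration, and in real use you would point find at your own directory. Writing back with cat instead of mv is my own choice, made so the original file keeps its permissions. Test on a copy first.

```shell
# Hypothetical sample tree, created only for this demonstration.
root=$(mktemp -d)
mkdir -p "$root/sub dir"
printf '%s\n' 'abc' '<AFFILIATECODEBEGIN>' 'ad code' '</AFFILIATECODEBEGIN>' 'xyz' \
    > "$root/a.html"
printf '%s\n' 'top' '<AFFILIATECODEBEGIN>' 'ad code' '</AFFILIATECODEBEGIN>' 'bottom' \
    > "$root/sub dir/b.htm"

# Franklin's awk program: skip everything from the opening tag through the closing tag.
prog='/<AFFILIATECODEBEGIN>/{p=1}/<\/AFFILIATECODEBEGIN>/{p=0;next}!p'

# Visit every .html and .htm file below $root. File names with spaces are safe
# because find hands them to sh as separate arguments.
find "$root" -type f \( -name '*.html' -o -name '*.htm' \) -exec sh -c '
    prog=$1; shift
    for f do
        # Write to a temporary file first, then copy back over the original.
        awk "$prog" "$f" > "$f.tmp" && cat "$f.tmp" > "$f" && rm -f "$f.tmp"
    done
' sh "$prog" {} +

html_result=$(cat "$root/a.html")
htm_result=$(cat "$root/sub dir/b.htm")
printf '%s\n' "$html_result" "$htm_result"
rm -rf "$root"
```

The file names stay the same, and only the lines from the opening tag through the closing tag are dropped.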