Hello,
Is there a way to go through a file and remove certain HTML tags with bash? If it needs sed or awk, that will do too.
The reason I want this is that I have a monitoring script which generates a logfile in HTML, and every time it generates the logfile, the tags are reproduced. The tags I want removed are </body> and </html>, which are the last two lines of the HTML file.
I found similar topics, but none of them do what I need.
Thanks in advance for the help.
Try this:
awk '/<\/body>/ || /<\/html>/{next}1' file
Regards
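For readers unfamiliar with the one-liner above: lines matching either pattern hit {next} and are skipped, and the trailing 1 is an always-true pattern whose default action prints every remaining line. A minimal sketch, using made-up sample input piped in instead of a real logfile:

```shell
# Hypothetical sample input; the real input would be the monitor's HTML logfile
result=$(printf '%s\n' '<p>log entry</p>' '</body>' '</html>' |
  awk '/<\/body>/ || /<\/html>/{next}1')

# Only the line without a closing tag survives
printf '%s\n' "$result"
```

Note that a whole line is dropped whenever it contains one of the tags, which fits here because each tag sits on its own line.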
It kinda works, but I have to redirect the output to a new file:
awk '/<\/body>/ || /<\/html>/{next}1' file.html > file2.html
Is there a way to make it write the output back to the original file (file.html)?
When I use:
awk '/<\/body>/ || /<\/html>/{next}1' file.html > file.html
I get a blank file.
All the code before the </body> and </html> tags should remain in the file.
Thanks
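The blank file comes from the order of operations: the shell opens file.html for writing and truncates it while setting up the > redirection, so awk starts with an already empty input. A small sketch that reproduces this, using a hypothetical demo file in a temporary directory so nothing real is overwritten:

```shell
# Hypothetical demo file; demo_dir is a throwaway directory
demo_dir=$(mktemp -d)
printf '%s\n' '<p>log entry</p>' '</body>' '</html>' > "$demo_dir/file.html"

# The '>' redirect empties file.html before awk gets to read it
awk '/<\/body>/ || /<\/html>/{next}1' "$demo_dir/file.html" > "$demo_dir/file.html"

# Report the size in bytes of what is left
wc -c < "$demo_dir/file.html"
```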
You can't redirect the output to the input file, because the shell empties the file for the redirection before awk reads it. Redirect the output to a temporary file and move it over the original file, something like this:
awk '/<\/body>/ || /<\/html>/{next}1' file.html > file1.html
mv file1.html file.html
Regards
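A slightly more defensive variant of the same idea, as a sketch only: the temporary name comes from mktemp instead of a fixed file1.html, and && makes sure the original is replaced only if awk succeeded. The paths and the variable names target and tmp are assumptions for illustration.

```shell
# Hypothetical demo file standing in for the real logfile
demo_dir=$(mktemp -d)
target="$demo_dir/file.html"
printf '%s\n' '<html>' '<body>' '<p>log entry</p>' '</body>' '</html>' > "$target"

# Create the temporary file next to the target so mv is a simple rename
tmp=$(mktemp "$demo_dir/file.html.XXXXXX")

# Replace the original only when awk exits successfully
awk '/<\/body>/ || /<\/html>/{next}1' "$target" > "$tmp" && mv "$tmp" "$target"

cat "$target"
```

One thing to keep in mind: after mv, the file carries the permissions of the temporary file, which mktemp creates readable and writable by the owner only, so a chmod may be needed if other users read the log.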
I figured it out a few minutes before you replied, the same way you wrote the code. Thanks for all the help.