Minor editing of mass HTML files


I'm manipulating a batch of about 2,000 HTML files. I just need to make some small changes, but to all the files at once.

For example, I want to delete the lines that have "embed_music" in all the files, or change all instances of the word "Paragraph" to "Absatz".

This is my pseudo-code:

open target folder of html files (/project/html/)
read in all html files
*do the stuff here:
check for lines containing "embed_music", if yes delete
string replace for words with "Paragraph" to "Absatz"
close folder

Is my logic correct? I'm attempting to do this with Python; would another language work better? I'd appreciate any help or feedback!

I guess I'd have gone with a simple shell script, letting find and sed do the hard work:

find /project/html -name "*.html" | while read -r filename
do
    if [[ ! -f "$filename-" ]]    # only proceed if no backup exists yet
    then
        mv "$filename" "$filename-"     # make backup
        sed '/embed_music/d; s/Paragraph/Absatz/g' "$filename-" >"$filename" # make changes
    fi
done

This makes a backup of the original file (I like that safety net) and then makes the changes. If the backup file already exists, no action is taken, which prevents overwriting your saved original should something not work right and the script is run again.

Python will certainly work, but I think this is the easiest.
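For comparison, here is a rough Python sketch of the same backup-then-edit idea, in case you'd rather stay in Python. The function names (transform, process_tree) are made up for illustration, and it assumes the files use an ASCII-compatible encoding such as UTF-8 or Latin-1, since it works on raw bytes to leave everything else in the file untouched.

```python
from pathlib import Path


def transform(data: bytes) -> bytes:
    """Drop lines containing "embed_music", then replace "Paragraph" with "Absatz"."""
    kept = [line for line in data.splitlines(keepends=True)
            if b"embed_music" not in line]
    # bytes.replace changes every occurrence, like sed's s/.../.../g
    return b"".join(kept).replace(b"Paragraph", b"Absatz")


def process_tree(root: str) -> int:
    """Rewrite every *.html file under root; return the number of files processed.

    Assumption: files are in an ASCII-compatible encoding (e.g. UTF-8).
    """
    count = 0
    # sorted() materialises the file list before anything is renamed
    for path in sorted(Path(root).rglob("*.html")):
        backup = path.with_name(path.name + "-")
        if not path.is_file() or backup.exists():
            continue                 # a backup exists: processed earlier, skip
        original = path.read_bytes()
        path.rename(backup)          # keep the untouched original as "name.html-"
        path.write_bytes(transform(original))
        count += 1
    return count


# Example call (path taken from the question):
# process_tree("/project/html")
```

As with the shell version, running it a second time skips any file that already has a backup next to it.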

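To delete every line containing "embed_music" from all HTML files under a directory, run the sed command below from the base directory (-i edits the files in place; this form is GNU sed, and BSD/macOS sed needs -i '' instead):

find . -type f -name "*.html" -exec sed -i '/embed_music/d' {} \;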

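To replace every "Paragraph" with "Absatz", run the sed command below from the base directory (the trailing g replaces all occurrences on a line, not just the first):

find . -type f -name "*.html" -exec sed -i 's/Paragraph/Absatz/g' {} \;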

Or run both in a single command from the base directory:

find . -type f -name "*.html" -exec sed -i '/embed_music/d; s/Paragraph/Absatz/g' {} \;

--Shirish Shukla