Making minor edits to a large batch of HTML files

Hello,

I'm manipulating a batch of about 2,000 HTML files. I just need to make some small changes, but to all the files at once.

For example, I want to delete the lines that have "embed_music" in all the files, or change all instances of the word "Paragraph" to "Absatz".

This is my pseudo-code:

open target folder of html files (/project/html/)
read in all html files
*do the stuff here:
check for lines containing "embed_music", if yes delete
string replace for words with "Paragraph" to "Absatz"
*
close folder

Is my logic correct? I'm attempting to do this with Python; would another language work better? I would appreciate any help or feedback!

I'd go with a simple shell script and let find and sed do the hard work:

find /project/html -type f -name "*.html" | while IFS= read -r filename
do
    if [[ ! -f "$filename-" ]]    # if a backup exists, don't do anything
    then
        mv "$filename" "$filename-"     # make backup
        # delete embed_music lines; replace every Paragraph (g = all occurrences per line)
        sed '/embed_music/d; s/Paragraph/Absatz/g' "$filename-" >"$filename"
    fi
done

This makes a backup of each original file (I like that safety net) and then makes the changes. If the backup file already exists, no action is taken, which prevents overwriting your original should something go wrong and the script be run again.

Python will certainly work, but I think this is easiest.
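For comparison, here is a rough Python sketch of the same backup-then-edit approach, since the question asked about Python. It assumes the files are UTF-8 text; the function name `edit_html_files` and the `-` backup suffix are only illustrative choices that mirror the shell script above.

```python
from pathlib import Path


def edit_html_files(root, backup_suffix="-"):
    """Drop lines containing "embed_music" and replace "Paragraph" with "Absatz".

    Returns the list of files that were rewritten. Assumes UTF-8 text files.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.html")):
        if not path.is_file():
            continue
        backup = path.with_name(path.name + backup_suffix)
        if backup.exists():
            # A backup already exists, so leave this file untouched.
            continue
        path.rename(backup)  # keep the original as the backup
        # newline="" keeps the original line endings as they are.
        with backup.open(encoding="utf-8", newline="") as src, \
                path.open("w", encoding="utf-8", newline="") as dst:
            for line in src:
                if "embed_music" in line:
                    continue  # delete this line
                dst.write(line.replace("Paragraph", "Absatz"))
        changed.append(path)
    return changed
```

Calling `edit_html_files("/project/html")` would process the folder from the question. As in the shell version, a second run skips any file that already has a backup next to it.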

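To delete every line containing "embed_music" from the files under a directory, run the following sed command from the base directory (the -i option, shown here in GNU sed syntax, edits each file in place):

find . -type f -exec sed -i '/embed_music/d' {} \;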

To replace every "Paragraph" with "Absatz", run this from the base directory (the g flag replaces every occurrence on a line, not just the first):

find . -type f -exec sed -i 's/Paragraph/Absatz/g' {} \;

Or run both in a single command from the base directory:

find . -type f -exec sed -i '/embed_music/d; s/Paragraph/Absatz/g' {} \;

--Shirish Shukla