sed Very Slow

Hi

We are using sed to strip a pattern out of an XML output file, and it is taking a lot of time.

The command that we are using is

sed -e "s/tns1://g" $OUTPUTFILENM > $TEMPFILE

where $OUTPUTFILENM is the file to be cleaned and $TEMPFILE is the cleaned output.

Can you please help me optimise this command, or suggest some other command that can remove the string tns1: everywhere in the XML file?

I will really appreciate any suggestion. For a 45k-record file it is taking more than an hour. Each record has around 10 instances of the string tns1:

Also, the string can appear anywhere. It is not at a fixed-width location, since the file we are cleaning is XML.

Thanks

Something is very wrong here. sed can process 10+ GB/hour on a modern machine.
Please show the output of:

ls -l [one of your xml files]
wc -l [the same xml file]

Here are the outputs of the above commands

-rw-rw-rw-    1 inf  inf    103876292 Oct 11 18:16
wc -l returns 0, as it's a single-root-tag XML file

That looks like a single 103 MB line.
sed is not optimized for this.
Try another sed version, or perl:

perl -pe "s/tns1://g" $OUTPUTFILENM > $TEMPFILE