sed working slowly on big files

Hi experts,

I'm using the following code to remove trailing spaces from the end of every line in a file.

 
sed "s/[ ]*$//g" <filename> > <new_filename>
 
mv <new_filename> <filename>

This works fine for files up to 20-25 GB.

For bigger files it takes more time than the original file generation itself required. :slight_smile:

I need to process files that will be 4-5 times bigger than this.

Please suggest a faster way.

Not sure how much this buys you, but sed 's/ *$//' file > file1 should be a little faster: the bracket expression [ ] is needless for a single character, and the g flag is redundant since a pattern anchored at end-of-line can match at most once.

Thanks! I can see a slight saving in time.

But is there any other way to do it? Any command other than sed?

Some process must write the file. Rewrite that process to omit the trailing spaces, or pipe its output through the sed command as the file is written. As for something faster, maybe perl:

perl -pe 's/ *$//'

For real speed a custom C program is needed, but not writing the spaces in the first place would be optimal.
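
For the pipe idea, a minimal sketch, assuming the data comes from some generating command (generate_report is a hypothetical stand-in for whatever actually produces the file):

# Hypothetical generator piped through sed: trailing spaces never reach the disk,
# and the huge file never needs a second full pass.
generate_report | sed 's/ *$//' > <filename>

Since sed consumes the stream as it is produced, the cleanup overlaps with the generation instead of adding a separate read-and-rewrite cycle.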


Thank you!

You are right. One last question: is cut faster than sed?

To avoid the spaces I have modified the code, but now I am getting 2 junk characters at the beginning of every line. I want to use

cut -c 3- <filename>

to remove those.
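
For example, both cleanups could be folded into one pass (a sketch, assuming the junk really is exactly two characters at the start of every line):

# Strip the first two characters and any trailing spaces in a single sed pass;
# assumes every line begins with exactly two junk characters.
sed 's/^..//; s/ *$//' <filename> > <new_filename>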

What software is producing this file? It might be easier to fix at the source.

Can you post, say, four sample lines with control codes visible? Just wondering if this is not a proper Unix text file.

sed -n l four_sample_lines.txt
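
For instance (a sketch; head and od are standard tools, and <filename> stands for your actual file), the sample can be inspected without creating an intermediate file:

# Show the first four lines with control codes made visible.
head -n 4 <filename> | sed -n l

# Or dump the raw bytes, which makes a BOM or carriage returns easy to spot.
head -n 4 <filename> | od -c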