sumoka
February 15, 2011, 6:35am
1
Hi experts,
I'm using the following code to remove the spaces at the end of each line of a file.
sed "s/[ ]*$//g" <filename> > <new_filename>
mv <new_filename> <filename>
This works fine for files up to 20-25 GB.
For bigger files it takes longer than generating the original file did.
I need to process files which will be 4-5 times bigger than this.
Please suggest a faster way.
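Not sure how much this buys you, but
sed 's/ *$//' file > file1
seems like it should be a little faster. It drops the bracket expression and the g flag, which is redundant because $ can only match once per line.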
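If the locale is a multibyte one such as UTF-8, forcing the C locale for the sed call is another thing that may be worth timing on a sample. This is only a sketch, not a measured result: sample.txt and sample.trimmed are made-up names, and it assumes the characters to strip are plain ASCII spaces.

```shell
# Illustrative input: two lines, the first with trailing spaces.
printf 'a  \nb\n' > sample.txt

# LC_ALL=C makes sed treat the input as raw bytes, which can avoid
# multibyte character handling; whether it helps depends on the sed build.
LC_ALL=C sed 's/ *$//' sample.txt > sample.trimmed
mv sample.trimmed sample.txt
```

Running both variants under time on a few GB of real data would show whether the difference matters for these files.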
sumoka
February 16, 2011, 6:15am
3
Thanks! I can see a slight saving in time.
But is there any other way to do it? Any command other than sed?
Some process must write the file. Rewrite that process to omit the trailing spaces, or pipe its output through the sed command as the file is written. As for faster, maybe perl:
perl -pe 's/ *$//'
For real speed a custom C program is needed. But not writing the spaces in the first place would be optimal.
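The piping suggestion could look roughly like this. It is a minimal sketch: generate_report is a hypothetical stand-in for whatever process writes the file, and report.txt is a made-up output name.

```shell
# Hypothetical producer; replace with the real program that writes the file.
generate_report() {
    printf 'alpha   \nbeta\ngamma  \n'
}

# Strip trailing spaces while the data is being written, so the large
# file is never read back and rewritten in a second pass.
generate_report | sed 's/ *$//' > report.txt
```

The point of this arrangement is that the data crosses the disk once instead of three times (write, read, rewrite), which is likely where most of the time goes on files of this size.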
sumoka
February 17, 2011, 1:41am
5
Thank you!
You are right. One last question: is cut faster than sed?
To avoid the spaces I have modified the code, but I am now getting 2 junk characters at the beginning of every line. I want to use
cut -c 3- <filename>
to remove those.
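For reference, a small sketch of what that cut call does, next to a single sed pass that handles the leading characters and the trailing spaces together. The file names and sample data are made up for illustration, and nothing here says which of the two is faster; that would need timing on real data.

```shell
# Illustrative input: two junk characters, then data, then trailing spaces.
printf 'XXhello  \nYYworld\n' > junk_sample.txt

# Drop the first two characters of every line with cut.
# Trailing spaces are left untouched.
cut -c 3- junk_sample.txt > cut_out.txt

# Alternative: one sed pass removes the two leading characters and
# any trailing spaces together.
sed 's/^..//; s/ *$//' junk_sample.txt > sed_out.txt
```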
methyl
February 18, 2011, 6:28pm
6
What software is producing this file? Might be easier to fix at source?
Can you post, say, four sample lines with control codes visible? Just wondering if this is not a proper Unix text file.
sed -n l four_sample_lines.txt
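As an illustration of what that command shows, here is a sketch on made-up data (sample_lines.txt and visible.txt are hypothetical names). The l command prints each line with non-printing characters as escapes and marks the end of every line with $, so trailing spaces, tabs and DOS carriage returns become visible.

```shell
# Illustrative data: a line with trailing spaces, and a line with a
# leading tab and a DOS (CR LF) line ending.
printf 'ab  \n\tcd\r\n' > sample_lines.txt

# Print the lines in unambiguous form and keep a copy for inspection.
sed -n l sample_lines.txt > visible.txt
cat visible.txt
```

On this sample the first line comes out as ab followed by two spaces and $, and the second as \tcd\r$. Running the same command on the first few lines of the real file should reveal what the two junk characters actually are.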