I have a cross-reference file which contains 86,000 records. Each record has the form old number:new number. There are hundreds of files in which I need to search for the old number and append the corresponding new number (preceded by @) to the line containing the old number. The files contain millions of records.
Currently I am using the sed command below:
sed "/$v_s_replace_string/s/$/$v_s_new_string/" "$v_s_file_name" > tempfile1
mv tempfile1 "$v_s_file_name"
where v_s_replace_string contains the old number
and v_s_new_string contains the new number.
I need to replace this sed command with an awk command, as awk is faster than sed.
I am trying that.
Is there any other option that you can suggest? I need to speed up the processing as much as possible. It is a production run and my script should not take more than 5 minutes per file.
For the awk command: could you please modify it to append the new number at the end of the line where it finds a match with the old number? The command you gave adds the new number to the beginning of the line. I am new to awk. Please help.
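One way to append at the end of the line is sketched below. It is only a sketch under assumptions that the thread does not confirm: that the old number appears in the data files as a whole whitespace-separated field, and that the cross-reference file holds exactly one old:new pair per line. The function name append_new_numbers and the temporary file name are made up for illustration. Instead of one run per cross-reference record, it loads all pairs into an awk array once and then reads each data file a single time.

```shell
#!/bin/sh
# append_new_numbers XREF FILE   (hypothetical helper name)
# XREF: one old:new pair per line. FILE: data file, rewritten in place.
append_new_numbers() {
    xref=$1
    file=$2
    tmp="$file.tmp.$$"
    awk -v xref="$xref" '
        BEGIN {
            # Load every old:new pair into an in-memory lookup table.
            while ((getline line < xref) > 0) {
                n = split(line, pair, ":")
                if (n >= 2) map[pair[1]] = pair[2]
            }
            close(xref)
        }
        {
            suffix = ""
            # Compare each whitespace-separated field with the table (exact match).
            for (i = 1; i <= NF; i++)
                if ($i in map) suffix = suffix "@" map[$i]
            # Print the original line with any new numbers appended at the end.
            print $0 suffix
        }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

It would be called as: append_new_numbers xref.txt data.txt (both file names are placeholders). Note one difference from the sed version: sed matches the old number anywhere in the line, including inside a longer number, while this sketch matches whole fields only. If your old numbers are embedded in larger strings, the field loop would need to be adapted.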
I am not trying to advocate the use of sed, as I think that awk is a better all-round tool, but I think you underestimate sed when it comes to a simple string replacement.
I just ran a small test of a string replacement on a big file (more than a million lines). Here are the results:
$ time awk '{gsub("B1058","zzz1058")}1' ventes_all > /dev/null
real 0m3.730s
user 0m3.648s
sys 0m0.040s
$ time sed 's/B1058/zzz1058/g' ventes_all > /dev/null
real 0m2.855s
user 0m2.828s
sys 0m0.028s
$ wc -l ventes_all
1205794 ventes_all