I have a scenario where I have to generate SQL queries in a file using the echo command.
I have file1, which contains more than 10 lakh (1 million) numbers. For each number in file1 I need to create 5 SQL queries, resulting in 50 lakh (5 million) queries in total.
To do this I use the simple method below:
for i in `cat file1 | tr -d '\r'`
do
echo "update table set id1=0 where no='$i';" >>query.unl
echo "update table set id2=0 where no='$i';" >>query.unl
echo "update table set id3=0 where no='$i';" >>query.unl
echo "update table set id4=0 where no='$i';" >>query.unl
echo "update table set id5=0 where no='$i';" >>query.unl
done
It takes more than 12 hours to generate query.unl. Is there a more efficient, simple way to do this without modifying file1 (the source file)?
I'm not surprised... That's a dangerous use of backticks and a useless use of cat. It works as a quick fix, but for anything big, as you've discovered, it has major problems: the entire file is expanded into memory as one giant word list before the loop even starts, and the end can be truncated if it's too large.
It also produces 5 update statements where 1 combined statement would do, and it reopens query.unl 5 times per input line (5000 opens for every 1000 lines) when it only needed to be opened once:
SPLIT="$(printf '\r')"   # a lone carriage return; read already splits on newline
while IFS="${SPLIT}" read -r i
do
echo "update table set id1=0,id2=0,id3=0,id4=0,id5=0 where no='$i';"
done < input > output
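As a quick sanity check, here is the same loop run on a tiny made-up sample file (the names sample_input and sample_output are placeholders, not your real files):

```shell
# Three numbers with DOS-style CRLF line endings, like the real file1.
printf '101\r\n102\r\n103\r\n' > sample_input

SPLIT="$(printf '\r')"          # a lone carriage return
while IFS="${SPLIT}" read -r i
do
echo "update table set id1=0,id2=0,id3=0,id4=0,id5=0 where no='$i';"
done < sample_input > sample_output

cat sample_output
```

Each output line should be a single combined update statement per number, with no stray carriage return smuggled inside the quotes.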
Or a version in awk, which may be faster still:

awk -v RS='\r\n' '$1 { printf("update table set id1=0,id2=0,id3=0,id4=0,id5=0 where no=\047%s\047;\n", $1) }' filename > output

(\047 is an octal escape for the single quote, matching the quoting in your statements.) If you're not on Linux, you may need to install gawk; a multi-character RS is an extension that traditional awk/nawk doesn't support.
Depending on what your database is, it's probably also possible to pipe that output directly into your database client, so it processes the statements as they're generated, instead of storing them all in a gigantic temporary file and then starting over from the beginning when done.
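A sketch of that pipeline, with placeholders: DBCLIENT and nums.sample are hypothetical names, and you would point DBCLIENT at your real command-line client (psql, mysql, dbaccess, etc.) with your own connection options. It defaults to cat here so the stream can be previewed without a database:

```shell
# Stream the generated statements straight into the database client
# instead of a 50-lakh-line temp file. DBCLIENT is a placeholder,
# e.g. DBCLIENT="psql -d mydb" or DBCLIENT="dbaccess mydb".
DBCLIENT="${DBCLIENT:-cat}"     # cat = dry run, just print the stream

# Tiny stand-in for the real file1, with CRLF line endings.
printf '7\r\n8\r\n' > nums.sample

awk -v RS='\r\n' '$1 {
printf("update table set id1=0,id2=0,id3=0,id4=0,id5=0 where no=\047%s\047;\n", $1)
}' nums.sample | $DBCLIENT
```

Whether this is safe to do in one long stream (transaction size, locking) depends on the database, so batching with periodic commits may still be worth adding.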