Hi
Say I am interested in processing a big data set in the shell, where each process individually takes a long time but many such processes can run concurrently. Is there a way to do this automatically or efficiently in shell?
For example, consider pinging a list of addresses up to 5 times each. This would be far more efficient if I could run many pings in parallel. Other than manually splitting the input across shells, is there an automatic way?
pseudocode
cnt=0
while read -r somedata
do
    cnt=$(( cnt + 1 ))
    somecommand "$somedata" >> mylogfile &
    if [[ $(( cnt % 10 )) -eq 0 ]]; then  # run 10 at the same time
        wait
    fi
done < file_with_somedata.dat
wait  # catch the processes that we did not wait for earlier
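Note that the batch approach above waits for the slowest job in each group of 10 before starting the next group. If you have GNU or BSD xargs, its -P option keeps the pool full instead, starting a replacement job as soon as any one finishes. A minimal sketch, with echo standing in for the real command (e.g. ping -c 5):

```shell
#!/bin/sh
# Run up to 4 workers concurrently; xargs launches a new job as soon
# as one exits, rather than waiting for a whole batch to drain.
# 'echo pinged' is a placeholder for the actual long-running command.
printf 'host1\nhost2\nhost3\nhost4\nhost5\n' |
    xargs -n 1 -P 4 echo pinged
```

With -P the output lines can arrive in any order, so sort or tag them if ordering matters.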
Hi Jim,
That seems to do the trick in most cases. With ping in place of somecommand it works perfectly, but when I put another script there it fails completely: all the outputs seem to come from the same input.
Any idea why this may happen?
Thanks a lot.
It would really depend what the script is, what it does, and how you're running it...
Hi
Managed to solve it. I was redirecting the output of all the jobs to the same tmp file and processing from that.
Thanks guys!
That's quite possibly a bad idea. Concurrent writes from multiple processes to the same file aren't guaranteed to land cleanly, so output from different jobs can interleave or clobber each other.
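One common way around that collision, sketched below on the assumption that each worker only needs its own scratch output, is to give every background job a unique file (mktemp guarantees unique names) and merge only after wait:

```shell
#!/bin/sh
# Each background job writes to its own temp file; results are merged
# only after every job has finished, so there are no concurrent writers.
# 'echo "processed: ..."' is a placeholder for the real per-item work.
tmpdir=$(mktemp -d)
i=0
while read -r somedata; do
    i=$((i + 1))
    echo "processed: $somedata" > "$tmpdir/out.$i" &
done <<EOF
alpha
beta
EOF
wait                               # all jobs done before we touch the files
cat "$tmpdir"/out.* > mylogfile    # safe: single writer at this point
rm -rf "$tmpdir"
```

Merging after wait also keeps mylogfile in a deterministic order (by job number), which a shared append file cannot promise.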