Parallel processing for functions in xargs

I have a script (ksh) which tries to run a function in parallel for performance gains. I am also trying to limit the number of parallel child processes to avoid overloading the system by using a variable to count triggered processes and waiting for completion e.g.

do_something () 
{
...
}

process_cnt=0
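while read -r param   # param: assumed name for one line of param_file
do
  if [ "$process_cnt" -eq 3 ]; then
    wait            # block until every job in the current batch has finished
    process_cnt=0   # start counting the next batch
  fi
  do_something "$param" &
  let process_cnt=process_cnt+1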
done < param_file
wait

The problem with this approach is that it is only as good as the worst performer in a batch, i.e. the wait command will wait for all child processes in a batch to finish before triggering the next batch. There is a more flexible way of doing this with xargs --max-procs, but unfortunately that seems to be built for shell commands only.

Is there a way I can use xargs to trigger user-defined functions in parallel?

Please use CODE tags (the icon that says CODE), not ICODE, for multiline code.
You've been around here for ~3 years, you should know the difference.

It doesn't matter if it's a file, a function or a string.

How to run and 'manage' parallel commands has been discussed quite a few times; please use the search function (run background jobs/task/scripts/commands parallel).
There are also great examples available.

Specifically in this closed thread: [BASH] Script to manage background scripts (running, finished, exit code)
But there are up to 5 pages of search results using combinations of the above search words.

hth

Install GNU Parallel and do this:

$ foo() { echo $*; }

$ export fun="`typeset -f`"; parallel 'eval "$fun";'foo ::: works
works

$ export fun="`typeset -f`"; parallel 'eval "$fun";'foo :::: param_file

I have been out for a while, and tags seem to have changed since then. The way I remember it, this site used to be about people responding to questions instead of pointing out duplicates. I will have a look at the links that you posted and see if that works, but honestly speaking, a suggestion to "search for the right answer" is not really something that strikes me as useful (it's not something I would ever do).

Why would one repeat oneself if it's already been said?
Sometimes people (me included) are too focused on the words (or specific tools, in your case) they have in their mind, rather than the situation.

There are not too many different ways to run X number of background jobs, and this is at least the 3rd thread of its kind this year, with the same requirements.

One could ask: did you google?
One could ask: did you search the forum?
IMO that is a very legit question with the aim to solve your own question.

I like to help others, even provide tools (with the aim) to help others.
But I don't like to repeat myself, so I redirect to what's already been said, if I'm aware of it (which applies in this case).

Sorry if that hurts your feelings.
Have a nice evening.

EDIT:
A forum is NOT "set and forget".

I appreciate the pointer to the updated code tags by sea. And I also agree with his later comment that mentions that I may have been fixated on a specific tool for my solution. However...

I will not appreciate a pointer to "google it" or "search the forums" with a thanks. That's as good as "look at the man pages". Technically true, but practically? No. That implies I did not do my research, and makes assumptions about ME. I have a very specific question, and that question has a specific answer (that I am going to detail below). Whether or not something has already been posted in a forum that is sprawling and growing by the second is irrelevant, and pointing it out wastes everyone's time. I completely disagree with rule #5.

Why do I have that inclination? Because I have been on the other end a few times. I see a question that is posted and I try to answer it based on my knowledge. That not only helps the specific person who asks, but also reinforces my experience with that stuff. It may not be the most efficient way of resolving questions, but it is the quickest, which is why I used to visit these forums so often (and which is why countless others do). I never felt burdened by a repeat question, because I knew that the person asking the question was a different person, who may have tried and not got what he needed from google or other resources.

That - in a nutshell - is why I reacted aversely to the original post. Back to OT:

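#!/bin/ksh
# Function to run in parallel; it must be defined before typeset -f reads it back.
function func
{
 echo "$1"
}

# typeset -f func prints the function's source, so each child ksh redefines it
# before calling it. --max-procs (GNU xargs) caps the number of concurrent children.
seq 10 | xargs -I% --max-procs=5 ksh -c "$(typeset -f func)"';func "%"'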

The code above allows you to pass a function as a parameter to xargs. The key is declaring the function within the ksh -c string using "typeset -f".
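One caveat with the command above: xargs pastes each input line into the command string itself, so a line containing something like $(...) gets interpreted by the child shell. Below is a minimal sketch of a variant that hands the line over as a positional parameter instead. The names work, work_src and run_pool and the sample inputs are made up for illustration. The function source is kept in a plain string so the sketch runs under any POSIX sh; in ksh or bash, $(typeset -f work) should supply equivalent text. -P is the short form of GNU's --max-procs and is an extension, not POSIX.

```shell
# Source of a hypothetical worker function, kept as text so it can be
# handed to child shells (assumption: stands in for $(typeset -f work)).
work_src='work() { printf "got <%s>\n" "$1"; }'

# run_pool MAX: read one argument per line from stdin and call work on
# each, with at most MAX child shells running at the same time.
run_pool() {
  # The child shell first defines work, then calls it with the input
  # line as $1. The trailing "sh" only fills $0 of the child.
  xargs -I{} -P "$1" sh -c "$work_src"'; work "$1"' sh {}
}

# The second line would execute id if it were pasted into the command
# string; passed as $1 it is treated as plain data.
printf '%s\n' 'plain' '$(id)' | run_pool 2
```

Output order is not guaranteed once more than one child runs at a time. Note that xargs still applies its own quote and backslash handling to the input lines unless a delimiter option such as GNU's -d '\n' is used.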

I came to realize that implementing this within my solution would add too much complexity for a small performance gain. I would use parallel, but that requires installing the utility (not an option for me, unfortunately). I am going with my original approach for now.

There is no reason to moan about rule 5. You joined the forum and accepted the rules. This forum is not meant to grow by redundant threads so it's absolutely legit to make you notice.
The rules are made by the people that provide this forum - it is not your forum.

So please stop moaning about the absolutely legit answers. If you have the feeling to continue making a drama where there is no need at all, you can do this elsewhere.
So the choice is yours and if you need help with that, speak up.

Hi.

The post by tange ( the author of parallel -- History of GNU Parallel - GNU Project - Free Software Foundation ) says install, but, for me, that is simply placing the code (a perl script) someplace in my PATH (like many people, I use ~/bin ). Not complicated, does not require special permissions.

I update parallel every now and then, and am currently using GNU parallel 20111122. The latest appears to be from 22-Apr-2015.

Best wishes ... cheers, drl

I did make the choice of not contributing to this forum anymore shortly after checking out, which is why I did not see your reply. Sorry if this comes too late, but I believe you are in violation of rules 1, 2 and 3 yourself.

Frankly, the whole thread leaves a bad taste in my mouth. I did not believe that a technical forum could have such a high level of finger pointing for a question that is definitely not "how do I list out files in my current directory" simple.

I find your post offensive, irrelevant to the discussion and your usage of certain terms (drama, moan) obnoxious, but that's just me. Others may see it differently and agree with you because believe it or not, I agree that the original post by sea had merit. But know that your manner of pointing it out was the absolute worst amongst all who cared to do so.