Rsh reboot in a loop?

Hi folks. I'm trying to get the following script working for rebooting a bunch of clients. Up to now I've been using PSSH, but when they all start up again at the same time I get a few mount problems. So I'm trying to stagger the reboot command. I know the reboot will depend on what's running at the time. According to everything I've found, the code below should work.

But this script exits after the first iteration. I'm guessing the rsh command loses its connection without getting a return, so it produces a "closed by remote host" error which isn't getting caught.

Could someone please help me out? This is starting to drive me nuts! I could do the same in Python, but then I wouldn't be learning anything.

Thanks.

#!/bin/bash
set +e

cat /nodes/nodes-128 | while read LINE; do
        echo "Attempting to reset - $LINE"
        rsh pi@$LINE sudo reboot now || true
        sleep .5
done

That is a useless use of cat, don't do that.

I suspect rsh is trying to read from standard input and eating all the following lines. ssh does that too. You can work around that by using a different file descriptor.

while read -u5 LINE
do
        echo "Attempting to reset - $LINE" >&2
        rsh "pi@$LINE" sudo reboot now || true
done 5< /nodes/nodes-128

OK, well, incorporating those couple of things (I'm still using cat. I like cat. I have one!) does make a difference. It now loops for two nodes, resets at least one of them, then closes my ssh session to the header machine.

Just before my session is closed the error tcserror: Input/output error is thrown.

I have since tried nohup and disown. These behave similarly: they run for a couple of loops, then just end.

I should add at this point I've checked and checked, there is no problem with my nodes file. So I continue to be confused.

The problem, surely, is that I want to explicitly ignore "Connection lost", which isn't an error. It's a loss of connection, not a return from the command.

So, now I'm very confused.

EDIT: pssh manages fine which is a python script, so it must be possible with bash surely?

Using cat requires a pipe, and this requires the loop to 1. run in a subshell and 2. read from stdin (descriptor 0, the default).

  1. A subshell is more overhead, and you cannot modify shell variables in the main shell.
  2. rsh (and ssh) read from stdin, and that competes with the loop's read from stdin. Work-arounds are: rsh -n ... or rsh </dev/null ...
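
Both points can be reproduced locally without touching a real host. The following is only a sketch: fake_rsh is a hypothetical stand-in (not a real command) that drains its standard input the way rsh and ssh do, and the node names are made up.

```shell
#!/bin/bash
# fake_rsh is a hypothetical stand-in for rsh: like rsh, it reads
# whatever is available on its standard input.
fake_rsh() { cat >/dev/null; }

nodes=$(mktemp)
printf '%s\n' node1 node2 node3 > "$nodes"

# Point 1: a piped loop runs in a subshell, so its counter is lost.
piped=0
cat "$nodes" | while read -r LINE; do
        piped=$((piped + 1))
done

# Point 2: loop and command share stdin, so the command swallows
# the remaining lines and the loop ends early.
shared=0
while read -r LINE; do
        fake_rsh "$LINE"
        shared=$((shared + 1))
done < "$nodes"

# Work-around: give the command its own stdin from /dev/null.
isolated=0
while read -r LINE; do
        fake_rsh "$LINE" </dev/null
        isolated=$((isolated + 1))
done < "$nodes"

rm -f "$nodes"
echo "piped: $piped, shared stdin: $shared, isolated stdin: $isolated"
```

With bash's default settings, the piped counter should still be 0 in the main shell, the shared-stdin loop should stop after one iteration, and the isolated loop should see all three lines.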

I think reboot does not take an argument like now; passing it is misleading at least.

Because the connection might be dropped before the command finishes, it is safer to run it in the background with a little delay.

rsh -n remotehost "(sleep 1; reboot) &"
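
Putting the suggestions from this thread together, a staggered loop might look roughly like the sketch below. It is untested against real nodes. RSH_CMD, NODE_USER, STAGGER and the function name stagger_reboot are hypothetical names added for illustration. The pi user, sudo, and the half-second delay come from the original script. Redirecting the remote output to /dev/null is an extra assumption, meant to let the remote shell return without waiting on the background job.

```shell
#!/bin/bash
# Sketch only: RSH_CMD, NODE_USER and STAGGER are hypothetical knobs added
# for illustration; they are not part of the original script.
RSH_CMD=${RSH_CMD:-rsh}
NODE_USER=${NODE_USER:-pi}
STAGGER=${STAGGER:-0.5}

stagger_reboot() {
        local nodes_file=$1 node
        # Read the node list on descriptor 5 so nothing competes for stdin.
        while read -r -u5 node; do
                [ -n "$node" ] || continue        # skip blank lines
                echo "Attempting to reset - $node" >&2
                # -n keeps rsh away from stdin; the remote subshell waits a
                # second and reboots in the background so rsh can return first.
                "$RSH_CMD" -n "$NODE_USER@$node" \
                        "(sleep 1; sudo reboot) >/dev/null 2>&1 &" || true
                sleep "$STAGGER"
        done 5< "$nodes_file"
}

# Only run when a node file is given, e.g.: ./stagger.sh /nodes/nodes-128
if [ "$#" -ge 1 ]; then
        stagger_reboot "$1"
fi
```

No pipe is involved, so the loop runs in the main shell, and a failed or dropped rsh only affects that one node before the loop moves on.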