Re: run command on all nodes

From: MacMullan, Hugh <no email>
Date: Wed, 4 Jun 2014 11:18:17 +0000

There is 'parallel-ssh' (and 'parallel-scp'). I believe both are part of the public StarCluster images. Good stuff!
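
A rough sketch of how parallel-ssh could be used from the master, in case it helps. It assumes the master's /etc/hosts lists the cluster as "master", "node001", "node002" and so on, and that the tool is installed under the name parallel-ssh (some distributions call it pssh). The function names write_hostfile and pssh_all are made up for illustration. The -h option names the host file and -i prints each host's output inline.

```shell
#!/bin/bash
# Sketch only: write_hostfile and pssh_all are illustrative names.
# Assumption: the hosts file uses the aliases "master" and "nodeNNN".

write_hostfile() {
    # $1: hosts file to scan, $2: output file with one alias per line.
    grep -owE 'master|node[0-9]+' "$1" | sort -u > "$2"
}

pssh_all() {
    # $1: host list file, $2: command to run on every listed host.
    # -h reads hosts from a file; -i shows each host's output inline.
    parallel-ssh -h "$1" -i "$2"
}

# Only act when a command is given, e.g.:  ./pssh_all.sh "uptime"
if [ "$#" -gt 0 ]; then
    hostfile=$(mktemp)
    write_hostfile /etc/hosts "$hostfile"
    pssh_all "$hostfile" "$*"
    rm -f "$hostfile"
fi
```

Drop "master|" from the pattern if the command should run on the compute nodes only.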


On Jun 4, 2014, at 3:27, "Jacob Barhak" wrote:

Hi Cedar,

There is no direct command I know of. However, there are several simple workarounds.

1. You can certainly use qsub. However, you do not know which node will run the command unless you specify it using modifiers, so this is probably not what you need.
2. You can use sshmaster and sshnode from your PC to run the command. You can write a simple batch script in either a Windows or a Linux environment that runs through the nodes and executes the command on each one. I tried this in the past.
3. You can write a bash script that you put on the master and that loops over the nodes, doing ssh to each node with the command you want. You can use & to run the command on each node in parallel to save time. I have not tried this, yet it seems easy enough.
4. You can write a starcluster plugin that will run the command automatically for you on each node. Note, however, that many plugins already exist, and one of them might do what you want. For example, there is a plugin that will pip install a Python library for you on certain nodes, and another that will set up an NFS mount for you. I am using these two plugins.
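
Workaround 3 could look roughly like the sketch below. I have not run it on a real cluster. It assumes the nodes are aliased node001, node002, ... in /etc/hosts on the master and that passwordless ssh between nodes already works. HOSTS_FILE, SSH_CMD, list_nodes and run_on_all are names I made up for illustration, not StarCluster settings.

```shell
#!/bin/bash
# Sketch only: run one command on every node in parallel from the master.
# Assumption: node aliases look like nodeNNN and appear in the hosts file.

list_nodes() {
    # Print each distinct nodeNNN alias found in the hosts file.
    grep -owE 'node[0-9]+' "${HOSTS_FILE:-/etc/hosts}" | sort -u
}

run_on_all() {
    local cmd="$1" node
    for node in $(list_nodes); do
        # The trailing & backgrounds each ssh so the nodes run concurrently;
        # sed prefixes every output line with the node it came from.
        ${SSH_CMD:-ssh} "$node" "$cmd" </dev/null 2>&1 | sed "s/^/[$node] /" &
    done
    # Block until every background ssh has finished.
    wait
}

# Only act when a command is given, e.g.:  ./run_on_all.sh "uptime"
if [ "$#" -gt 0 ]; then
    run_on_all "$*"
fi
```

Setting SSH_CMD=echo gives a dry run that only prints what would be sent to each node. With 100 nodes you may want to cap how many ssh sessions start at once, which this sketch does not do.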

I agree that a starcluster command, something like sshall, that submits a shell command to all nodes in parallel would be helpful. However, I do not know of such functionality.

I hope you find those workarounds helpful.


On Tue, Jun 3, 2014 at 5:56 PM, Cedar McKay wrote:
What is the easiest way to run terminal commands on each and every node? I can qsub a command as many times as I have hosts, but (perhaps if nodes are busy) nodeX could run the command multiple times, and nodeY not at all.

Our local cluster has something that works like this:

$ rocks run host compute "COMMAND"

That runs COMMAND on every "compute" host.

I'm sort of surprised this functionality isn't totally standard (or is it?) on all clusters. I have gotten tmux going, but it seems both overcomplicated and a bit fragile. Plus, will it scale to a 100-node cluster? Any tips?


