StarCluster - Mailing List Archive

Re: [Starcluster] Multiple MPI jobs on SunGrid Engine with StarCluster

From: Justin Riley <no email>
Date: Mon, 21 Jun 2010 23:28:24 -0400

Hi Damian,

> So I don't need to specify the number of slots to use in mpirun? The
> Sun GridEngine will somehow pass this information to mpirun? Is the
> mpirun argument -n 1 by default?

Yes, OpenMPI is "Sun Grid Engine aware". mpirun recognizes when it is
being executed by SGE and does the right thing, filling in the
appropriate options (-host, -np, -n, etc) itself. You shouldn't need
mpirun's --byslot/--bynode options either, because this is handled by the
parallel environment "orte" that StarCluster configures for you. I just
tested this on a small two-node c1.xlarge cluster and everything
worked as expected without those options to mpirun.
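
For reference, here is a minimal sketch of the kind of job script I mean.
The file name mpi_job.sh, the program ./my_mpi_program and the slot count
of 16 (two nodes with 8 slots each) are assumptions for illustration
only; adjust them to your own job and cluster.

```shell
# Write a minimal SGE job script (file and program names are hypothetical).
cat > mpi_job.sh <<'EOF'
#!/bin/bash
# Job name.
#$ -N mpi_job
# Request 16 slots from the "orte" parallel environment (assumed count).
#$ -pe orte 16
# Run from the submission directory and merge stderr into stdout.
#$ -cwd
#$ -j y

# SGE exports the number of granted slots as NSLOTS.
echo "Granted ${NSLOTS:-0} slots"

# No -np, -host or -hostfile here: mpirun picks them up from SGE.
mpirun ./my_mpi_program
EOF

# Submit it from the master node with: qsub mpi_job.sh
```

The point is only that the slot request lives in the "-pe orte" line, not
on the mpirun command line.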

> It's not a requirement but reflects some ignorance on my part. Maybe
> I'm confused about why the first node is called master. I was assuming
> it had that name because it was performing some kind of special
> coordination.

The first node is called master because it's the 'qmaster' for Sun Grid
Engine, the NFS head node, and also the machine that all EBS volumes are
attached to. For this reason you might choose to limit the number of
slots available on the master node (or remove them entirely) if you're
suffering performance issues. For now this can be accomplished by running:

$ qconf -mq all.q

And then modifying the "slots" line to change the number of slots for
the master node:

------------------------------------------------------------
slots 1,[ip-10-194-13-219.ec2.internal=2], \
------------------------------------------------------------

In the above example the master node has been modified to have only 2
slots instead of 8.
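
If it helps to see how that "slots" value is read, here is a small sketch.
The leading number is the default for the queue and each [host=N] entry
overrides it for that host. The helper slots_for_host is a hypothetical
name of my own, not part of SGE or StarCluster, and the sample value is a
shortened form of the line above.

```shell
#!/bin/bash
# Hypothetical helper: print the slot count that a "slots" value (as shown
# by `qconf -sq all.q`) assigns to one host. A [host=N] entry overrides the
# leading default value.
slots_for_host() {
    local slots_value="$1" host="$2"
    local default="${slots_value%%,*}"   # text before the first comma
    local override
    override=$(printf '%s\n' "$slots_value" | tr ',' '\n' \
        | sed -n "s/^\[${host}=\([0-9]*\)\]\$/\1/p" | head -n 1)
    echo "${override:-$default}"
}

slots_value='1,[ip-10-194-13-219.ec2.internal=2]'
# The host with an override entry.
slots_for_host "$slots_value" ip-10-194-13-219.ec2.internal
# Any host without an entry falls back to the default.
slots_for_host "$slots_value" some-other-node
```

You can compare this against what "qconf -sq all.q" reports after saving
your change.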

> Do I still need to provide the -hostfile option? Or is this automatic now?

No, this should be determined automatically from Sun Grid Engine if
everything is working correctly.
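
If you want to check what SGE actually handed to your job, something like
the following sketch can go near the top of the job script. Within a
parallel environment SGE sets PE_HOSTFILE to a file with one line per
host (host name, slot count, queue, processor range). The function name
summarize_pe_hostfile is just an illustrative choice of mine.

```shell
#!/bin/bash
# Print each host and its slot count from an SGE PE hostfile, plus a total.
summarize_pe_hostfile() {
    awk '{ printf "%s: %s slots\n", $1, $2; total += $2 }
         END { printf "total: %d slots\n", total }' "$1"
}

# PE_HOSTFILE is only set when the job runs inside a parallel environment.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

The hosts and slot counts printed there should match the processes that
mpirun starts.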

Hope that helps,

Received on Mon Jun 21 2010 - 23:28:25 EDT