StarCluster - Mailing List Archive

Re: 100 nodes cluster

From: Rayson Ho <no email>
Date: Fri, 28 Oct 2011 09:41:15 -0700 (PDT)

Paolo - from the Grid Engine point of view, joining 2 or more clusters under the control of a single qmaster is possible as long as the execution (slave) nodes can connect to the master and vice versa.

The minimum set of configuration files that need to be merged or modified is:

- /etc/hosts (merge)
- AWS security group (may need to check if there is any additional work)
- $SGE_ROOT/default/common/act_qmaster (set to the hostname of the "real" qmaster)
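
As a rough illustration of the file changes listed above, here is a minimal Python sketch. It is not something StarCluster or Grid Engine ships: the helper names merge_hosts and write_act_qmaster are made up for this example, and it assumes the usual $SGE_ROOT/<cell>/common/act_qmaster layout with the cell name passed in as a parameter. It does not touch the AWS security group, and it does not resolve the case where the same hostname maps to different addresses in the two clusters.

```python
import os
import tempfile


def merge_hosts(*hosts_texts):
    """Merge several /etc/hosts-style texts.

    Comments and blank lines are dropped, and an entry that appears
    more than once (same address and same names) is kept only once.
    """
    seen = set()
    merged = []
    for text in hosts_texts:
        for line in text.splitlines():
            # Strip trailing comments, then split into address + names.
            entry = line.split("#", 1)[0].split()
            if len(entry) < 2:
                continue  # blank line, comment, or address without a name
            key = tuple(entry)
            if key not in seen:
                seen.add(key)
                merged.append(entry[0] + "\t" + " ".join(entry[1:]))
    return "\n".join(merged) + "\n"


def write_act_qmaster(sge_root, qmaster_host, cell="default"):
    """Write the qmaster hostname to <sge_root>/<cell>/common/act_qmaster.

    The cell name "default" is an assumption; pass your own if it differs.
    """
    path = os.path.join(sge_root, cell, "common", "act_qmaster")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(qmaster_host + "\n")
    return path
```

In practice you would read /etc/hosts from each cluster, write the merged result back on every node, and call write_act_qmaster on the nodes being moved under the other qmaster. Try it against copies of the files first.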


And then NFS needs to be remounted if your applications rely on it (NFS is not a hard Grid Engine requirement).

I will need to experiment a bit more to see if other things need to be done -- again, I am a bit busy, but I will have more time to look into StarCluster in mid Nov.


Matt - He's using StarCluster 0.92, unless he has downgraded to an older version.


Also, for parallel jobs, the execution nodes need to see each other (for things like MPI, Hadoop, etc).
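
A quick way to check that kind of reachability is a plain TCP connect test. The sketch below is only an illustration: can_connect and unreachable_pairs are hypothetical helper names, and the ports to test are an assumption (6444 and 6445 are the IANA-registered sge_qmaster and sge_execd ports, but your installation may use different ones, and MPI or Hadoop use their own).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable_pairs(targets, timeout=3.0):
    """Return the (host, port) pairs from targets that could not be reached."""
    return [(h, p) for h, p in targets if not can_connect(h, p, timeout)]
```

For example, unreachable_pairs([("master", 6444), ("node001", 6445)]) lists the endpoints the current host cannot reach; the hostnames here are placeholders. Since the test only covers connections from the machine it runs on, it would have to be run on each node to confirm that the nodes can see each other.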


Lastly, if someone wants to play with merging 2 StarClusters, then set up a 1-node and a 2-node StarCluster, and see if anything additional needs to be done...


Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net





----- Original Message -----
From: Matthew Summers <quantumsummers_at_gentoo.org>

What version of starcluster are you using, Paolo?

-- 
Matthew W. Summers
Gentoo Foundation Inc.
_______________________________________________
StarCluster mailing list
StarCluster_at_mit.edu
http://mailman.mit.edu/mailman/listinfo/starcluster
Received on Fri Oct 28 2011 - 12:41:17 EDT