StarCluster - Mailing List Archive

Re: Are heterogeneous clusters possible in StarCluster?

From: Fran Campillo <no email>
Date: Thu, 11 Dec 2014 13:06:29 +0100

     Thank you very much, Jennifer and Jin. This is very useful
information. The only detail I am still missing is how to add and stop
nodes from the master (that is, from inside the cluster). With that, I
could easily automate the whole process. Do any of you happen to know
whether this is possible?



On 09.12.2014 18:35, Jennifer Staab wrote:
> I am not sure exactly what you are trying to do regarding a single
> cluster, but in my experience the setup of the "master" node is
> propagated to the worker nodes. I would set up the master as I wanted
> and then add "workers" that had different attributes (different AMIs
> and EC2 instance types; they could be spot and/or on-demand). The
> workers always inherited attributes from the master node;
> specifically, NFS volumes from the master were always shared amongst
> the workers (this included other EBS volumes I had mounted using the
> StarCluster config file). When you issue the "starcluster addnode"
> command, you just have to make sure you specify the attributes (AMI,
> instance type, spot/on-demand, etc.) you want for the added nodes.
> In my experience, the added worker nodes have all their resources
> dedicated to the SGE queue(s). You would use the "qconf" command to
> adjust how each added node is configured and how your queues are set
> up. One good thing about this setup is that you can have your master
> as an on-demand or reserved instance and your workers as spot
> instances (bid at cheaper hourly rates). This way, jobs running on
> spot instances that are terminated (due to bid pricing) are simply
> resubmitted to the queue, as long as you pass the resubmit option in
> your qsub calls.
> One word of caution: I didn't mix and match OS types or
> virtualization types (PV, HVM) within a single cluster. My thought was
> that there might be underlying incompatibilities between the systems,
> such that the propagation of attributes from master to worker nodes
> might not work seamlessly.
> Good Luck.
> -Jennifer
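
The "starcluster addnode" and qsub advice above can be sketched in Python as
simple command builders. This is a minimal sketch under assumptions, not a
tested recipe: the flags -a (alias), -I (instance type), -i (AMI) and -b
(spot bid) are believed to match StarCluster 0.95.x and should be checked
with "starcluster addnode --help"; the cluster name, node aliases, instance
types and queue name are hypothetical. The commands are only built and
printed, never executed.

```python
import shlex


def addnode_cmd(cluster, alias, instance_type=None, image_id=None, spot_bid=None):
    """Build a `starcluster addnode` command for one worker node."""
    cmd = ["starcluster", "addnode", "-a", alias]
    if instance_type:
        cmd += ["-I", instance_type]   # EC2 instance type for this node
    if image_id:
        cmd += ["-i", image_id]        # AMI to boot, if not the cluster default
    if spot_bid is not None:
        cmd += ["-b", str(spot_bid)]   # request a spot instance at this bid
    cmd.append(cluster)
    return cmd


def qsub_cmd(script, queue=None, rerunnable=True):
    """Build a `qsub` command; `-r y` marks the job as rerunnable."""
    cmd = ["qsub"]
    if rerunnable:
        cmd += ["-r", "y"]
    if queue:
        cmd += ["-q", queue]           # hypothetical queue name, e.g. cpu.q
    cmd.append(script)
    return cmd


def show(cmd):
    """Render an argv list as a shell-safe string."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # A GPU spot worker and an on-demand CPU worker in the same cluster.
    print(show(addnode_cmd("mycluster", "gpu001",
                           instance_type="g2.2xlarge", spot_bid=0.25)))
    print(show(addnode_cmd("mycluster", "cpu001", instance_type="c3.xlarge")))
    print(show(qsub_cmd("stage1.sh", queue="cpu.q")))
```

To actually run one of these commands, the argv list could be passed to
subprocess.check_call on a machine where StarCluster (or SGE, for qsub) is
installed and configured.
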
> On 12/9/14 11:30 AM, Fran Campillo wrote:
>> Hi Jin!
>> Thank you so much for your answer :). Yes, I can try to do that,
>> but then I would need to manually do some things on the new nodes
>> that StarCluster usually takes care of (like passwordless SSH and
>> sharing /home and potentially other EBS volumes). Is my assumption
>> right?
>> Thanks again!
>> Fran.
>> On 09.12.2014 16:52, Jin Yu wrote:
>>> Hi Fran,
>>> You can start a master node first and then add different types of
>>> nodes later. You can then set up SGE to define how jobs are
>>> allocated among these nodes.
>>> -Jin
>>> On Mon, Dec 8, 2014 at 6:19 AM, Fran Campillo wrote:
>>> Hi,
>>> I began using StarCluster a couple of weeks ago for my research,
>>> and I find it a really useful framework. I have had to set up SGE
>>> myself several times in the past, and StarCluster makes our lives
>>> way easier.
>>> I still don't know many of the features of StarCluster, and I would
>>> like to ask the community whether certain things I want to do are
>>> actually possible with the current version of StarCluster. In
>>> particular, I would like to create a heterogeneous cluster on
>>> StarCluster (that is, one with different kinds of instances, which I
>>> could have in different SGE queues). In the problem I have to solve
>>> there are some stages that need a GPU and others that do not, and I
>>> would like to be able to set up the complete cluster at the
>>> beginning of the process and work like this:
>>> ------------
>>> 1.- Init: setup of the heterogeneous cluster {gpu_clust,
>>> no_gpu_clust}.
>>> 2.- No cuda tasks:
>>> 2.1.- stop gpu_clust instances.
>>> 2.2.- run tasks in no_gpu_clust.
>>> 3.- Cuda tasks:
>>> 3.1.- stop no_gpu_clust.
>>> 3.2.- start gpu_clust.
>>> 3.3.- run tasks in gpu_clust.
>>> 4.- No cuda tasks:
>>> ...
>>> ------------
>>> Is this currently possible with StarCluster? I guess that I can
>>> already fake this behavior by creating both clusters from another
>>> Amazon instance and running the tasks on one or the other via ssh,
>>> but I find that a worse solution.
>>> Thank you very much in advance!
>>> Cheers!
>>> Fran.
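
The stage-by-stage workflow listed above can be sketched as a small Python
planner that emits the commands for each stage. This is only an illustration
under several assumptions: "stop" is approximated with "starcluster
removenode <cluster> <alias>", which terminates the node rather than
stopping it; the addnode flags -a and -I should be checked against
"starcluster addnode --help"; the pool names, node aliases, instance types
and the queues gpu.q and cpu.q are hypothetical (the queues would have to be
created first, e.g. with qconf); and the commands must run on a machine
where StarCluster is installed and configured, which the thread does not
confirm is possible on the master itself. Nothing is executed here; the
planner only returns argv lists.

```python
# Hypothetical node pools for the two kinds of stage.
POOLS = {
    "gpu": {"aliases": ["gpu001"], "instance_type": "g2.2xlarge", "queue": "gpu.q"},
    "cpu": {"aliases": ["cpu001", "cpu002"], "instance_type": "c3.xlarge", "queue": "cpu.q"},
}


def plan(cluster, stages, pools=POOLS, running=()):
    """Return the command list for a sequence of (pool_name, scripts) stages.

    `running` is the set of worker aliases assumed to be up at the start.
    """
    running = set(running)
    cmds = []
    for kind, scripts in stages:
        pool = pools[kind]
        wanted = set(pool["aliases"])
        # Remove workers that this stage does not need.
        for alias in sorted(running - wanted):
            cmds.append(["starcluster", "removenode", cluster, alias])
        # Add the workers this stage needs and that are not up yet.
        for alias in sorted(wanted - running):
            cmds.append(["starcluster", "addnode", "-a", alias,
                         "-I", pool["instance_type"], cluster])
        running = wanted
        # `-sync y` makes qsub wait for the job to finish, so the pool is
        # not removed while work is still running. It also means the jobs
        # in this sketch are submitted one after another.
        for script in scripts:
            cmds.append(["qsub", "-sync", "y", "-q", pool["queue"], script])
    return cmds


if __name__ == "__main__":
    stages = [("cpu", ["prep.sh"]), ("gpu", ["train.sh"]), ("cpu", ["post.sh"])]
    for cmd in plan("mycluster", stages, running=["gpu001"]):
        print(" ".join(cmd))
```

Tracking the set of running aliases keeps the plan from removing nodes that
are already gone or adding nodes that are already up.
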
>>> _______________________________________________
>>> StarCluster mailing list
Received on Thu Dec 11 2014 - 07:06:37 EST

